Open grob opened 2 years ago
Although #243 is merged, srcset URLs with commas in them are still not parsed/rewritten correctly; see https://web.archive.org/web/*/https://orf.at/ for an example.
The original URLs used in srcset attributes look like this:
https://assets.orf.at/mims/2022/03/26/crops/w=875,q=90/1204287_opener_429226_coronavirus_schule_tests_vorschau_v1_a.jpg?s=bad56ac4b6df02892d3bd744c8e9494d4fd72b50
A complete srcset example used on this site:
<source media="(max-width: 600px)" srcset="https://assets.orf.at/mims/2022/03/26/crops/w=800,h=450,q=70/1204282_master_429226_coronavirus_schule_tests_vorschau_v1_a.jpg?s=baff281a0ee94f81ed19d576f7eff4f0ed6e44c9 800w, https://assets.orf.at/mims/2022/03/26/crops/w=1280,h=720,q=60/1204282_master_429226_coronavirus_schule_tests_vorschau_v1_a.jpg?s=735e42760bcc348a2afed7dde20a17bf2857caaf 1280w">
results in (see here):
<source media="(max-width: 600px)" srcset="https://web.archive.org/web/20220114214021im_/https://assets.orf.at/mims/2022/03/26/crops/w=800, /web/20220114214021im_/https://orf.at/stories/3243632/h=450, /web/20220114214021im_/https://orf.at/stories/3243632/q=70/1204282_master_429226_coronavirus_schule_tests_vorschau_v1_a.jpg?s=baff281a0ee94f81ed19d576f7eff4f0ed6e44c9 800w, https://web.archive.org/web/20220114214021im_/https://assets.orf.at/mims/2022/03/26/crops/w=1280, /web/20220114214021im_/https://orf.at/stories/3243632/h=720, /web/20220114214021im_/https://orf.at/stories/3243632/q=60/1204282_master_429226_coronavirus_schule_tests_vorschau_v1_a.jpg?s=735e42760bcc348a2afed7dde20a17bf2857caaf 1280w">
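The rewritten output is consistent with the srcset value being split on every comma and each piece then being treated as its own URL, although that is only a guess, since the replay code is not available. For comparison, here is a minimal Python sketch of a comma-tolerant split, loosely following the HTML Standard's srcset parsing idea that a URL is a run of non-whitespace characters. The function names `parse_srcset` and `rewrite_srcset` and the way the prefix is applied are hypothetical illustrations, not the Wayback implementation, and descriptors containing parentheses are not handled.

```python
import re

# Hypothetical helpers for illustration; not taken from any Wayback code.
_SEPARATORS = re.compile(r"[\s,]*")   # whitespace and commas between candidates
_URL = re.compile(r"\S+")             # a URL is a run of non-whitespace characters
_DESCRIPTOR = re.compile(r"[^,]*")    # descriptor text runs up to the next comma


def parse_srcset(value):
    """Split a srcset value into (url, descriptor) pairs.

    Commas inside a URL are kept. Only a comma that trails the URL or that
    ends the descriptor separates two candidates.
    """
    candidates = []
    pos = 0
    while pos < len(value):
        pos = _SEPARATORS.match(value, pos).end()
        if pos >= len(value):
            break
        match = _URL.match(value, pos)
        url = match.group()
        pos = match.end()
        if url.endswith(","):
            # Trailing comma: the candidate ends here and has no descriptor.
            url = url.rstrip(",")
            descriptor = ""
        else:
            match = _DESCRIPTOR.match(value, pos)
            descriptor = match.group().strip()
            pos = match.end()
        candidates.append((url, descriptor))
    return candidates


def rewrite_srcset(value, prefix):
    """Prepend an (assumed) replay prefix to every candidate URL."""
    return ", ".join(
        f"{prefix}{url} {descriptor}".strip()
        for url, descriptor in parse_srcset(value)
    )


srcset = (
    "https://assets.orf.at/mims/2022/03/26/crops/w=800,h=450,q=70/"
    "1204282_master_429226_coronavirus_schule_tests_vorschau_v1_a.jpg"
    "?s=baff281a0ee94f81ed19d576f7eff4f0ed6e44c9 800w, "
    "https://assets.orf.at/mims/2022/03/26/crops/w=1280,h=720,q=60/"
    "1204282_master_429226_coronavirus_schule_tests_vorschau_v1_a.jpg"
    "?s=735e42760bcc348a2afed7dde20a17bf2857caaf 1280w"
)

# A naive comma split breaks the two candidates into six pieces.
print(len(srcset.split(",")))
# The whitespace-aware split keeps each URL intact.
for url, descriptor in parse_srcset(srcset):
    print(descriptor, url)
print(rewrite_srcset(srcset, "https://web.archive.org/web/20220114214021im_/"))
```

With this approach the two candidates from the example above stay whole, so `w=800,h=450,q=70` remains part of the first URL instead of being rewritten as three separate ones.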
As this is about rewriting, it is likely an issue with the (closed-source) Wayback replay software, not with the Heritrix web crawler.