webrecorder / wabac.js

wabac.js - Web Archive Browsing Augmentation Client
https://replayweb.page
GNU Affero General Public License v3.0

Pages with a YouTube embedded player are not playing the whole video anymore #181

Closed benoit74 closed 1 month ago

benoit74 commented 1 month ago

It looks like the YouTube player has been significantly modified, and a WARC of a page with an embedded YouTube video no longer replays correctly.

For instance, https://tmp.kiwix.org/ci/test-warc/100r.co/crawl-100r-orca-20240528.warc.gz works properly on replayweb.page, but https://tmp.kiwix.org/ci/test-warc/100r.co/crawl-orca-20240620.warc.gz plays only the first 4 seconds of the video and then fails.

I've already assembled quite a lot of details and investigation in https://github.com/openzim/zimit/issues/323

In short, it looks like the player no longer uses HTTP Range requests to fetch the video; instead it issues multiple regular GET requests with the range specified in a query parameter. Unfortunately, the range is highly dynamic and depends on environmental factors I have not been able to identify, so replayers fail to find the matching record at replay time, and adapting the fuzzy-matching rules is not enough to fix the problem.
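To illustrate why this breaks replay, here is a minimal sketch. It is not wabac.js code: the host `video.example`, the parameter name `range`, the byte boundaries, and the helper names are all assumptions made up for the example. It shows that an exact URL lookup misses when the replay-time range differs from the captured one, and that a fuzzy rule which ignores the range can only return chunks with different byte boundaries.

```javascript
// Hypothetical capture index: keys are the exact URLs recorded at crawl time.
const captured = new Map([
  ["https://video.example/videoplayback?id=abc&range=0-65535", "chunk-1"],
  ["https://video.example/videoplayback?id=abc&range=65536-131071", "chunk-2"],
]);

// Exact lookup, as a replayer would do before applying any fuzzy rule.
function lookup(url) {
  return captured.get(url) ?? null;
}

// Fuzzy key: drop the (assumed) `range` parameter, so every chunk of the
// same video maps to the same key.
function fuzzyKey(url) {
  const u = new URL(url);
  u.searchParams.delete("range");
  return u.toString();
}

// Return all captured URLs that share the fuzzy key of the requested URL.
function fuzzyCandidates(url) {
  const key = fuzzyKey(url);
  return [...captured.keys()].filter((k) => fuzzyKey(k) === key);
}

// At replay time the player asks for different byte boundaries.
const replayUrl = "https://video.example/videoplayback?id=abc&range=0-49999";

const exactHit = lookup(replayUrl);             // no record with this exact URL
const candidates = fuzzyCandidates(replayUrl);  // same video, other ranges
const sameRange = candidates.filter(
  (k) => new URL(k).searchParams.get("range") === "0-49999"
);
```

In this sketch `candidates` contains both captured chunks, but `sameRange` is empty: the fuzzy rule finds records for the right video, yet none of them holds exactly the bytes the player asked for.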

ikreymer commented 1 month ago

Yes, unfortunately, the whole player has been changed, so the previous rewriting injections no longer work at all. A fix will be available shortly. The most fail-safe approach for now is probably to disable MediaSource-based playback, which lets the player fall back to mp4 streaming. The fix being tested injects the following into youtube.com HTML pages:

<script>window.MediaSource.isTypeSupported = () => false;</script>
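As an illustration only, a slightly more defensive variant of the same idea might look like the sketch below. The helper name `disableMediaSource` is hypothetical and not part of wabac.js, and also patching `ManagedMediaSource` is an assumption that goes beyond what this thread states. The guard avoids a `TypeError` in browsers where `window.MediaSource` is not defined.

```javascript
// Sketch: make a player believe that no MediaSource type is supported,
// so that it falls back to plain progressive (e.g. mp4) playback.
// `disableMediaSource` is a hypothetical helper name.
function disableMediaSource(win) {
  const patched = [];
  // "ManagedMediaSource" is included as an assumption; the thread only
  // mentions MediaSource.
  for (const name of ["MediaSource", "ManagedMediaSource"]) {
    const ctor = win[name];
    // Only patch constructors that exist and expose isTypeSupported.
    if (ctor && typeof ctor.isTypeSupported === "function") {
      ctor.isTypeSupported = () => false;
      patched.push(name);
    }
  }
  return patched;
}

// In a page, this has to run before the player script is loaded.
if (typeof window !== "undefined") {
  disableMediaSource(window);
}
```

Passing the window object as a parameter keeps the helper testable outside a browser, for example with a plain object standing in for `window`.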

benoit74 commented 1 month ago

OK, thank you very much for the hint, I will try this as well. What you say makes sense, but it is far from what I could have found on my own! So double thank you ^^

ikreymer commented 1 month ago

Fixed via #182