openzim / zimit

Make a ZIM file from any Web site and surf offline!
GNU General Public License v3.0

Zimit2: Youtube videos are not working everywhere #291

Closed benoit74 closed 2 months ago

benoit74 commented 3 months ago

We have to fix the situation where Youtube videos are not working everywhere.

We typically know that they do not play in kiwix-serve on Android Firefox / Chrome (while they should), and it looks like they do not play in kiwix-serve on Windows either: https://github.com/openzim/warc2zim/issues/206#issuecomment-2022247860

benoit74 commented 3 months ago

This is in fact a Zimit issue, and most probably has nothing to do with Zimit2. I'm transferring it to zimit repo and will give more explanations once transferred.

benoit74 commented 3 months ago

I've done some tests with zimit2 and warc2zim2 (url_handling branch from PR https://github.com/openzim/warc2zim/pull/218 but we will see it does not matter).

Browsertrix crawler is hence 1.0.0 beta-6

I ran 4 different tests:

Device / Reader A B C D
MacOS 12.7.4 - Kiwix reader opened in Firefox
MacOS 12.7.4 - Kiwix native app (3.3.0 build 145) ✅ (very slow to load) ✅ (very slow to load)
iPhone 13 (iOS 15) - Kiwix reader opened in Safari
Fairphone 4 5G (Android 13) - Kiwix reader opened in Firefox
Fairphone 4 5G (Android 13) - Kiwix reader opened in Firefox

Even though testing more readers will be important, the conclusion seems pretty clear.

Conclusion

For Youtube videos (at least), we must use another userAgent than the current one.

Previous work on https://github.com/openzim/zimit/pull/229 (where we switched by default to a mandatory UA and chose to use a "desktop-like" UA) was not entirely a good idea. It helped solve some problems with the Python check of the URL, but it caused other issues like this one.

Now that the Python check of the URL is gone, we should probably roll back most of the PR 229 changes:

I also recommend setting a default --mobileDevice, so that a proper userAgent is passed (concatenated with our default userAgentSuffix), since it seems mostly mandatory for proper zimit operation. I would also add support for a new --noMobileDevice, which would leave the --mobileDevice argument out of the browsertrix crawler CLI call, should someone want to not use a mobile device at all. This is probably rare, but it is cheap to implement, and it probably does not need to be exposed on Zimfarm.

Then comes the question of which default mobileDevice to choose. For tests I chose Pixel 2; the full list is here: https://github.com/puppeteer/puppeteer/blob/b144935789315697254880015847b2b4d151d52b/packages/puppeteer-core/src/common/Device.ts. A smaller screen might lead to situations where we are served a smaller asset, which is more or less what we prefer in order to keep ZIM size small and work on all screen sizes. This was my logic when I chose Pixel 2 for tests.

benoit74 commented 3 months ago

Edit: fixed the test table; the second device was wrong.

benoit74 commented 3 months ago

Note: I've also checked that in all cases the video which is retrieved is identical (same size, same codecs, ...), so the "fix" induced by using a more appropriate user-agent is only linked to "other" content, not to the video codec or anything like that.
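The thread does not say how this check was done. One possible way to confirm that two retrieved video files are byte-identical is sketched below; the helper names `file_fingerprint` and `same_content` are hypothetical, and this only compares size and a SHA-256 digest. It does not inspect codecs, which would need a media tool such as ffprobe.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path):
    """Return (size in bytes, SHA-256 hex digest) for the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large video files are not loaded at once.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return Path(path).stat().st_size, digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files have the same size and the same digest."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)
```

If the fingerprints match for the videos extracted from two crawls, any playback difference has to come from the surrounding page content rather than from the video file itself.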

benoit74 commented 2 months ago

Solved by https://github.com/openzim/zimit/pull/292

Jaifroid commented 2 months ago

Just to confirm that the solutions B and D both work in the PWA and the Browser Extension. Was version B the adopted solution?

benoit74 commented 2 months ago

Yes, solution B is currently in place in zimit2 branch

To be more precise, by default "Pixel 2" is used as the mobile device. A Zimit user is free to override this setting with --mobileDevice (as before), or to use --noMobileDevice to remove the default and use no mobile device.
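The behaviour described here could be sketched roughly as follows. This is a minimal illustration and not zimit's actual code: the use of `argparse` and the helper names `parse_args` and `build_crawler_args` are assumptions, as is the choice to let --noMobileDevice win when both flags are given. Only the two flag names and the "Pixel 2" default come from this thread; "iPhone X" is just an example override value.

```python
import argparse

DEFAULT_MOBILE_DEVICE = "Pixel 2"  # default named in this thread


def parse_args(argv):
    """Parse the two device-related flags (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--mobileDevice", default=DEFAULT_MOBILE_DEVICE)
    parser.add_argument("--noMobileDevice", action="store_true")
    return parser.parse_args(argv)


def build_crawler_args(args):
    """Return the device-related arguments to forward to the crawler."""
    if args.noMobileDevice:
        # Assumption: the opt-out flag takes precedence and nothing is passed.
        return []
    return ["--mobileDevice", args.mobileDevice]


# Default, explicit override, and opt-out:
print(build_crawler_args(parse_args([])))
print(build_crawler_args(parse_args(["--mobileDevice", "iPhone X"])))
print(build_crawler_args(parse_args(["--noMobileDevice"])))
```

Keeping the default in one constant means that changing the default device later would not require touching the override or opt-out logic.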