openzim / zim-requests

Want a new ZIM file? Propose ZIM content improvements or fixes? Here you are!
https://farm.openzim.org

Zimit 2 tests: solar.lowtechmagazine.com #1023

Closed: benoit74 closed this issue 2 weeks ago

benoit74 commented 4 weeks ago

A new Zimit2 ZIM is ready for testing

Scraper: warc2zim 2.0.0-dev9 + zimit 2.0.0-dev5 + Browsertrix crawler 1.1.3
ZIM: https://mirror.download.kiwix.org/zim/.hidden/dev/solar.lowtechmagazine.com_en_all_2024-05.zim
Library link: https://dev.library.kiwix.org/viewer#solar.lowtechmagazine.com_en_all_2024-05
Search link in library: https://dev.library.kiwix.org/#lang=eng&q=solar

Suggested test plan:

kelson42 commented 3 weeks ago

@benoit74 To me it works perfectly, but I wonder whether we should split the ZIM into one ZIM per language before putting it back in prod.

benoit74 commented 3 weeks ago

I would recommend first publishing in prod with Zimit2 (so that all readers will be able to read this book) and then splitting into one ZIM per language (this needs help from a dev, me, to create proper CSS to hide translations in various places and to create proper include/exclude configurations).
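As a rough illustration of what per-language include/exclude rules could look like, here is a minimal Python sketch. It assumes, hypothetically, that each translation lives under a two-letter path prefix such as `/fr/` and that English pages sit at the site root; the real URL layout of solar.lowtechmagazine.com and the actual recipe options are not confirmed here, and `in_language_scope` is an illustrative helper, not part of Zimit.

```python
import re
from urllib.parse import urlparse

# Hypothetical layout: translations under /<two-letter code>/, English at the root.
LANG_PREFIX = re.compile(r"^/([a-z]{2})(/|$)")


def in_language_scope(url: str, lang: str, default_lang: str = "en") -> bool:
    """Return True if `url` belongs to the ZIM for `lang` under the assumed layout."""
    path = urlparse(url).path or "/"
    match = LANG_PREFIX.match(path)
    # Paths without a language prefix are treated as the default language.
    page_lang = match.group(1) if match else default_lang
    return page_lang == lang


urls = [
    "https://solar.lowtechmagazine.com/",
    "https://solar.lowtechmagazine.com/fr/",
    "https://solar.lowtechmagazine.com/fr/about/",
    "https://solar.lowtechmagazine.com/about/",
]
french_only = [u for u in urls if in_language_scope(u, "fr")]
english_only = [u for u in urls if in_language_scope(u, "en")]
```

Under this assumption, shared assets such as images would fall into the default-language bucket, so a real configuration would need to include them in every per-language ZIM.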

kelson42 commented 3 weeks ago

We should update the recipe to properly list all the languages.

Jaifroid commented 3 weeks ago

All working well in both the PWA and the Browser Extension (in ServiceWorker mode). In JQuery mode, images do not show on the landing page, but that is a limitation of the mode, not of the ZIM (they do show in articles).

If splitting by language proves difficult, then the ZIM should be renamed with a _mul_ language code, rather than _en_. There is one advantage in having a multilingual ZIM, which is that choosing a language via the dropdown is much easier (and may be automatic according to browser language, though I didn't test this). It would be interesting to see whether the size reduces radically with one language, or whether (if the images are not stored more than once) there is little size reduction.
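To make the suggested rename concrete, here is a small Python sketch that swaps the language field in the filename from this issue. It assumes the filename follows a `name_lang_flavour_period.zim` pattern, which matches the file above but is not verified here against the official naming rules; `with_language_code` is a hypothetical helper.

```python
def with_language_code(filename: str, new_lang: str) -> str:
    """Swap the language field, assuming a name_lang_flavour_period.zim pattern."""
    stem, ext = filename.rsplit(".", 1)
    # Split from the right so that dots and underscores in the name part are kept intact.
    name, _old_lang, flavour, period = stem.rsplit("_", 3)
    return f"{name}_{new_lang}_{flavour}_{period}.{ext}"


renamed = with_language_code("solar.lowtechmagazine.com_en_all_2024-05.zim", "mul")
print(renamed)
```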

Jaifroid commented 3 weeks ago

One small issue I noticed is that clicking on the icon for non-dithered images (beneath each image in an article) doesn't work. The higher-res images haven't been scraped and show a missing image placeholder.

[Screenshot: Screenshot_20240605-124606_Chrome]

benoit74 commented 3 weeks ago

> The higher-res images haven't been scraped and show a missing image placeholder.

This is a Zimit1 issue as well; crawling is the problem. A custom behavior that clicks on all images would help. I've opened https://github.com/openzim/zim-requests/issues/1027 to track this enhancement.