offspot / imager-service

Create Kiwix Hotspot microSD cards online
https://imager.kiwix.org/
GNU General Public License v3.0

Trying to start new config or edit old one returns Server 500 error #306

Open Popolechien opened 1 year ago

Popolechien commented 1 year ago

[Screenshot, 2022-12-16 at 17:50:19]

rgaudin commented 1 year ago

This is due to a ZIM with incorrect metadata in the library.

coopmaths.:
    description: "Ressources libres pour la personnalisation des apprentissages en
      math\xE9matiques"
    id: 2081f380-2241-91ea-3b9e-b17b9538cdd7
    langid: coopmaths.
    language: ''
    name: Coopmaths
    sha256sum: e63ca2d238464c7dff7f9f721614d823d2fcb7ee5803268dd8461d5b8f304a13
    size: 393975808
    sw: y
    type: zim
    url: http://download.kiwix.org/zim/zimit/coopmaths_2022-12.zim
    version: '2022-12-13'
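
For illustration, a minimal sketch of a check that would flag such an entry before it reaches the imager. It assumes the catalog has already been loaded into a plain dict shaped like the entry above; `find_invalid_entries` and the `REQUIRED_FIELDS` list are hypothetical and not part of cardshop.

```python
# Hypothetical list of required fields; the real catalog schema may differ.
REQUIRED_FIELDS = ("id", "language", "name", "url", "sha256sum")


def find_invalid_entries(catalog):
    """Map each entry key to the required fields that are missing or empty."""
    problems = {}
    for key, entry in catalog.items():
        missing = [
            field
            for field in REQUIRED_FIELDS
            if not str(entry.get(field) or "").strip()
        ]
        if missing:
            problems[key] = missing
    return problems


# The entry from this ticket, reduced to the fields checked above.
catalog = {
    "coopmaths.": {
        "id": "2081f380-2241-91ea-3b9e-b17b9538cdd7",
        "language": "",
        "name": "Coopmaths",
        "sha256sum": "e63ca2d238464c7dff7f9f721614d823d2fcb7ee5803268dd8461d5b8f304a13",
        "url": "http://download.kiwix.org/zim/zimit/coopmaths_2022-12.zim",
    }
}

print(find_invalid_entries(catalog))  # {'coopmaths.': ['language']}
```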

kelson42 commented 1 year ago

It seems clear that something has to be fixed somewhere. We should keep the ticket open until it is clear where.

kelson42 commented 1 year ago

Will check with zimit, as we should not end up with a ZIM without a Language. Actually, I think the ZIM might have had an incorrect Language, and the library/ideascube generation script might have removed it because it was not valid.

A ticket should be opened immediately in Zimit/Warc2zim and probably fixed soon. But we should have other checks IMO. I wonder, for example, whether zimcheck detects this properly.

rgaudin commented 1 year ago

I confirm the scraper does set the Language tag, but it would keep whatever the user provided even if it was an incorrect ISO 639-3 code. Fixed in https://github.com/openzim/warc2zim/commit/cd69c6737e85030f6a755c5479917aa134cb672c
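
As a rough sketch of what such a check can look like (this is not the code from the commit above): it assumes the Language value is one or more comma-separated three-letter codes. `KNOWN_CODES` is a tiny illustrative subset, and `is_valid_language`, `language_or_default` and the `"eng"` default are hypothetical.

```python
import re

# Shape of a single ISO 639-3 code: exactly three lowercase ASCII letters.
ISO_639_3_SHAPE = re.compile(r"^[a-z]{3}$")

# Tiny illustrative subset; a real check would use the full ISO 639-3 table.
KNOWN_CODES = {"eng", "fra", "deu", "spa", "ara"}


def is_valid_language(value, known_codes=KNOWN_CODES):
    """Return True if every comma-separated code is a known ISO 639-3 code."""
    if not value.strip():
        return False
    codes = [code.strip() for code in value.split(",")]
    return all(
        ISO_639_3_SHAPE.match(code) is not None and code in known_codes
        for code in codes
    )


def language_or_default(value, default="eng"):
    """Keep the user-provided value only when it validates (assumed policy)."""
    return value if is_valid_language(value) else default


print(is_valid_language("fra"))           # True
print(is_valid_language("fr"))            # False: ISO 639-1, not ISO 639-3
print(language_or_default("coopmaths."))  # eng
```

Whether to fall back silently, as here, or to reject the recipe outright is exactly the policy question raised in this thread.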

I am not sure all scrapers do this kind of check; zimwriterfs certainly does not. I am not sure how strictly we want to enforce this either. That's debatable.

One of the culprits here is library-to-offspot, which parses the code from the kiwix library (to convert it to ISO 639-1); there was an edge case that resulted in an empty string. Fixed in https://github.com/kiwix/k8s/commit/5eb83e58a56ca13ccdbfe4d18f4b591d207f4487
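
To illustrate the failure mode (again, not the code from the commit above): a conversion that raises on an unknown code cannot silently emit an empty string. The mapping is a tiny illustrative subset, `to_iso_639_1` and `UnknownLanguageError` are hypothetical names, and using the first code of a multi-language value is an assumption.

```python
# Illustrative subset of an ISO 639-3 -> ISO 639-1 mapping.
ISO_639_3_TO_1 = {"eng": "en", "fra": "fr", "deu": "de", "spa": "es", "ara": "ar"}


class UnknownLanguageError(ValueError):
    """Raised when a language value cannot be converted to ISO 639-1."""


def to_iso_639_1(language):
    """Convert the first comma-separated ISO 639-3 code to ISO 639-1.

    Raises UnknownLanguageError instead of returning an empty string, so a
    bad value stops the catalog generation rather than leaking downstream.
    """
    first = language.split(",")[0].strip().lower()
    try:
        return ISO_639_3_TO_1[first]
    except KeyError:
        raise UnknownLanguageError(
            f"cannot convert language {language!r} to ISO 639-1"
        ) from None


print(to_iso_639_1("fra"))      # fr
print(to_iso_639_1("eng,fra"))  # en
```

Calling `to_iso_639_1("")` or `to_iso_639_1("coopmaths.")` raises `UnknownLanguageError`, which the generation script could log and use to skip the entry.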

We should definitely discuss the larger issue: how flexible should this metadata be? Can a user enter an incorrect Language value, and what should our tools do about it? Keep in mind that language codes can evolve (although very rarely).

I don't think it's wise to invest time in hardening cardshop/hotspot, given this will change with the OPDS switch. Being fed a valid catalog is a fair assumption IMO. Fixing the catalog would benefit multiple tools, so it makes more sense.

Popolechien commented 1 year ago

Should we pause zimfarm recipe operations until we have formal, validated training for users, and maybe implement a couple of protections in the code?

Since mwoffliner is half-broken and zimit isn't really reliable, I'd suggest we limit new ZIM files to YouTube videos and the like.

rgaudin commented 1 year ago

Rule #1 would be, in any case, to point new recipes to dev; only once everything has been validated should they be moved to the actual repo. If the ZIM took a very long time to create, we can move the file from dev to prod once it is green-lighted.