kiwix / operations

Kiwix Kubernetes Cluster
http://charts.k8s.kiwix.org/

Zimfarm task should fail if upload fails #192

Open Popolechien opened 1 month ago

Popolechien commented 1 month ago

I created a ZIM file in Arabic whose run is now complete, but it does not appear in dev.library.kiwix.org.

Not sure it's related to #181 one way or the other, but here we are.

Popolechien commented 1 month ago

Ah, and when trying to download https://mirror.download.kiwix.org/zim/.hidden/dev/wearesaudis.com_ar_all_2024-05.zim I get a 404 error.

benoit74 commented 1 month ago

Having a space at the beginning of ZIM name is probably not a good idea 🤷🏻‍♂️

Proper download link is https://mirror.download.kiwix.org/zim/.hidden/dev/%20wearesaudis.com_ar_all_2024-05.zim

This emphasizes again the need to put much stronger constraints on ZIM names ^^

benoit74 commented 1 month ago

But proper link also gives a 404 ... looks like ZIM is lost somewhere

benoit74 commented 1 month ago

Upload of the ZIM simply failed:

[2024-05-10 01:44:05,821: WARNING] Upload failed: 1 attempts remaining.
[2024-05-10 01:44:05,821: INFO] Pausing for 180s
[2024-05-10 01:47:05,844: INFO] Starting upload of /data/ wearesaudis.com_ar_all_2024-05.zim to sftp://uploader@warehouse.farm.openzim.org:30122/zim/.hidden/dev/ wearesaudis.com_ar_all_2024-05.zim
[2024-05-10 01:47:05,845: INFO] Executing: /usr/bin/sftp -i /etc/ssh/keys/zimfarm -b /tmp/tmp6wfftk8y.txt -o GlobalKnownHostsFile /etc/ssh/known_hosts -c aes128-ctr sftp://uploader@warehouse.farm.openzim.org:30122/zim/.hidden/dev/
[2024-05-10 01:47:06,823: ERROR] sftp failed returning 1:: sftp> put /data/ wearesaudis.com_ar_all_2024-05.zim  wearesaudis.com_ar_all_2024-05.zim.tmp
/data/ is not a regular file

ZIM is simply gone ...

Please fix the recipe and run it again
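
For readers wondering why sftp reports that `/data/` is not a regular file: the `put` line in the batch file is split on whitespace, so the leading space in the filename cuts the local path in two. The sftp man page says pathnames containing spaces must be enclosed in quotes. The sketch below uses Python's `shlex` as a stand-in for sftp's own tokenizer (an assumption, the two parsers are not identical), and `sftp_put_line` is a hypothetical helper, not Zimfarm code.

```python
import shlex


def sftp_put_line(local_path: str, remote_name: str) -> str:
    """Build one `put` line for an sftp batch file, quoting both paths.

    Hypothetical helper for illustration; shlex.quote is assumed to be
    close enough to the quoting that sftp accepts.
    """
    return f"put {shlex.quote(local_path)} {shlex.quote(remote_name)}"


# The unquoted line from the log above: the space after /data/ splits the path,
# so the first path argument seen by the tokenizer is the directory itself.
unquoted = (
    "put /data/ wearesaudis.com_ar_all_2024-05.zim"
    "  wearesaudis.com_ar_all_2024-05.zim.tmp"
)
print(shlex.split(unquoted))  # four tokens, the second one is "/data/"

# With quoting, the leading space stays inside each path argument.
quoted = sftp_put_line(
    "/data/ wearesaudis.com_ar_all_2024-05.zim",
    " wearesaudis.com_ar_all_2024-05.zim.tmp",
)
print(shlex.split(quoted))  # three tokens: put, local path, remote name
```

Quoting would only make such an upload possible. Whether filenames with spaces should be accepted at all is a separate question.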

benoit74 commented 1 month ago

Closing this as "won't fix" since this is not supposed to happen, and the proper solution to prevent it from happening is tracked in https://github.com/openzim/zimfarm/issues/783 anyway.

Popolechien commented 1 month ago

Re-opening this ticket as I get the same issue with https://farm.openzim.org/pipeline/1e280982-39dd-4844-892f-d164e5df1204 (my second run actually, recipe is here: https://farm.openzim.org/recipes/bmi.gv.at_de_all )

rgaudin commented 1 month ago

@Popolechien ZIM filenames containing spaces are not allowed in the Zimfarm (such files cannot be uploaded). See ZIMs-Naming-Convention.
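
As a rough illustration of the kind of check that would catch this before a run starts, here is a minimal sketch. The rule it encodes (no whitespace, no path separators, `.zim` extension) is an assumption made for this example. It is not the full naming convention, and `is_uploadable_zim_filename` is a hypothetical name.

```python
import re

# Assumed rule for illustration only: one or more characters that are neither
# whitespace nor path separators, followed by the ".zim" extension.
ZIM_FILENAME_RE = re.compile(r"[^\s/\\]+\.zim")


def is_uploadable_zim_filename(filename: str) -> bool:
    """Return True if the filename has no whitespace and ends in .zim."""
    return ZIM_FILENAME_RE.fullmatch(filename) is not None


# The filename from this ticket starts with a space and is rejected.
print(is_uploadable_zim_filename(" wearesaudis.com_ar_all_2024-05.zim"))
# The same name without the leading space is accepted.
print(is_uploadable_zim_filename("wearesaudis.com_ar_all_2024-05.zim"))
```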

kelson42 commented 1 month ago

@rgaudin @benoit74 The answer given to @Popolechien seems too short IMHO. If a ZIM file cannot be handled in the library, we need a clear error, even if the root cause is somewhere around a scraper or a recipe.

kelson42 commented 1 month ago

Maybe the solution is to put a task in fail status if ZIM file(s) are not successfully uploaded?
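
To make the proposal concrete, a minimal sketch of the rule could look like the following. Everything here is hypothetical: `UploadResult`, `final_task_status`, and the status strings are invented for illustration and do not reflect how the Zimfarm actually models tasks.

```python
from dataclasses import dataclass


@dataclass
class UploadResult:
    """Outcome of uploading one ZIM file (hypothetical structure)."""

    filename: str
    succeeded: bool


def final_task_status(scraper_exit_code: int, uploads: list[UploadResult]) -> str:
    """Derive the task status from the scraper result and the upload outcomes."""
    if scraper_exit_code != 0:
        return "failed"
    # Proposed rule: a single failed upload is enough to fail the whole task,
    # even though the scraper itself exited cleanly.
    if any(not upload.succeeded for upload in uploads):
        return "failed"
    return "succeeded"


# The situation in this ticket: the scraper finished, the upload did not.
uploads = [UploadResult(" wearesaudis.com_ar_all_2024-05.zim", succeeded=False)]
print(final_task_status(0, uploads))
```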

rgaudin commented 1 month ago

The Zimfarm clearly shows that it created a ZIM but did not upload it. Hence the ticket. The root cause is that we don't allow uploading ZIMs with spaces in their names. @benoit74 mentioned it, but the recipe was run again without being changed… We already have a ticket, which @benoit74 mentioned, about enforcing input validation.

rgaudin commented 1 month ago

Maybe the solution is to put a recipe in fail status if ZIM files are not successfully uploaded?

We already have a ticket for this

rgaudin commented 1 month ago

What is there to do at infra level (this is k8s repo) about this?

kelson42 commented 1 month ago

@rgaudin It does not sound wrong to open this issue here at first. If the discussion leads to an agreement that something has to be changed in Zimfarm, then we will obviously move it. We have not reached this stage yet IMHO.

Many reasons can lead to an upload failure; the closed tickets at Zimfarm give many examples. It is misleading to let the Zimfarm user believe that things went well. @benoit74 Could we agree on this?

benoit74 commented 3 weeks ago

The fact that a Zimfarm task should fail if the upload fails is already tracked in https://github.com/openzim/zimfarm/issues/684

The fact that we should prevent the creation of ZIM files with bad names is tracked in https://github.com/openzim/zimfarm/issues/783

I'm strongly supportive of these two issues. One of them is prio1 (will be solved soon, hopefully); the other one still seems to be at the "question" stage.

I don't see what's left to be done on the kiwix/operations side.