kiwix / operations

Kiwix Kubernetes Cluster
http://charts.k8s.kiwix.org/

library.kiwix.org is not updated anymore from new ZIMs #181

Closed Popolechien closed 2 months ago

Popolechien commented 2 months ago

I see that https://download.kiwix.org/zim/wikipedia/wikipedia_ru_all_maxi_2024-04.zim has been completed and uploaded to download.kiwix.org, but the older ZIM (2023-11) is still there. The new file is also not available on library.kiwix.org.

This is only for the maxi flavour, however. Both mini and nopic are there.

benoit74 commented 2 months ago

The library update job has indeed been failing for a few hours.

The problem is linked to the creation of zimit/youscribe_fr_primaire_2024-04.zim while the previous version of youscribe_fr_primaire is at other/youscribe_fr_primaire_2023-03.zim.

Someone (@RavanJAltaie?) obviously changed the warehouse path from other to zimit. Is it intentional? Should I move all youscribe_fr_primaire ZIMs to the zimit warehouse path? All youscribe ZIMs? Please explain what is intended here.

In the meantime, I archived the new zimit/youscribe_fr_primaire_2024-04.zim so that the library can be updated again (restoring it will be immediate, I just need precise instructions, and there is no need to run the farm recipe again) and restarted the update job. The library should be updated in roughly 15 to 20 minutes from now if there is no other issue.

benoit74 commented 2 months ago

For the techies, the log was

2024-04-09 12:54:59,852 DEBUG [READ] 6846 other/youscribe_fr_college_2020-02.zim
2024-04-09 12:54:59,854 DEBUG [READ] 6847 other/youscribe_fr_lycee_2023-03.zim
2024-04-09 12:55:00,001 DEBUG [READ] 6848 other/youscribe_fr_lycee_2020-02.zim
2024-04-09 12:55:00,003 DEBUG [READ] 6849 zimit/youscribe_fr_primaire_2024-04.zim
2024-04-09 12:55:00,220 DEBUG >> is update youscribe_fr_primaire: 89c6e919-f9bc-c8e8-f82b-e9c00be55c9f
2024-04-09 12:55:00,220 DEBUG [READ] 6850 other/youscribe_fr_primaire_2023-03.zim
2024-04-09 12:55:00,225 ERROR FAILED. An error occurred: 'id'
2024-04-09 12:55:00,226 ERROR 'id'
Traceback (most recent call last):
  File "/usr/local/bin/library-maint", line 943, in entrypoint
    sys.exit(maint.run())
  File "/usr/local/bin/library-maint", line 742, in run
    self.readfs()
  File "/usr/local/bin/library-maint", line 527, in readfs
    logger.debug(f">> is update {alias}: {entry['id']}")
KeyError: 'id'

The problem is that we assume the list of ZIMs to be alphabetically ordered and, to save processing time, we read only the data of the first ZIM for a given alias. I.e. zimit/youscribe_fr_primaire_2024-04.zim should have been processed after other/youscribe_fr_primaire_2023-03.zim, not before.

@rgaudin is this a known limitation of the library update job or should I log a ticket? (It might even be intentional; I don't think we want to update the warehouse path without moving old ZIMs.)
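To illustrate the condition that triggers this, here is a minimal sketch of a pre-flight check that flags any alias whose ZIMs are spread over more than one warehouse folder. This is not the library-maint code: the `<folder>/<alias>_<YYYY-MM>.zim` path pattern and the names `ZIM_RE`, `folders_by_alias` and `split_aliases` are assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumed path layout: "<folder>/<alias>_<YYYY-MM>.zim" (hypothetical pattern)
ZIM_RE = re.compile(r"^(?P<folder>[^/]+)/(?P<alias>.+)_(?P<period>\d{4}-\d{2})\.zim$")


def folders_by_alias(paths):
    """Map each alias to the set of warehouse folders it appears in."""
    folders = defaultdict(set)
    for path in paths:
        match = ZIM_RE.match(path)
        if match is None:
            continue  # skip anything that does not follow the assumed layout
        folders[match["alias"]].add(match["folder"])
    return folders


def split_aliases(paths):
    """Return the aliases whose ZIMs live in more than one folder."""
    return {
        alias: sorted(found)
        for alias, found in folders_by_alias(paths).items()
        if len(found) > 1
    }


# Paths taken from the log above
paths = [
    "other/youscribe_fr_lycee_2023-03.zim",
    "other/youscribe_fr_lycee_2020-02.zim",
    "zimit/youscribe_fr_primaire_2024-04.zim",
    "other/youscribe_fr_primaire_2023-03.zim",
]
print(split_aliases(paths))  # {'youscribe_fr_primaire': ['other', 'zimit']}
```

A check along these lines could report the offending alias by name before the update job starts, instead of surfacing as a KeyError halfway through.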

benoit74 commented 2 months ago

Incident is resolved, library is now updated again.

rgaudin commented 2 months ago

It's prohibited to change the warehouse path of an existing ZIM without informing operations. The content team knows this, but since it's not frequent, it has probably been forgotten.

That's a known limitation of the script from day 1.

kelson42 commented 2 months ago

It is concerning that things fail silently or at least without a very clear (alarm) message. We need to open a ticket to find a better solution.

rgaudin commented 2 months ago

It doesn't fail silently. Jobs are in error. We just don't have alarms for this, and we definitely should.

Popolechien commented 2 months ago

Reopening this ticket as I have the exact same issue with https://farm.openzim.org/recipes/wikipedia_ja_medicine: it was updated more than 6 hours ago, but only the old ZIM(s) are available in the library.

benoit74 commented 2 months ago

Same cause, same consequence.

Previously we had zimit/editions-ganndal_fr_fo-livres_2024-04.zim and now we have other/editions-ganndal_fr_fo-livres_2024-04.zim.

I archived the offending other/editions-ganndal_fr_fo-livres_2024-04.zim and opened https://github.com/openzim/zim-requests/issues/962

@RavanJAltaie @Popolechien you need to remember that it is NOT possible to change a ZIM warehouse folder in Zimfarm without first asking devs to move existing ZIMs to the new warehouse folder. A request has to be opened in zim-requests to ask for the change.

rgaudin commented 2 months ago

This ZIM should NOT be in the public library at the moment. Its content is not free. And I don't understand how it ended up in the zimit folder since it's not using the zimit scraper...

Popolechien commented 2 months ago

Yeah, I'm a little surprised that ZIMs should move like this. I certainly haven't touched it, and with @RavanJAltaie being away most of the past couple of weeks I doubt she did (ganndal is quite specific and she did not take part in its creation). I suppose we don't have a log of operations?

rgaudin commented 2 months ago

No, we don't. It's a long-standing feature request though. To be fair, I should have disabled the recipe (which was previously in dev) when we agreed it would not go public. I see that it is disabled at the moment, so someone did that. That's mysterious!

kelson42 commented 2 months ago

What is the measure/issue to ensure tech people know about a failure before end users notice?

benoit74 commented 2 months ago

What is the measure/issue to ensure tech people know about a failure before end users notice?

https://github.com/kiwix/k8s/issues/182

RavanJAltaie commented 2 months ago

Unfortunately, I'm the one who changed the warehouse paths for both Ganndal and Youscribe Primaire. I didn't know that changing a warehouse path would create a bug and that operations should be informed, as this is the first time I have made such a change. I'm aware now and will take these rules into account. @benoit74 I see that you've disabled the Youscribe Primaire recipe, is that for fixing the bug?

RavanJAltaie commented 2 months ago

On another hand I have the Spanish version of Marxist.org succeeded and pushed to the library but it doesn't show up in the library, is that because of the same bug? @benoit74

benoit74 commented 2 months ago

@benoit74 I see that you've disabled the Youscribe Primaire recipe, is that for fixing the bug?

Yes, see https://github.com/openzim/zim-requests/issues/961

On another hand I have the Spanish version of Marxist.org succeeded and pushed to the library but it doesn't show up in the library, is that because of the same bug?

Yes, I thought I had fixed the issue with editions-ganndal_fr_fo, but in fact the library update is still failing due to the comeback of zimit/youscribe_fr_primaire_2024-04.zim. I've archived the ZIM again.