kiwix / operations

Kiwix Kubernetes Cluster
http://charts.k8s.kiwix.org/

Library generation fails with `Dirent pointer table outside (or not fully inside) ZIM file` #283

Open benoit74 opened 2 days ago

benoit74 commented 2 days ago
Installed /usr/local/bin/library-maint from https://raw.githubusercontent.com/kiwix/k8s/main/zim/library-mgmt/library-maint.py
starting…
2024-10-13 05:00:32,347 INFO Starting library-maint for read, write-libraries
2024-10-13 05:00:32,347 INFO [READ] Loading previous Public Library
2024-10-13 05:00:32,347 WARNING [READ] Unbale to read previous library. Purging disabled.
2024-10-13 05:00:32,437 DEBUG [READ] 0000 dev/Bundesministerium_fuer_Inneres_2024-07.zim
2024-10-13 05:00:32,439 DEBUG [READ] 0001 dev/africanstorybook.org_mul_all_newui_2023-10.zim
2024-10-13 05:00:32,443 DEBUG [READ] 0002 dev/alexandria.dk_en_all_2024-10.zim
2024-10-13 05:00:32,445 DEBUG [READ] 0003 dev/ancient.eu_en_all_2024-08.zim
2024-10-13 05:00:32,451 DEBUG [READ] 0004 dev/api.plos.org_en_all_2024-08.zim
2024-10-13 05:00:32,453 DEBUG [READ] 0005 dev/ashevillerelief.com_en_all_2024-10.zim
2024-10-13 05:00:32,455 DEBUG [READ] 0006 dev/avanti-3dimensional-geometry_2024-10.zim
2024-10-13 05:00:32,537 DEBUG [READ] 0007 dev/banrepcultural.org_es_enciclopedia_2024-09.zim
2024-10-13 05:00:32,539 DEBUG [READ] 0008 dev/benyehuda.org_he_all_2024-10.zim
2024-10-13 05:00:32,540 ERROR FAILED. An error occurred: Dirent pointer table outside (or not fully inside) ZIM file.
2024-10-13 05:00:32,540 ERROR Dirent pointer table outside (or not fully inside) ZIM file.
Traceback (most recent call last):
  File "/usr/local/bin/library-maint", line 948, in entrypoint
    sys.exit(maint.run())
  File "/usr/local/bin/library-maint", line 747, in run
    self.readfs()
  File "/usr/local/bin/library-maint", line 514, in readfs
    entry = self.read_zimfile_info(
  File "/usr/local/bin/library-maint", line 458, in read_zimfile_info
    zim = Archive(fpath)
  File "libzim/libzim.pyx", line 717, in libzim.Archive.__cinit__
RuntimeError: Dirent pointer table outside (or not fully inside) ZIM file.
Stream closed EOF for zim/dev-library-generator-28813260-vhsbr (debian)

This happened twice in a row for the new dev/benyehuda.org_he_all_2024-10.zim ZIM; the next run then succeeded. I'm quite sure this means the library generation job does not gracefully handle a file that is still being uploaded when the job runs. This ZIM is pretty big (22G). It is a bit surprising, however, that we never encountered this situation before. Something probably changed quite recently...

rgaudin commented 2 days ago

We have encountered this in the past (started in 2022).

As you guessed, the problem happens when libzim tries to read a ZIM that is still being transferred onto the FS. Given that this is libzim crashing on a ZIM, I think it's wise to keep it as an error that crashes the refresh. We get a clear event and log, and it recovers on its own in a later job.

What we should do, though, is what I suggested in that initial comment: move the file under a temporary name into its final folder (mount point), and only then rename it to .zim.
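
A minimal sketch of that idea, not the actual uploader or library-maint code. The `publish_zim` and `list_ready_zims` helpers and the `.tmp` suffix are hypothetical names chosen for illustration. It assumes the temporary file is written in the same folder (and therefore on the same filesystem) as the final file, so that the rename is atomic on POSIX and a reader never sees a partially written `.zim`.

```python
import os
import shutil
from pathlib import Path


def publish_zim(src: Path, dest_dir: Path) -> Path:
    """Copy src into dest_dir under a temporary name, then rename it to its final name."""
    final_path = dest_dir / src.name
    # Hypothetical convention: "<name>.zim.tmp" while the transfer is in progress.
    tmp_path = dest_dir / (src.name + ".tmp")
    with open(src, "rb") as fin, open(tmp_path, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()
        os.fsync(fout.fileno())  # make sure the data is on disk before exposing it
    # Atomic on POSIX when both paths are on the same filesystem.
    os.replace(tmp_path, final_path)
    return final_path


def list_ready_zims(folder: Path) -> list[Path]:
    """Return only fully published files: names that end in .zim."""
    return sorted(p for p in folder.iterdir() if p.suffix == ".zim")
```

With this convention, a scan that only picks up names ending in `.zim` skips in-flight transfers without any change to how libzim errors are handled.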

To me, this is getting more frequent because we are creating more large ZIMs, and because library generation is now faster and thus runs a lot more often than it used to.