Open benoit74 opened 2 days ago
We have encountered this in the past (started in 2022).
As you guessed, the problem happens when libzim tries to read a ZIM that is still being transferred to the FS. Given this is libzim crashing on a ZIM, I think it's wise to keep it as an Error that crashes the refresh. We get a clear event and log, and it recovers by itself in a future job.
What we should do though is what I suggested in that initial comment: move the file with a temp name to its final folder (mount point) and only then rename it to .zim.
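To make the idea concrete, here is a minimal Python sketch of the temp-name-then-rename approach. It is not the actual uploader or library-generation code: `publish_zim`, `list_zims` and the `.tmp` suffix are hypothetical names picked for illustration, and it assumes the temp file is written on the same filesystem as the final folder, which is what makes the rename atomic.

```python
import os
import shutil
from pathlib import Path


def publish_zim(src: Path, dest_dir: Path) -> Path:
    """Copy src into dest_dir under a temp name, then rename it to its final name."""
    final_path = dest_dir / src.name
    # Hypothetical suffix: anything not ending in ".zim" works, as long as
    # the library job only looks at "*.zim".
    tmp_path = dest_dir / (src.name + ".tmp")

    # The slow part (the transfer) happens under the temp name.
    with open(src, "rb") as fin, open(tmp_path, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()
        os.fsync(fout.fileno())  # make sure the data is on disk before renaming

    # Rename within the same folder, hence the same filesystem: readers see
    # either no .zim file at all or the complete one.
    os.replace(tmp_path, final_path)
    return final_path


def list_zims(folder: Path) -> list[Path]:
    """What a library job would pick up: only fully published *.zim files."""
    return sorted(folder.glob("*.zim"))
```

With this, a library generation job that runs during the transfer simply does not see the new ZIM and picks it up on its next run, instead of handing a partial file to libzim.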
To me this is getting more frequent because we are creating more large ZIMs, and the library generation is faster and thus runs a lot more often than it used to.
This happened twice in a row for the new dev/benyehuda.org_he_all_2024-10.zim ZIM. It then worked successfully. I'm quite sure it means that the library generation job is not nicely handling the case where the file is not yet fully uploaded while the job is running. This file is pretty big (22G). It is however a bit surprising that we never encountered this situation before. Something probably changed quite recently ...