Closed rgaudin closed 4 years ago
Interesting. It goes into handle_unoptimized_files() only if unoptimized_dir is present. But then it fails inside that function as unoptimized_dir isn't present. Looking into it.
Possibly related: There had been a 5031 directory in 5030. It was removed in March.
It actually happens when unoptimized_dir contains only the HTML format of a book, since we process HTML before any other files. That processing now goes well. However, as the folder then contains no files, it gets deleted, but the scraper went on to execute the leftover code in handle_unoptimized_files(). (Fixed that with a simple return.)
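To illustrate the early return, here is a minimal sketch and not the scraper's actual code. The function name `handle_unoptimized_files_sketch`, its parameters, and the "move into static" step are all assumptions made up for the example; only the idea (stop once the HTML-only folder has been emptied and deleted) comes from the fix described above.

```python
import pathlib
import shutil
from typing import List


def handle_unoptimized_files_sketch(
    unoptimized_dir: pathlib.Path, static_dir: pathlib.Path
) -> List[str]:
    """Hypothetical stand-in for handle_unoptimized_files().

    Returns the names of the files it handled.
    """
    processed = []

    # HTML is handled before any other format.
    for html in sorted(unoptimized_dir.glob("*.html")):
        shutil.move(str(html), str(static_dir / html.name))
        processed.append(html.name)

    # If HTML was the only content, the folder is now empty: delete it
    # and return, instead of running the leftover code on a missing dir.
    if not any(unoptimized_dir.iterdir()):
        unoptimized_dir.rmdir()
        return processed

    # Leftover code: handle the remaining formats (assumed to be a plain move).
    for other in sorted(unoptimized_dir.iterdir()):
        shutil.move(str(other), str(static_dir / other.name))
        processed.append(other.name)
    return processed
```

Without the `return`, the last loop would call `iterdir()` on a directory that no longer exists and raise `FileNotFoundError`, which matches the failure seen in the run.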
Another related problem arises for the other formats: if the optimized HTML file is already present in static, the scraper would go on to process the other formats. However, the files for those formats are not there (as unoptimized_dir only contains HTML). So I fixed this by first checking for the source files of the other formats and only then processing them. (This also prevents a failure if, for some reason, a specific format file is not available in unoptimized_dir, either because it was downloaded from cache or because the download failed.)
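The "check the source files first" part could look roughly like the sketch below. The helper name `formats_to_process` and the `<book_id>.<format>` file naming are assumptions for illustration only; the real scraper may name and locate its files differently.

```python
import pathlib
from typing import List


def formats_to_process(
    unoptimized_dir: pathlib.Path, book_id: int, formats: List[str]
) -> List[str]:
    """Return only the formats whose source file exists in unoptimized_dir.

    Hypothetical helper: assumes sources are named "<book_id>.<format>".
    """
    available = []
    for fmt in formats:
        src = unoptimized_dir / f"{book_id}.{fmt}"
        # Skip formats with no source file (HTML-only folder, cache hit,
        # or failed download) instead of failing later on a missing file.
        if src.exists():
            available.append(fmt)
    return available
```

The caller would then loop over the returned list only, so a folder that holds just the HTML file yields nothing to process for the other formats.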
Latest zimfarm run failed due to a missing file