openzim / youtube

Create a ZIM file from a Youtube channel/username/playlist
GNU General Public License v3.0

biologycourses_en_all: PIL.UnidentifiedImageError: cannot identify image file '/output/xxxx/banner.jpg' #344

Closed by benoit74 2 months ago

benoit74 commented 2 months ago

We have a weird error in https://farm.openzim.org/pipeline/caa0a5fb-5429-4bbe-b4ee-400cfc47b311/debug

@Popolechien reconfigured the recipe (which was working) with --mainColor #FFFFFF, and we got a more or less unrelated error:

[youtube2zim::2024-09-23 07:18:10,522] INFO:checking your branding files and values
[youtube2zim::2024-09-23 07:18:11,328] ERROR:Interrupting process due to error: cannot identify image file '/output/tmp2mr3_m9m/banner.jpg'
[youtube2zim::2024-09-23 07:18:11,328] ERROR:cannot identify image file '/output/tmp2mr3_m9m/banner.jpg'
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/youtube2zim/scraper.py", line 316, in run
    self.check_branding_values()
  File "/usr/local/lib/python3.12/site-packages/youtube2zim/scraper.py", line 561, in check_branding_values
    resize_image(self.banner_path, width=1060, height=175, method="thumbnail")
  File "/usr/local/lib/python3.12/site-packages/zimscraperlib/image/transformation.py", line 30, in resize_image
    with pilopen(src) as image:
         ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/PIL/Image.py", line 3498, in open
    raise UnidentifiedImageError(msg)
PIL.UnidentifiedImageError: cannot identify image file '/output/tmp2mr3_m9m/banner.jpg'

Popolechien commented 2 months ago

Wait, could it be because I also tried to add a banner image? I have removed the image link and restarted the recipe.

Popolechien commented 2 months ago

Bingo, that worked: the problem was with the banner image. I'm not sure how to test whether the issue was the image, the repo, the link, the recipe, or the scraper.

benoit74 commented 2 months ago

Thank you, I didn't notice this. I will have a look then.

benoit74 commented 2 months ago

OK, so the problem is that you used https://drives.kiwix.org/view/Corrected%20Logos%20for%20recipes/biology_courses_banner.jpg. This URL is only valid for humans: it does not return the image itself but an HTML page with the image on it, which is why PIL could not identify the downloaded banner.jpg. For machines (i.e. the Zimfarm), you still need to use the old URL from drive.farm.openzim.org (https://drive.farm.openzim.org/Corrected%20Logos%20for%20recipes/biology_courses_banner.jpg).

In other words: upload to drives.kiwix.org, then grab the link from drive.farm.openzim.org.
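
To check a banner or logo link before putting it in a recipe, something like the following could help. It is only a minimal sketch using the Python standard library, not part of youtube2zim or zimscraperlib: the function names (sniff_image_format, looks_like_html, check_banner_url) and the short list of signatures are assumptions made for illustration. It downloads the first bytes behind a URL and reports whether they look like an image or like an HTML page.

```python
import urllib.request
from typing import Optional

# Leading bytes ("magic numbers") of a few common image formats.
# Hypothetical short list for illustration; PIL supports many more formats.
IMAGE_SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "gif": b"GIF8",
}


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return the image format suggested by the leading bytes, or None."""
    for name, signature in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def looks_like_html(data: bytes) -> bool:
    """Rough check for an HTML page served in place of the image."""
    head = data[:512].lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


def check_banner_url(url: str, timeout: float = 10.0) -> str:
    """Download the first bytes behind `url` and describe what they are."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type", "unknown")
        head = response.read(512)
    fmt = sniff_image_format(head)
    if fmt:
        return f"image ({fmt}), Content-Type: {content_type}"
    if looks_like_html(head):
        return f"HTML page, not an image, Content-Type: {content_type}"
    return f"unrecognized content, Content-Type: {content_type}"
```

Calling check_banner_url() on both links above should make the difference visible, assuming both servers are reachable and behave as described in this thread: a link that returns an HTML viewer page is reported as such instead of failing later in resize_image.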