openzim / youtube

Create a ZIM file from a Youtube channel/username/playlist
GNU General Public License v3.0

Banner image is not taken into account #342

Open Popolechien opened 1 week ago

Popolechien commented 1 week ago

Looking at several of our latest YouTube channels, I see that the banner image is never included in the corresponding ZIM file.

See, e.g.:
Channel | YouTube | ZIM
S2 Underground | YT | S2 zim
Voice of America | YT | VoA zim
Canadian prepper | YT | Canadian prepper ZIM
Astrolabe | YT | Astrolabe zim

Edit: this happens both when an entire channel is selected and when playlists within a channel are selected.

benoit74 commented 1 week ago

Thank you for reporting, this is indeed a problem.

benoit74 commented 1 week ago

This is in fact "normal" behavior, see https://github.com/openzim/youtube/blob/0b9bed587f5a2cb890d272aee848b89cb2fb5fe8/CHANGELOG#L137-L139

We can check again whether Google has changed the API so that we can retrieve the banner again.

dan-niles commented 1 week ago

Unfortunately, Google's API still hasn't changed to support retrieving banners again. As an alternative, we could use something like BeautifulSoup to scrape the banner link from a channel page and use it. I tested it with the code below:

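import json

import requests
from bs4 import BeautifulSoup

channel_url = "https://www.youtube.com/@danasherniles"
response = requests.get(channel_url)
response.raise_for_status()  # stop early on HTTP errors
soup = BeautifulSoup(response.content, 'html.parser')

# The channel page embeds its data as "var ytInitialData = {...};" in a <script> tag
for script in soup.find_all("script"):
    text = script.string or ""
    if 'ytInitialData' in text:
        # Split only on the first ' = ' so the JSON payload itself is left intact
        json_data = json.loads(text.split(' = ', 1)[1].rstrip(';'))
        # The last entry in "sources" is used as the banner image
        banner_url = json_data["header"]["pageHeaderRenderer"]["content"]["pageHeaderViewModel"]["banner"]["imageBannerViewModel"]["image"]["sources"][-1]["url"]
        print(banner_url)
        break

benoit74 commented 5 days ago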

Is this really worth it? I'm not against this, but it does look a bit fragile. And setting the banner image manually is not "that hard", so this is probably not the highest-priority issue to solve. That being said, I'm not against someone proposing a PR on this issue, just wondering how much focus we should put on it.

kelson42 commented 9 hours ago

> Is this really worth it? I'm not against this, but it does look a bit fragile. And setting the banner image manually is not "that hard", so this is probably not the highest-priority issue to solve. That being said, I'm not against someone proposing a PR on this issue, just wondering how much focus we should put on it.

Yes, fragile... but seems to be the best we can do... and the banner is pretty important!