Open holta opened 1 month ago
2024-06-15 14:11:34 - [Debug] [https://www.youtube.com/watch?v=5BO9nhtF0Cc]: Unrecoverable error matched. [download] Analyse du 15 juin 2024: Controverse autour de la nomination du nouveau directeur de cabinet de F… does not pass filter (live_status=?not_live);skipping ..
The error suggests the video is still a live video. On YouTube, the video is advertised as a finished live stream.
I bet the live_status column in the media table (xklb-metadata) says it's still live. For the moment, what I understand is that the video needs to pass the match filter per:
Progress!
@nzola please also take note of whether recently "live" YouTube videos download OR do not download, in case there's a clear pattern as to which succeed and which fail?
@holta @deldesir I understand that these videos were live when I tried to download them??? I will try downloading them again tomorrow, Sunday and give the report.
@deldesir I tried to download this playlist now: https://www.youtube.com/@TOPCONGOFM/playlists It failed.
Just remembered. I downloaded the same playlist yesterday, it failed
PUBLISHING TO URL... https://dpaste.com/H4SYGYT8U
@nzola This is a playlist of playlists. I am trying to download it right now on my side (LRN2). Although only a subset of 100 videos will be attempted, the metadata fetch is going to take very long, as it gathers metadata for all videos in every playlist found. I will report back on errors found. This may help me understand the issues you have encountered.
ASIDE: Of course YouTube videos that are truly "live" (happening at that moment) are not possible to download until after the live event is complete.
Downloading https://www.youtube.com/@TOPCONGOFM/playlists failed due to an unavailable video (https://www.youtube.com/watch?app=desktop&v=n_iRE9al044). This needs a closer look. Thanks @Nzola for having reported on this.
NON-URGENT:
"Tasks" view eventually needs to use clearer language than bureaucratic phrases like "unavailable video".
So regular teachers in all countries know what this really means.
Ideally with actionable suggestions (in those cases where that's realistic!)
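A minimal sketch of what such clearer messaging could look like, assuming the "Tasks" view has access to the raw error text. The function name, the table, and the wording of the friendly messages are all hypothetical; only the raw fragments are taken from this thread.

```python
# Hypothetical lookup table: raw error fragment -> plain-language message.
# The fragments appear in this thread; the friendly wording is illustrative only.
FRIENDLY_MESSAGES = [
    (
        "does not pass filter (live_status=?not_live)",
        "This video is, or once was, a live stream, and live streams are "
        "currently skipped. Try a regular (non-live) video instead.",
    ),
    (
        "unavailable video",
        "YouTube reports that this video is unavailable. Open the link in a "
        "browser to check whether it was removed or made private.",
    ),
]


def friendly_message(raw_error: str) -> str:
    """Return a plain-language explanation for a raw error string."""
    lowered = raw_error.lower()
    for fragment, message in FRIENDLY_MESSAGES:
        if fragment.lower() in lowered:
            return message
    # Unknown errors fall through with the raw text kept for debugging.
    return "The download failed for an unknown reason. Details: " + raw_error


if __name__ == "__main__":
    print(friendly_message(
        "[download] Some title does not pass filter (live_status=?not_live);skipping .."
    ))
```

Keeping the raw text in the fallback message means nothing is hidden from whoever files the bug report.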
I tried again to download these 7 videos now, but they still failed. I did not do any upgrade or new install. PUBLISHING TO URL... https://dpaste.com/F6S6YX798
Curiously the screenshot shows no error message / explanation at all, during this 2nd attempt.
(Definitely room for improvement, thanks @nzola.)
The error message will be displayed now, but the videos will not download. Live videos (finished or not) are not downloadable with xklb by design.
By whose design?
FYI this design assumption seems extremely weak, given how videos labeled as "live" are actually used:
The error message will be displayed now
@deldesir clarify which PR and code improved error reporting here?
Thanks, please if possible!
Per adjustments made in PR https://github.com/iiab/calibre-web/pull/194
@deldesir please try to find a legit way to tell if a YouTube video is actually live or not.
(Instead of the bogus information that we're currently using, which is as good as useless given the fact that so many podcasters permanently leave all their episodes marked as "live" ...rather intentionally... as permanently labeling videos as "live" serves as de facto marketing, it would appear!)
If something like this can be upstreamed to become a part of xklb, even better!
@deldesir please test & use these URLs to make sure forward progress is steady in coming days. Thank you to everyone working on this very common and very serious problem:
[ ADDITIONAL TEST CASES BELOW E.G. FOR "NOT YET LIVE" YOUTUBE URL'S! ]
The error message will be displayed now, but the videos will not download. Live videos (finished or not) are not downloadable with xklb by design.
Ok. I understand. Thank you.
@nzola @avni
Until we solve this serious problem properly...
@deldesir believes that an initial hack/workaround should confirm the path forward, using yt-dlp options like...
Ok @holta
@nzola experienced many such errors.
Videos that were once live appear to be erroneously blocked by IIAB Calibre-Web.
Example of a video that should be downloading, but fails to download:
- https://youtu.be/rbEsoe8F-l4
- https://www.youtube.com/live/rbEsoe8F-l4 (false positive possibly b/c YouTube prefers this "live" link???)
@nzola mentioned:
Downloaded these playlists [and] thumbnails without any problems:
- https://www.youtube.com/channel/UCX9j__vYOJu00iqBrCzecVw
- https://www.youtube.com/playlist?list=PL1mP_vkqPB7EsIqqfwcGsg2rQNzoVy0mk
But cannot download the following single videos:
- https://www.youtube.com/watch?v=BK0XGf20l84
- https://www.youtube.com/watch?v=VCM8tg_mGSw
- https://www.youtube.com/watch?v=w8snrdaoTUs&t=2s
- https://www.youtube.com/watch?v=5BO9nhtF0Cc
- https://www.youtube.com/watch?v=rbEsoe8F-l4&t=7788s
- https://www.youtube.com/watch?v=Drec4XAMJzI&t=6737s
- https://www.youtube.com/watch?v=w8snrdaoTUs&t=7s
VM's iiab-diagnostics: https://dpaste.com/6H8F53GPQ
@deldesir: Any idea what's happening?
JFYI: I downloaded all these videos from the command line with yt-dlp.
@deldesir
@nzola suggests the challenge is not downloading "formerly live" YouTube videos...
But rather, the challenge appears to be (programmatically, reliably) identifying "formerly live" YouTube videos...
Can you confirm?
iiab-diagnostics: https://dpaste.com/7XBX3XZE2
I can replicate the error, and see, "failed to download [download] ... does not pass filter (live_status=?not_live); skipping" when downloading:
@deldesir believes that an initial hack/workaround should confirm the path forward, using yt-dlp options like...
It's not clear to me what the workaround is or what I need to do to test. Is the ask to use the command line and test the download with yt-dlp?
@deldesir should outline a suggested path forward within 24h:
As mentioned above, downloading "formerly live" videos via yt-dlp seems to work fine, no options/flags required.
The real issue seems to be automatically distinguishing between "truly live" YouTube videos and "formerly live" YouTube videos.
And then presumably pushing that functionality upstream, e.g. into xklb?
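As a rough sketch of the distinction described above: yt-dlp exposes a live_status field in each video's metadata, and the categories in this thread map onto its values. This assumes the field is accurate for the videos in question, which this thread has not yet confirmed. The function name and category labels are illustrative.

```python
from typing import Optional

# live_status values that yt-dlp can report for a video.
# "post_live" means the stream has ended but is not yet fully processed.
FORMERLY_LIVE = {"was_live", "post_live"}


def describe_live_status(live_status: Optional[str]) -> str:
    """Map a yt-dlp live_status value to the categories used in this thread."""
    if live_status == "is_upcoming":
        return "not yet live"
    if live_status == "is_live":
        return "truly live"
    if live_status in FORMERLY_LIVE:
        return "formerly live"
    if live_status == "not_live":
        return "regular video"
    # The field can be missing when the extractor could not determine it.
    return "unknown"


if __name__ == "__main__":
    for value in ("is_upcoming", "is_live", "was_live", "not_live", None):
        print(value, "->", describe_live_status(value))
```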
As mentioned above, downloading "formerly live" videos via yt-dlp seems to work fine, no options/flags required.
Here's the output from my terminal as confirmation. I don't know how to view the webm files, but I presume the downloads succeeded based on the output.
ubuntu@box:~$ sudo yt-dlp https://www.youtube.com/watch?v=rbEsoe8F-l4
[youtube] Extracting URL: https://www.youtube.com/watch?v=rbEsoe8F-l4
[youtube] rbEsoe8F-l4: Downloading webpage
[youtube] rbEsoe8F-l4: Downloading ios player API JSON
[youtube] rbEsoe8F-l4: Downloading m3u8 information
[info] rbEsoe8F-l4: Downloading 1 format(s): 247+251
[download] Destination: LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].f247.webm
[download] 100% of 541.37MiB in 00:01:31 at 5.90MiB/s
[download] Destination: LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].f251.webm
[download] 100% of 126.31MiB in 00:00:47 at 2.67MiB/s
[Merger] Merging formats into "LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].webm"
Deleting original file LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].f251.webm (pass -k to keep)
Deleting original file LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].f247.webm (pass -k to keep)
ubuntu@box:~$ ls
'LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].webm'

ubuntu@box:~$ sudo yt-dlp https://www.youtube.com/live/rbEsoe8F-l4
[youtube] Extracting URL: https://www.youtube.com/live/rbEsoe8F-l4
[youtube] rbEsoe8F-l4: Downloading webpage
[youtube] rbEsoe8F-l4: Downloading ios player API JSON
[youtube] rbEsoe8F-l4: Downloading m3u8 information
[info] rbEsoe8F-l4: Downloading 1 format(s): 247+251
[download] LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].webm has already been downloaded
ubuntu@box:~$ rm 'LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].webm'
rm: remove write-protected regular file 'LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].webm'? yes
ubuntu@box:~$ sudo yt-dlp https://www.youtube.com/live/rbEsoe8F-l4
[youtube] Extracting URL: https://www.youtube.com/live/rbEsoe8F-l4
[youtube] rbEsoe8F-l4: Downloading webpage
[youtube] rbEsoe8F-l4: Downloading ios player API JSON
[youtube] rbEsoe8F-l4: Downloading m3u8 information
[info] rbEsoe8F-l4: Downloading 1 format(s): 247+251
[download] Destination: LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].f247.webm
[download] 100% of 541.37MiB in 00:00:19 at 28.26MiB/s
[download] Destination: LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].f251.webm
[download] 100% of 126.31MiB in 00:00:19 at 6.53MiB/s
[Merger] Merging formats into "LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].webm"
Deleting original file LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].f247.webm (pass -k to keep)
Deleting original file LE DEBAT 24 MAI 2024 [rbEsoe8F-l4].f251.webm (pass -k to keep)
Progress brewing!
@deldesir
1) Please confirm/clarify the question @nzola surfaced 3 days ago: isn't the real problem here programmatically identifying "actually live" versus "formerly live" YouTube videos?
2) How will downloading work with "actually live" YouTube videos (i.e. whose recording is not yet complete), e.g. as outlined in the 12-hour tennis match question at https://github.com/chapmanjacobd/library/pull/41#issuecomment-2186701059 ?
(Thanks much for clarifying assumptions, intuition, intentions ~ so the end goal is very clear to everyone here!)
- Please confirm/clarify the question @nzola surfaced 3 days ago: isn't the real problem here programmatically identifying "actually live" versus "formerly live" YouTube videos?
You mean the live_status of the video, whether it's live or was_live? Absolutely not. The real problem is that the live_status column in xklb-metadata.db doesn't have the value not_live. The match_filter from xklb in https://github.com/chapmanjacobd/library/blob/e9975b07bca5b481aeed9398d0bc0adb3b9b25c8/xklb/createdb/tube_backend.py#L418 forces yt-dlp to reject/skip the download because the video doesn't have this not_live status/criterion.
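To illustrate why a finished live stream is skipped, here is a small stand-alone emulation of the filter string live_status=?not_live seen in the log. It does not call yt-dlp or xklb. It only mirrors the documented meaning of the operators: "=" requires equality, and the "?" lets a video through when the field is unknown. The function name is hypothetical.

```python
from typing import Optional


def passes_not_live_filter(live_status: Optional[str]) -> bool:
    """Emulate the yt-dlp match filter string 'live_status=?not_live'."""
    if live_status is None:
        # The '?' in '=?' means: accept videos where the field is unknown.
        return True
    # Otherwise the value must be exactly 'not_live'.
    return live_status == "not_live"


if __name__ == "__main__":
    for status in (None, "not_live", "was_live", "is_live", "is_upcoming"):
        verdict = "download" if passes_not_live_filter(status) else "skip"
        print(f"{status!s:12} -> {verdict}")
```

Under this reading, a video whose metadata says was_live is skipped just like one that is still streaming, which matches the "does not pass filter" message reported above.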
- How will downloading work with "actually live" YouTube videos (i.e. whose recording is not yet complete), e.g. as outlined in the 12-hour tennis match question at "Add --live option" chapmanjacobd/library#41 (comment)?
Once chapmanjacobd/library#41 is merged, if ever it is approved, live videos will download on IIAB Calibre-Web. Additional testing will be needed to ensure performance is not overly affected by these generally huge recordings.
@deldesir let me reframe / re-ask my question from https://github.com/chapmanjacobd/library/pull/41#issuecomment-2186701059 more directly:
CAVEAT: I concede that some people who click on "Download to IIAB" don't mind extremely slow downloads of "actually live" videos.
Additional testing will be needed to ensure performance is not overly affected by these generally huge recordings.
Yes, that issue of overweight videos is very important. And very well known since 2023[*]. But please let's declare that separate question (of overly long duration, bandwidth-heavy, disk-heavy, and RAM/memory-heavy videos) to be off-topic for now, at least in this particular context here:
[*] ASIDE: Hopefully to be solved in a few short months, or possibly much earlier!
ALSO: Pre-announced / Pre-scheduled / Upcoming videos ("not yet live" !) are a whole other category of YouTube URL's...
...that I didn't realize also matter!
(Presumably these too are commonly categorized as "live" videos, even though no video whatsoever exists yet!?)
- Certainly we need some kind of intelligent user-facing warning or messaging, in "Tasks" view or similar, to warn teachers / parents when they're trying to "Download to IIAB" a video that doesn't yet exist!
- Many examples of "not yet live" (UPCOMING) and "truly live" (LIVE) YouTube videos below, usable as test cases to ensure the "Download to IIAB" button operates cleanly for all:
3 kinds of "ostensibly live" YouTube videos tested here, thanks to @deldesir:
With PR #199 now merged, a follow-up PR is now needed to clean up, as described here:
The playlist https://www.youtube.com/@TOPCONGOFM/playlists downloaded from 1 to 100% then it failed PUBLISHING TO URL... https://dpaste.com/4HVB2SFVK
playlist https://www.youtube.com/@TOPCONGOFM/playlists downloaded from 1 to 100% then it failed
@nzola the above is not a playlist. It is a list of playlists. Can you try again with one of its individual playlists? For example, maybe start with their 8-video playlist https://youtube.com/playlist?list=PLi6Z1Wvj99SKPHMiqLkREMQrMyZy1YTPE ?
@deldesir can you please clean up the error message so that @nzola and others have a crystal clear error message, instead of (or alongside) "Metadata Fetch: [URL] failed: unsupported operand type(s) for /: 'NoneType' and 'int'"? (We can open this message in a new ticket if that makes things easier!)
Conversely: If a list of playlists could in future easily be downloaded into a single bookshelf, much like we do with a YouTube channel (a YouTube channel is basically also a list of playlists!), then we might consider that functionality, if it's genuinely needed?
@nzola and AFTER you've completed testing of individual playlist(s):
Try the much more ambitious experiment of downloading the entire channel, e.g. try the "Download to IIAB" button with URLs like:
Downloaded this playlist: https://www.youtube.com/playlist?list=PLi6Z1Wvj99SLJRvpu2S6CPRHvECynHS0y (99 videos) with pi400 Calibre-Web. Successfully completed after 8 hours.
PUBLISHING TO URL... https://dpaste.com/3LGEYAQLE
This error happened during the download, but the process continued until all 99 videos had finished downloading.
@deldesir @holta pi4 Calibre-Web also downloaded this playlist: https://www.youtube.com/playlist?list=PLi6Z1Wvj99SKvAGNRiDmjhgQEZ8pU-4Y2 very smoothly without any problems PUBLISHING TO URL... https://dpaste.com/H37ZJTM7R
@nzola and AFTER you've completed testing of individual playlist(s):
Try the much more ambitious experiment of downloading the entire channel, e.g. try the "Download to IIAB" button with URLs like:
@deldesir @holta I tried downloading the 2 above video links on both pi4 and multipass Calibre-Web, but they are still stuck on STARTED. pi4 Calibre-Web results: PUBLISHING TO URL... https://dpaste.com/H37ZJTM7R
multipass Calibre-Web results: PUBLISHING TO URL... https://dpaste.com/GFTD2QH9D
@deldesir were channel downloads like this working in the past?
@EMG70 wonders if part of the reason is...
Download stuck on "Waiting": I thought this was normal behaviour when one tries to download another video before the first one has started or finished gathering metadata.
The long wait happens mostly on channels or extremely long playlists, such that any attempt to download a second URL will give the "Waiting" status.
(@deldesir can you help clarify?!)
This error happened during the download, but the process continued until all 99 videos had finished downloading.
Very ugly. Likely a network error.
@deldesir were channel downloads like this working in the past?
We did tests with channels in the past, but none as big as https://youtube.com/@TOPCONGOFM. This channel has 4.1k videos. Even though only 100 videos will be downloaded, the first task will take a long time processing metadata for all those 4.1k videos and sorting them based on their views per year.
The long wait is justified, as I could see all playlists being indexed when running tail -f /var/log/xklb.log. Test done on lrn2. I had to reboot the machine. I guess I'll have to move the database aside too, because a lot of residual videos will be caught in subsequent downloads.
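To make the cost concrete, here is a sketch of ranking by views per year and keeping the top 100. The exact formula xklb uses is not shown in this thread, so the calculation below (view count divided by age in years) and the field names are assumptions. Every video needs its metadata before the ranking can run, which is why a 4.1k-video channel takes so long even when only 100 videos are downloaded.

```python
from datetime import date


def views_per_year(view_count: int, upload_date: date, today: date) -> float:
    """Assumed metric: total views divided by the video's age in years."""
    age_days = max((today - upload_date).days, 1)  # avoid dividing by zero
    return view_count / (age_days / 365.25)


def pick_top(videos: list, today: date, limit: int = 100) -> list:
    """Rank videos by views per year (highest first) and keep the first `limit`."""
    ranked = sorted(
        videos,
        key=lambda v: views_per_year(v["view_count"], v["upload_date"], today),
        reverse=True,
    )
    return ranked[:limit]


if __name__ == "__main__":
    today = date(2024, 7, 1)
    videos = [
        {"id": "a", "view_count": 1000, "upload_date": date(2020, 7, 1)},
        {"id": "b", "view_count": 600, "upload_date": date(2023, 7, 1)},
        {"id": "c", "view_count": 100, "upload_date": date(2024, 6, 1)},
    ]
    print([v["id"] for v in pick_top(videos, today, limit=2)])
```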
@EMG70 further explains https://github.com/iiab/calibre-web/issues/188#issuecomment-2200213122
Just to be sure we are not investigating normal behaviour, I initiated a 3hr video download, and before it completed gathering metadata I started a 16min video download. This resulted in the second video going into waiting mode and only starting after the first video was completed, which matches what's happening with @nzola's channel downloads.
@EMG70 explained it well. Tasks are run sequentially. This is the normal behavior.
I was investigating the channel having a "started" message with a progress stuck at 0%.
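The sequential behaviour described above can be pictured with a tiny simulation. This is not the Calibre-Web task code. The status names follow the ones mentioned in this thread, and everything else is illustrative.

```python
from collections import deque


def run_sequentially(task_names: list) -> list:
    """Process tasks one at a time, recording a status snapshot at each step."""
    queue = deque(task_names)
    status = {name: "Waiting" for name in task_names}
    history = []
    while queue:
        current = queue.popleft()
        status[current] = "Started"
        history.append(dict(status))  # everything still queued shows "Waiting"
        status[current] = "Finished"
    history.append(dict(status))
    return history


if __name__ == "__main__":
    for snapshot in run_sequentially(["3-hour video", "16-minute video"]):
        print(snapshot)
```

The first snapshot shows the 16-minute video as "Waiting" while the 3-hour video runs, which is the situation @EMG70 reproduced.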
I tried to download this video link: https://youtu.be/Zm-ROp6EMZo as requested by @holta on multipass iiab, pi4 iiab and pi400 iiab. Here are the results:
multipass iiab: PUBLISHING TO URL... https://dpaste.com/3R8RXUUZ2
pi4: PUBLISHING TO URL... https://dpaste.com/2VXWWLBZB
pi400: PUBLISHING TO URL... https://dpaste.com/8Z4TA84FH
Thanks @nzola! Let's ask @deldesir to try to fix the situation in coming days if he can!
[2024-07-01 10:02:18,264] ERROR {cps.tasks.download:128} An error occurred during the subprocess execution: 'NoneType' object is not subscriptable
Errors on Lines 1387-1391 of /var/log/xklb.log are:
2024-07-01 10:02:18 - [Debug] [https://www.youtube.com/watch?v=Zm-ROp6EMZo]: yt-dlp ERROR: ^M[download] Got error: 0 bytes read, 10050479 more expected. Giving up after 12 retries
2024-07-01 10:02:18 - [Debug] [https://www.youtube.com/watch?v=Zm-ROp6EMZo]: yt-dlp returned no info
2024-07-01 10:02:18 - [Debug] [https://www.youtube.com/watch?v=Zm-ROp6EMZo]: Recoverable error matched (will try again later). [download] Got error: 0 bytes read;10050479 more expected. Giving up after 12 retries;10416754 more expected. Retrying (5/12)...;10323314 more expected. Retrying (6/12)...;9968224 more expected. Retrying (7/12)...;10059139 more expected. Retrying (8/12)...;10278231 more expected. Retrying (9/12)...;9969513 more expected. Retrying (10/12)...;9985561 more expected. Retrying (11/12)...;10459979 more expected. Retrying (12/12)...
2024-07-01 10:02:18 - [Debug] Extra media data {'playlist_path': 'https://www.youtube.com/playlist?list=PLi6Z1Wvj99SLJRvpu2S6CPRHvECynHS0y', 'time_modified': 0, 'time_downloaded': 0, 'time_deleted': 0, 'extractor_config': '{"force": true}'}
2024-07-01 10:02:18 - [Info] lb-wrapper's xklb command (dl) completed successfully.
Tangentially related: