Closed MKSherbini closed 3 years ago
I did try to check the links, they do give 404 from a browser.
That means the links are dead. Test with a known working link.
@Ashish0804 I think they meant that the video links returned by yt-dlp gave a 404. The playlist link seems to be fine.
@MinePlayersPE If you have a subscription, can you test the command given by OP so we can be sure it's a broken site?
The link works fine; I meant that the items fetched from it, i.e. the individual videos, give a 404. I tested multiple courses that I downloaded just 2 days ago, and they no longer work.
Also, before I supplied credentials it could fetch 1 min of each video, so even in the logged-out state it could download 1 min; now it can't even find the video.
It seems to be working for me
yt-dlp -F https://learning.oreilly.com/videos/learning-path-delivering/9781491989012/
[safari:course] 9781491989012: Downloading course JSON
[download] Downloading playlist: Learning Path: Delivering Applications with Docker
[safari:course] playlist Learning Path: Delivering Applications with Docker: Collected 74 videos; downloading 74 of them
[download] Downloading video 1 of 74
[safari:api] 9781491989012/video301824: Downloading part JSON
[safari] 9781491989012-video301824: Downloading webpage
[Kaltura] 9781491989012-video301824: Downloading webpage
[Kaltura] 0_wuyu5ime: Downloading video info JSON
[Kaltura] 0_wuyu5ime: Checking mp4-409 URL
[Kaltura] 0_wuyu5ime: Downloading m3u8 information
WARNING: [Kaltura] Ignoring subtitle tracks found in the HLS manifest; if any subtitle tracks are missing, please report this issue on https://github.com/yt-dlp/yt-dlp . Make sure you are using the latest version; see https://github.com/yt-dlp/yt-dlp on how to update. Be sure to call yt-dlp with the --verbose flag and include its complete output.
[info] Available formats for 0_wuyu5ime:
ID EXT RESOLUTION FPS | FILESIZE TBR PROTO | VCODEC VBR ACODEC ABR MORE INFO
------- --- ---------- --- - ---------- ----- ------ - ------- ----- --------- ---- ---------
mp4-56 mp4 audio only 0 | ~689.00KiB 56k http | unknown 56k isom
hls-58 mp4 audio only | 58k m3u8_n | mp4a.40.2 58k
hls-230 mp4 640x360 | 230k m3u8_n | unknown 230k unknown 0k
hls-252 mp4 640x360 | 252k m3u8_n | unknown 252k unknown 0k
mp4-199 mp4 640x360 29 | ~2.37MiB 199k http | avc1 199k unknown 0k isom
mp4-218 mp4 640x360 29 | ~2.60MiB 218k http | avc1 218k unknown 0k isom
mp4-220 mp4 640x360 29 | ~2.63MiB 220k http | avc1 220k unknown 0k isom
hls-433 mp4 1280x720 | 433k m3u8_n | unknown 433k unknown 0k
mp4-386 mp4 1280x720 29 | ~4.60MiB 386k http | avc1 386k unknown 0k isom
mp4-393 mp4 1280x720 29 | ~4.68MiB 393k http | avc1 393k unknown 0k isom
mp4-409 mp4 1280x720 29 | ~4.87MiB 409k http | avc1 409k unknown 0k mp42
[download] Downloading video 2 of 74
[safari:api] 9781491989012/video301825: Downloading part JSON
[safari] 9781491989012-video301825: Downloading webpage
@Ashish0804 that's true. I tested a bit, and it seems that adding the cookies file causes this 404 error, but I don't see how that is possible when the content is publicly available anyway.
Try getting the latest cookies from the browser (they could have expired). Also make sure you are able to view/play the files when logged in. As you mentioned, you have used this before, so depending on how much you were downloading, they could have blocked your account.
If the problem still isn't solved, you will have to provide the account, since I can't reproduce the issue.
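For anyone wanting to check the "cookies could have expired" theory offline, here is a minimal sketch using only Python's standard library. It assumes the cookies were exported in Netscape `cookies.txt` format; the function name `expired_cookies` and the `oreilly.com` domain filter are illustrative choices, not part of yt-dlp.

```python
import time
from http.cookiejar import MozillaCookieJar


def expired_cookies(path, domain_suffix='oreilly.com', now=None):
    """Return the names of cookies for domain_suffix whose expiry time has passed.

    Hypothetical helper: it only inspects a Netscape-format cookies.txt file
    and says nothing about whether the site still accepts the session.
    """
    jar = MozillaCookieJar()
    # Load everything, including expired and session cookies, so we can inspect them.
    jar.load(path, ignore_discard=True, ignore_expires=True)
    now = time.time() if now is None else now
    return [
        c.name
        for c in jar
        # A missing or zero expiry is treated as a session cookie and not reported.
        if c.domain.endswith(domain_suffix) and c.expires and c.expires <= now
    ]
```

If the returned list contains the site's session or auth cookies, re-exporting the cookies from the browser is probably the first thing to try.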
You are right, my browser seems to have cached something that lets me access the content still, but when trying to log in from another browser this wasn't the case, I'll solve my account's issues then, thanks.
@Ashish0804 After re-checking with multiple others, I can confirm the issue is from yt-dlp, I can access the site normally and open all videos (The same for my friends), but can't use cookies anymore. As for the Oreilly account, you can create any temp account and use the free trial.
@pukkandan Did creating a new temp account fail somehow? I can help debugging too just lemme know the related parts in code
The extractor builds the wrong individual video link
This is what the extractor builds:
https://learning.oreilly.com/library/view/getting-started-with/9781787285491/video1_1.html
This is what it should be:
https://learning.oreilly.com/videos/getting-started-with/9781787285491/9781787285491-video1_1/
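To make the difference between the two URL shapes concrete, here is a rough sketch of the mapping, inferred only from the single pair of URLs above. The function name `rewrite_video_url` is hypothetical, and this is not necessarily how the extractor or the eventual fix does it.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_video_url(old_url: str) -> str:
    """Map /library/view/<slug>/<course_id>/<part>.html to the /videos/ form."""
    parts = urlsplit(old_url)
    segments = parts.path.strip('/').split('/')
    # Expected shape: ['library', 'view', slug, course_id, 'video1_1.html']
    if len(segments) != 5 or segments[:2] != ['library', 'view']:
        return old_url  # leave URLs of any other shape unchanged
    slug, course_id, part = segments[2:]
    part_id = part[:-len('.html')] if part.endswith('.html') else part
    # The new form repeats the course id as a prefix of the part id.
    new_path = f'/videos/{slug}/{course_id}/{course_id}-{part_id}/'
    return urlunsplit((parts.scheme, parts.netloc, new_path, '', ''))
```

Returning unfamiliar URLs unchanged is a deliberate choice here, so that a guess about the URL layout cannot silently produce a malformed link.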
@hasantayyar Thanks, I submitted a PR to handle this change here. As I didn't read the rest of the code I can't confirm if it creates other issues, but now I can download again without issues.
After some testing: this only bypasses the 404 issue. Even with the right URL, it downloads only 3 min of each video, as if not authenticated.
[debug] Custom config: ['--cookies-from-browser', 'firefox', '--download-archive', 'archive.txt', '-i', '-c', '-v', '--console-title', '--batch-file=batch-file.txt', '--write-annotations', '--write-description', '--write-info-json', '--write-thumbnail', '--sub-lang', 'en', '--write-auto-sub', '--write-sub', '--add-metadata', '--embed-subs', '--embed-thumbnail', '-o', '%(playlist_title)s/%(playlist_index)s. %(title)s.%(ext)s', '-f', 'bestvideo[height<=720]+bestaudio/best[height<=720]/worst', '--merge-output-format', 'mp4']
[debug] Command-line config: ['--config-location', 'D:\\Utils/YoutubeDL/Configs/ytdlp_oreilly.conf', '--playlist-items', '2']
[debug] Batch file urls: ['https://learning.oreilly.com/videos/the-principles-of/9781491935811/']
[Cookies] Extracting cookies from firefox
[debug] Extracting cookies from: "C:\Users\mh-sh\AppData\Roaming\Mozilla\Firefox\Profiles\8tceuger.default-release\cookies.sqlite"
[Cookies] Extracted 1548 cookies from firefox
[debug] Loading archive file 'archive.txt'
[debug] Encodings: locale cp1252, fs utf-8, out utf-8, pref cp1252
[debug] yt-dlp version 2021.09.02 (source)
[debug] Plugin Extractors: ['SamplePlugin']
[debug] Git HEAD: 982323fe1
[debug] Python version 3.9.1 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg n4.4-4-gacb339bb88, ffprobe n4.4-4-gacb339bb88
[debug] Optional libraries: sqlite
[debug] Proxy map: {}
[debug] [safari:course] Extracting URL: https://learning.oreilly.com/videos/the-principles-of/9781491935811/
[safari:course] 9781491935811: Downloading course JSON
[download] Downloading playlist: The Principles of Microservices
[info] Writing playlist metadata as JSON to: The Principles of Microservices\0. The Principles of Microservices.info.json
WARNING: There's no playlist description to write.
[safari:course] playlist The Principles of Microservices: Collected 15 videos; downloading 1 of them
[download] Downloading video 1 of 1
[debug] [safari:api] Extracting URL: https://learning.oreilly.com/api/v1/book/9781491935811/chapter/video221406.html
[safari:api] 9781491935811/video221406: Downloading part JSON
[debug] [safari] Extracting URL: https://learning.oreilly.com/videos/the-principles-of/9781491935811/9781491935811-video221406/
[debug] [Kaltura] Extracting URL: https://cdnapisec.kaltura.com/html5/html5lib/v2.37.1/mwEmbedFrame.php?wid=_1926081&uiconf_id=29375172&flashvars%5BreferenceId%5D=9781491935811-video221406
[Kaltura] 9781491935811-video221406: Downloading webpage
[Kaltura] 0_1i6jb4o4: Downloading video info JSON
[Kaltura] 0_1i6jb4o4: Checking mp4-4468 URL
[Kaltura] 0_1i6jb4o4: Downloading m3u8 information
WARNING: [Kaltura] Ignoring subtitle tracks found in the HLS manifest; if any subtitle tracks are missing, please report this issue on https://github.com/yt-dlp/yt-dlp . Make sure you are using the latest version; see https://github.com/yt-dlp/yt-dlp on how to update. Be sure to call yt-dlp with the --verbose flag and include its complete output.
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, vcodec:vp9.2(10), acodec, filesize, fs_approx, tbr, vbr, abr, asr, proto, vext, aext, hasaud, source, id
[debug] Downloading subtitles: en
[info] 0_1i6jb4o4: Downloading 1 format(s): mp4-1966
WARNING: There's no description to write.
WARNING: There are no annotations to write.
[info] Writing video subtitles to: The Principles of Microservices\2. What are Microservices.en.ttml
[debug] Invoking downloader on "http://cdnapi.kaltura.com/api_v3/service/caption_captionasset/action/serve/captionAssetId/0_diyhxwtn"
[download] The Principles of Microservices\2. What are Microservices.en.ttml has already been downloaded
[download] 100% of 15.08KiB
[info] Writing video metadata as JSON to: The Principles of Microservices\2. What are Microservices.info.json
[Kaltura] 0_1i6jb4o4: Downloading thumbnail ...
[Kaltura] 0_1i6jb4o4: Writing thumbnail to: The Principles of Microservices\2. What are Microservices.jpg
[debug] Invoking downloader on "http://cdnapi.kaltura.com/p/1926081/sp/192608100/playManifest/entryId/0_1i6jb4o4/format/url/protocol/http/flavorId/0_coc8hy0d"
[download] Resuming download at byte 2717283
[download] Destination: The Principles of Microservices\2. What are Microservices.mp4
[download] 7.6% of 35.87MiB at 489.40KiB/s ETA 01:09
The issue now is that the downloader is invoked on "http://cdnapi.kaltura.com/p/1926081/sp/192608100/playManifest/entryId/0_1i6jb4o4/format/url/protocol/http/flavorId/0_coc8hy0d" which is only 3min, but it already had access to the full video at "https://cdnapisec.kaltura.com/html5/html5lib/v2.37.1/mwEmbedFrame.php?wid=_1926081&uiconf_id=29375172&flashvars%5BreferenceId%5D=9781491935811-video221406"
Thanks @MKSherbini I will test with a subscription.
The issue is that the O'Reilly API returns the wrong web URL, and I think your change is the only way to fix it for now, until they change it.
I just installed with python3 -m pip install --upgrade git+https://github.com/MKSherbini/yt-dlp and was able to download again with a free trial account.
~~BUT all videos are truncated after 60 seconds. When using the same account in their web player it's possible to watch the videos beyond the 60s.~~
Update: I think the truncated videos happened because I was still using an (old) cookie.txt and a user-agent option taken from earlier attempts to get yt-dlp and safari working. I am sorry for the confusion.
I can now confirm that MKSherbini's fork is working for me, even with a trial account! Thanks MKSherbini!
@MKSherbini I tested this with my credentials, with both an individual video page and a course page. It downloads the videos without issues.
I am confused by this conversation. Does #990 download the truncated video, or the full video?
pukkandan, I am sorry for the confusion, as far as I can tell now the fix works just fine! I updated my original comment https://github.com/yt-dlp/yt-dlp/issues/586#issuecomment-921203603-permalink
With the latest update I am getting this issue:
[safari:course] 9780136787709: Downloading course JSON
[download] Downloading playlist: Getting Started with Kubernetes LiveLessons, 2nd Edition
[safari:course] playlist Getting Started with Kubernetes LiveLessons, 2nd Edition: Collected 87 videos; downloading 87 of them
[download] Downloading video 1 of 87
[safari:api] 9780136787709/GSK2_00_00_00: Downloading part JSON
ERROR: no suitable InfoExtractor for URL https:/9780136787709-GSK2_00_00_00
[download] Downloading video 2 of 87
[safari:api] 9780136787709/GSK2_01_01_00: Downloading part JSON
ERROR: no suitable InfoExtractor for URL https:/9780136787709-GSK2_01_01_00
[download] Downloading video 3 of 87
[safari:api] 9780136787709/GSK2_01_01_01: Downloading part JSON
ERROR: no suitable InfoExtractor for URL https:/9780136787709-GSK2_01_01_01
[download] Downloading video 4 of 87
[safari:api] 9780136787709/GSK2_01_01_02: Downloading part JSON
ERROR: no suitable InfoExtractor for URL https:/9780136787709-GSK2_01_01_02
[download] Downloading video 5 of 87
[safari:api] 9780136787709/GSK2_01_01_03: Downloading part JSON
I see what the issue is. Thanks for the catch. I will fix it when I get on my PC.
Checklist
Verbose log
Description
These same configs worked just 2 days ago, but now I can't download any O'Reilly content. I checked the links, and they give a 404 from a browser.