Open ralyodio opened 9 years ago
+1 Especially for the Pro videos, -u and -p don't seem to be working
The single video page provided seems to be a Wistia embed, which is supported. Search and playlists for the site do not seem to work as of now, though.
Here's an option... https://github.com/SimonSelg/egghead-downloader
seems to work for free courses, but how do we login to egghead for pro courses?
I cannot get course pages to work. I tried https://egghead.io/courses/asynchronous-javascript-with-async-await but got ERROR: Unable to extract title. Used version 2017.05.01.
The regex for extracting the title in version 2017.05.14 is r'<h1 class="title">([^<]+)</h1>'. But the latest update to the website mentions the title in a <span>...</span>.
Yes - the egghead extractor needs updating to the new way the site works.
Interestingly, the course pages seem to embed a JSON representation of the lessons. It is actually JSON embedded in a script tag with its type set to application/json, seemingly in order to hydrate/prime a React component when the page loads. Its format is below (I have beautified the JSON, removed any array repetition and removed keys that aren't overly relevant):
<script type="application/json" class="js-react-on-rails-component">
{
  "component_name": "CourseApp",
  "props": {
    "course": {
      "id": 115,
      "duration": 2073,
      "title": "Maintainable CSS using TypeStyle",
      "slug": "maintainable-css-using-typestyle",
      "http_url": "https://egghead.io/courses/maintainable-css-using-typestyle",
      "url": "https://egghead.io/api/v1/series/maintainable-css-using-typestyle",
      "lessons": [{
        "id": 2050,
        "title": "Add type safety to CSS using TypeStyle",
        "slug": "css-add-type-safety-to-css-using-typestyle",
        "duration": 253,
        "series_row_order": -2097151,
        "http_url": "https://egghead.io/lessons/css-add-type-safety-to-css-using-typestyle",
        "url": "https://egghead.io/api/v1/lessons/css-add-type-safety-to-css-using-typestyle",
        "lesson_http_url": "https://egghead.io/lessons/css-add-type-safety-to-css-using-typestyle"
      }]
    }
  }
}
</script>
This JSON looks pretty much like the output of the public API for the course, found at https://egghead.io/api/v1/series/maintainable-css-using-typestyle.
So presumably for a given course the egghead extractor could use this public API - the lesson API responses even include the wistia ID to potentially save loading the HTML to extract it.
But even if youtube-dl doesn't like to rely on the actual API itself, it can certainly scrape the page as normal but use this JSON instead of the more likely-to-change HTML to get the title as well as the references to the individual lesson pages. Just a suggestion.
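To make the suggestion above concrete, here is a minimal sketch of pulling the title and lesson page URLs out of that embedded JSON instead of matching HTML. It is not the actual extractor code: the helper name parse_course and the trimmed SAMPLE_HTML are made up for illustration, and it assumes the script tag keeps the class and key names shown in the snippet above.

```python
import json
import re

# Matches the payload of the <script ... class="js-react-on-rails-component"> tag.
SCRIPT_RE = re.compile(
    r'<script[^>]+class="js-react-on-rails-component"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def parse_course(html):
    """Return (title, [lesson page URLs]) from the embedded CourseApp JSON.

    Hypothetical helper: returns (None, []) when no usable payload is found.
    """
    for match in SCRIPT_RE.finditer(html):
        try:
            data = json.loads(match.group(1))
        except ValueError:
            continue  # not valid JSON, try the next script tag
        if data.get("component_name") != "CourseApp":
            continue  # some other React component on the page
        course = data["props"]["course"]
        lessons = [lesson["http_url"] for lesson in course.get("lessons", [])]
        return course["title"], lessons
    return None, []


# Trimmed copy of the JSON quoted above, used only as sample input.
SAMPLE_HTML = '''<script type="application/json" class="js-react-on-rails-component">
{"component_name": "CourseApp", "props": {"course": {
  "title": "Maintainable CSS using TypeStyle",
  "lessons": [{"http_url": "https://egghead.io/lessons/css-add-type-safety-to-css-using-typestyle"}]
}}}
</script>'''

title, lesson_urls = parse_course(SAMPLE_HTML)
print(title)
print(lesson_urls)
```

Matching on the component name rather than on the position of the script tag should keep this working if the page gains other embedded components, but that is a guess about how the site evolves.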
I tried to tackle this and I ended up using the public API, referring to the Wistia IDs of each lesson. Check out the PR below :)
(This is my first PR and it's been a while since I coded something in Python, so sorry in advance if I get anything wrong.)
@santicalcagno looks good enough to me (though I haven't tested it!)
Egghead support seems to be broken now.
Tried to run youtube-dl https://egghead.io/lessons/javascript-create-a-native-desktop-system-menu-with-the-electron-menu-module
to test out but encountered the error messages as below:
[generic] javascript-create-a-native-desktop-system-menu-with-the-electron-menu-module: Requesting header
WARNING: Falling back on generic information extractor.
[generic] javascript-create-a-native-desktop-system-menu-with-the-electron-menu-module: Downloading webpage
[generic] javascript-create-a-native-desktop-system-menu-with-the-electron-menu-module: Extracting information
ERROR: Unsupported URL: https://egghead.io/lessons/javascript-create-a-native-desktop-system-menu-with-the-electron-menu-module
I agree that the lesson pages don't work, but FWIW the course pages do.
But yes, the lesson pages could still do with being fixed.
Yup, it's pretty straightforward to fix given the logic used for courses. Theoretically, defining a new extractor referring to the Wistia ID exposed, for example, in https://egghead.io/api/v1/lessons/javascript-create-a-native-desktop-system-menu-with-the-electron-menu-module should be enough.
I'm kinda busy ATM, so if anyone wants to give this a go, by all means do so. Otherwise I should be taking a look at this in a couple of weeks or so.
> youtube-dl --version
2017.09.24
> youtube-dl --verbose https://egghead.io/lessons/react-error-handling-using-error-boundaries-in-react-16
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://egghead.io/lessons/react-error-handling-using-error-boundaries-in-react-16']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2017.09.24
[debug] Python version 3.4.4 - Windows-10-10.0.15063
[debug] exe versions: none
[debug] Proxy map: {}
[egghead:lesson] react-error-handling-using-error-boundaries-in-react-16: Downloading JSON metadata
ERROR: An extractor error has occurred. (caused by KeyError('wistia_id',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp1uop9avr\build\youtube_dl\extractor\common.py", line 434, in extract
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp1uop9avr\build\youtube_dl\extractor\egghead.py", line 75, in _real_extract
KeyError: 'wistia_id'
Traceback (most recent call last):
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp1uop9avr\build\youtube_dl\extractor\common.py", line 434, in extract
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp1uop9avr\build\youtube_dl\extractor\egghead.py", line 75, in _real_extract
KeyError: 'wistia_id'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp1uop9avr\build\youtube_dl\YoutubeDL.py", line 777, in extract_info
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp1uop9avr\build\youtube_dl\extractor\common.py", line 447, in extract
youtube_dl.utils.ExtractorError: An extractor error has occurred. (caused by KeyError('wistia_id',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
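The KeyError in the traceback above means the lesson metadata no longer contains a wistia_id key, and the code indexes it unconditionally. A minimal sketch of a more defensive lookup is below. It is not the real egghead.py: the function name pick_media is made up, the "media_urls" field name is an assumption about what the API might return instead, and the example.invalid URLs are placeholders.

```python
def pick_media(lesson):
    """Return ('wistia', id) or ('direct', {kind: url}) for lesson metadata.

    Hypothetical helper, not part of youtube-dl.
    """
    wistia_id = lesson.get("wistia_id")  # .get() avoids the KeyError seen above
    if wistia_id:
        return "wistia", wistia_id

    # Assumption: stream URLs are exposed under a "media_urls" mapping.
    formats = {}
    for url in (lesson.get("media_urls") or {}).values():
        path = url.split("?", 1)[0]  # ignore any query string
        if path.endswith(".mpd"):
            formats["dash"] = url
        elif path.endswith(".m3u8"):
            formats["hls"] = url
    if not formats:
        raise ValueError("no Wistia ID or stream URL in lesson metadata")
    return "direct", formats


# Placeholder metadata; the URLs are not real egghead URLs.
old_style = {"wistia_id": "abc123"}
new_style = {
    "media_urls": {
        "dash_url": "https://example.invalid/lesson/lesson.mpd",
        "hls_url": "https://example.invalid/lesson/lesson.m3u8",
    }
}
print(pick_media(old_style))
print(pick_media(new_style))
```

Raising a descriptive error when neither source is present at least turns the bare KeyError into a message that says what is missing.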
I'm on it. I found the DASH and m3u8 URLs; all that's left is to check how I can redirect ytdl to use them. Update: Current stage:
ERROR: no suitable InfoExtractor for URL dash:https://█████.cloudfront.net/javascript-redux-the-single-immutable-state-tree-█████/javascript-redux-the-single-immutable-state-tree-█████.mpd
ERROR: no suitable InfoExtractor for URL m3u8:https://█████.cloudfront.net/javascript-redux-the-single-immutable-state-tree-█████/javascript-redux-the-single-immutable-state-tree-█████.m3u8
I have the same issue. Can't download from Egghead
ERROR: An extractor error has occurred. (caused by KeyError(u'wistia_id',))
Still getting the issue.
ERROR: An extractor error has occurred. (caused by KeyError(u'wistia_id',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
As @joelhooks said in #14388 (comment): @eggheadio isn't using Wistia for streaming any longer.
So this will be solved by #14388 I guess 🙂
Still not working. Will it be fixed?
@iamdubx #14388 still isn't merged so... 🙂
Still same issue.
Muhammads-MacBook-Pro:Videos mkamran$ sh download.sh create-a-news-app-with-vue-js-and-nuxt
[egghead:course] create-a-news-app-with-vue-js-and-nuxt: Downloading JSON metadata
ERROR: An extractor error has occurred. (caused by KeyError(u'lessons',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Muhammads-MacBook-Pro:Videos mkamran$ cat download2.sh
youtube-dl --download-archive "$1/archive.txt" -o "$1/%(playlistindex)s%(title)s" "https://egghead.io/lessons/$1"
Muhammads-MacBook-Pro:Videos mkamran$ Start a Nuxt Project with npx and the Vue.js CLI
Muhammads-MacBook-Pro:Videos mkamran$ sh download2.sh start-a-nuxt-project-with-npx-and-the-vue-js-cli
[egghead:lesson] start-a-nuxt-project-with-npx-and-the-vue-js-cli: Downloading JSON metadata
ERROR: An extractor error has occurred. (caused by KeyError(u'wistia_id',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Muhammads-MacBook-Pro:Videos mkamran$
@smkamranqadri #14388 still isn't merged so... 🙂
@MichaelDeBoey how can I use that code?
@smkamranqadri: @mk-pmb linked to his PR branch 🙂
That works, but can't get it to download the best video format.
Thanks, but I am not a Python expert, so I don't know what to do next after cloning.
If I understand right - guys from Egghead ask guys from youtube-dl not to fix this :smile:
@smkamranqadri, cc @mk-pmb
git checkout egghead-mediaurls-171002
then run
python -m youtube_dl https://egghead.io/lessons/react-add-redux-to-a-react-application
However, this doesn't download the best available quality.
python -m youtube_dl -F https://egghead.io/lessons/react-add-redux-to-a-react-application
[info] Available formats for react-add-redux-to-a-react-application:
format code extension resolution note
ef6e36a9-2384-45cb-901d-c827483e0fd3 mp4 1280x720 DASH video 2400k , avc1.64001f, 25fps, video only
c0f2426b-642a-47ef-bccd-c628a0db8ee4 mp4 854x480 DASH video 1200k , avc1.64001e, 25fps, video only
b1f6d4b4-614a-4745-8f06-326f7cb49f53 m4a audio only [en] DASH audio 128k , mp4a.40.2 (48000Hz) (best)
Specifying the width of the video fails with a "requested format not available" error.
python -m youtube_dl -f '[width=1280]' https://egghead.io/lessons/react-add-redux-to-a-react-application
[egghead:lesson] react-add-redux-to-a-react-application: Downloading MPD manifest
ERROR: requested format not available
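A possible explanation, as far as I understand youtube-dl's format selection: a bare filter such as '[width=1280]' is treated like 'best[width=1280]', and 'best' only considers formats that carry both audio and video. Every format in the listing above is either video only or audio only, so nothing qualifies. The sketch below is a simplified model of that behaviour, not youtube-dl's actual code; the function names and the dictionary layout are made up for illustration, with values copied from the -F listing.

```python
# Values copied from the -F listing above; the dict layout itself is illustrative.
FORMATS = [
    {"format_id": "ef6e36a9-2384-45cb-901d-c827483e0fd3", "width": 1280,
     "height": 720, "tbr": 2400, "vcodec": "avc1.64001f", "acodec": "none"},
    {"format_id": "c0f2426b-642a-47ef-bccd-c628a0db8ee4", "width": 854,
     "height": 480, "tbr": 1200, "vcodec": "avc1.64001e", "acodec": "none"},
    {"format_id": "b1f6d4b4-614a-4745-8f06-326f7cb49f53", "tbr": 128,
     "vcodec": "none", "acodec": "mp4a.40.2"},
]


def has_video(fmt):
    return fmt.get("vcodec") != "none"


def has_audio(fmt):
    return fmt.get("acodec") != "none"


def select_best(formats, width=None):
    """Model of 'best[width=...]': only combined audio+video formats qualify."""
    candidates = [f for f in formats if has_video(f) and has_audio(f)]
    if width is not None:
        candidates = [f for f in candidates if f.get("width") == width]
    return candidates[-1] if candidates else None


def select_video_plus_audio(formats, width=None):
    """Model of 'bestvideo[width=...]+bestaudio': pick each stream separately."""
    videos = [f for f in formats if has_video(f) and not has_audio(f)]
    if width is not None:
        videos = [f for f in videos if f.get("width") == width]
    audios = [f for f in formats if has_audio(f) and not has_video(f)]
    if not videos or not audios:
        return None
    by_bitrate = lambda f: f.get("tbr") or 0
    return max(videos, key=by_bitrate), max(audios, key=by_bitrate)


print(select_best(FORMATS, width=1280))
print(select_video_plus_audio(FORMATS, width=1280))
```

If this model is right, a selector along the lines of -f 'bestvideo[width=1280]+bestaudio' should pick the 720p video stream and merge it with the audio stream (merging needs ffmpeg). I have not verified that against this branch.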
https://egghead.io/lessons/react-add-redux-to-a-react-application
youtube-dl "https://d2c5owlt6rorc3.cloudfront.net/react-add-redux-to-a-react-application-ed6daaa8cb/react-add-redux-to-a-react-application-ed6daaa8cb.m3u8" -o react-app.mp4
[generic] react-add-redux-to-a-react-application-ed6daaa8cb: Requesting header
[generic] react-add-redux-to-a-react-application-ed6daaa8cb: Downloading m3u8 information
[download] Destination: react-app.f560.mp4
[...]
[ffmpeg] Downloaded 14086932 bytes
[download] 100% of 13.43MiB
[download] Destination: react-app.faudio_group-react-add-redux-to-a-react-application.mp4
[...]
[ffmpeg] Downloaded 4137919 bytes
[download] 100% of 3.95MiB
[ffmpeg] Merging formats into "react-app.mp4"
Deleting original file react-app.f560.mp4 (pass -k to keep)
Deleting original file react-app.faudio_group-react-add-redux-to-a-react-application.mp4 (pass -k to keep)
Video: MPEG4 Video (H264) 1280x720 25fps 429kbps [V: h264 high L3.1, yuv420p, 1280x720, 429 kb/s]
Audio: AAC 48000Hz stereo 125kbps [A: SoundHandler (aac lc, 48000 Hz, stereo, 125 kb/s)]
To tell the truth - I don't understand why Egghead is fighting so hard to stop anybody from downloading their videos. Today I will torrent all available Egghead courses on Rutracker, and I have most of them.
@0880 Yep, that's it. Thanks!
@errorsmith @0880
ERROR: requested format not available
Extraction fixed in latest version.
youtube-dl "https://egghead.io/lessons/react-error-handling-using-error-boundaries-in-react-16"
[egghead:lesson] react-error-handling-using-error-boundaries-in-react-16: Downloading JSON metadata
[egghead:lesson] 2464: Downloading MPD manifest
[egghead:lesson] 2464: Downloading m3u8 information
[dashsegments] Total fragments: 92
[download] Destination: Error Handling using Error Boundaries in React 16-2464.fdash-63abafa9-2580-4fd5-9a73-511de5dca9b8.mp4
[download] 100% of 28.24MiB in 02:12
[dashsegments] Total fragments: 92
[download] Destination: Error Handling using Error Boundaries in React 16-2464.fdash-34043f9a-4e56-4683-bb70-027dd42b37cf.m4a
[download] 100% of 5.48MiB in 01:21
[ffmpeg] Merging formats into "Error Handling using Error Boundaries in React 16-2464.mp4"
Deleting original file Error Handling using Error Boundaries in React 16-2464.fdash-63abafa9-2580-4fd5-9a73-511de5dca9b8.mp4 (pass -k to keep)
Deleting original file Error Handling using Error Boundaries in React 16-2464.fdash-34043f9a-4e56-4683-bb70-027dd42b37cf.m4a (pass -k to keep)
still same
Muhammads-MacBook-Pro:youtube-dl-rg3 mkamran$ python -m youtube_dl "https://egghead.io/lessons/react-error-handling-using-error-boundaries-in-react-16"
[egghead:lesson] react-error-handling-using-error-boundaries-in-react-16: Downloading JSON metadata
WARNING: [egghead:lesson] Cannot find an proper ID, will use lesson name URL slug
[egghead:lesson] react-error-handling-using-error-boundaries-in-react-16: Downloading MPD manifest
ERROR: requested format not available
No changes received on pull from
https://github.com/mk-pmb/youtube-dl-rg3.git
technology list: https://egghead.io/technologies/angular2
series list: https://egghead.io/series/react-flux-architecture
search results list: https://egghead.io/search?q=testing
video page: https://egghead.io/lessons/react-development-environment-setup