ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

Unable to extract course id while downloading udemy courses #22510

Closed (privatejava closed this issue 2 years ago)

privatejava commented 5 years ago

Verbose log

python ~/Documents/youtube-dl/youtube_dl/__main__.py https://companyx.udemy.com/deeplearning/ -o '%(playlist)s/%(chapter_number)s - %(chapter)s/%(playlist_index)s. %(title)s.%(ext)s' --cookies ~/Downloads/cookies.txt --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://companyx.udemy.com/deeplearning/', '-o', '%(playlist)s/%(chapter_number)s - %(chapter)s/%(playlist_index)s. %(title)s.%(ext)s', '--cookies', '/home/userx/Downloads/cookies.txt', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.09.12.1
[debug] Git HEAD: 33c1c7d80
[debug] Python version 3.6.8 (CPython) - Linux-4.15.0-62-generic-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.6, ffprobe 3.4.6
[debug] Proxy map: {}
[udemy:course] deeplearning: Downloading webpage
[udemy:course] 1151632: Downloading course curriculum
[download] Downloading playlist: 1151632
[udemy:course] playlist 1151632: Collected 155 video ids (downloading 155 of them)
[download] Downloading video 1 of 155
[udemy] 12350046: Downloading webpage
ERROR: Unable to extract course id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/home/userx/Documents/youtube-dl/youtube_dl/YoutubeDL.py", line 796, in extract_info
    ie_result = ie.extract(url)
  File "/home/userx/Documents/youtube-dl/youtube_dl/extractor/common.py", line 530, in extract
    ie_result = self._real_extract(url)
  File "/home/userx/Documents/youtube-dl/youtube_dl/extractor/udemy.py", line 219, in _real_extract
    course_id, _ = self._extract_course_info(webpage, lecture_id)
  File "/home/userx/Documents/youtube-dl/youtube_dl/extractor/udemy.py", line 82, in _extract_course_info
    ], webpage, 'course id')
  File "/home/userx/Documents/youtube-dl/youtube_dl/extractor/common.py", line 1005, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract course id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description

I am trying to download some videos of this course from my company's Udemy account. The site looks the same as regular Udemy, but it is served from a subdomain named after the company; I have replaced my company's name with companyx here. Viewing the page HTML in Chrome, I can see a big chunk of JSON data, and the page appears to use a new (non-AngularJS) HTML format, unlike before. Looking at udemy.py, I found a regex that matches ng-init, which will not work with the new Udemy UI since the markup is completely different. If it helps, I can provide sanitized HTML output for any specific URL.
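To make the suspected failure mode concrete, here is a minimal sketch, not the actual code in udemy.py. The HTML fragments, the regex patterns, and the function name extract_course_id are all assumptions invented for illustration, since the real page markup has not been posted here. It shows how a lookup that expects a JSON object inside ng-init finds nothing once that attribute is gone, and how the extraction only succeeds if some fallback pattern still matches the new markup.

```python
import html
import json
import re

# Hypothetical page fragments, made up for this example only.
OLD_PAGE = ('<div ng-init="course={&quot;id&quot;: 1151632, '
            '&quot;title&quot;: &quot;Deep Learning&quot;}; other=1"></div>')
NEW_PAGE = ('<div class="app-loader" '
            'data-module-args="{&quot;courseId&quot;: 1151632}"></div>')

# Hypothetical fallback patterns, tried in order.
FALLBACK_PATTERNS = [
    r'data-course-id=["\'](\d+)',
    r'&quot;courseId&quot;\s*:\s*(\d+)',
]


def extract_course_id(webpage):
    """Return the course id as a string, or raise ValueError if nothing matches."""
    # 1. Old AngularJS-style markup: an HTML-escaped JSON object in ng-init.
    m = re.search(r'ng-init=["\'][^"\']*\bcourse=(\{.+?\})[,;]', webpage)
    if m:
        course = json.loads(html.unescape(m.group(1)))
        if course.get('id') is not None:
            return str(course['id'])
    # 2. Newer markup: look for the id in other attributes or embedded JSON.
    for pattern in FALLBACK_PATTERNS:
        m = re.search(pattern, webpage)
        if m:
            return m.group(1)
    # 3. No pattern matched, which is the situation reported in the log above.
    raise ValueError('Unable to extract course id')


if __name__ == '__main__':
    print(extract_course_id(OLD_PAGE))
    print(extract_course_id(NEW_PAGE))
    try:
        extract_course_id('<html><body>no course data here</body></html>')
    except ValueError as err:
        print('failed:', err)
```

If the new company-subdomain pages carry the course id only inside a differently shaped JSON blob, none of these patterns would match, and the extractor would need a new pattern derived from the real (sanitized) HTML.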

mudasirmirza commented 4 years ago

👍 This has started happening. I think Udemy has changed their URL scheme.

super-sonicX commented 4 years ago

Still an issue in version 2020.03.24

Any ideas when/if this can be fixed?

dirkf commented 2 years ago

Continued in #30719.