ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

ERROR: Unable to download webpage: HTTP Error 404: Not Found https://www.udemy.com/ #31161

Open claytonq opened 2 years ago

claytonq commented 2 years ago
$ youtube-dl --verbose --cookies ./cookies.txt https://www.udemy.com/course-dashboard-redirect/?course_id=XXXXXX
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', '--cookies', './cookies.txt', 'https://www.udemy.com/course-dashboard-redirect/?course_id=XXXXXX']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.10.5 (CPython) - Linux-5.18.0-3-amd64-x86_64-with-glibc2.33
[debug] exe versions: ffmpeg 5.1-2, ffprobe 5.1-2, rtmpdump 2.4
[debug] Proxy map: {}
[udemy:course] course-dashboard-redirect: Downloading webpage
[udemy:course] XXXXXX: Downloading course curriculum
[download] Downloading playlist: XXXXXX
[udemy:course] playlist XXXXXX: Collected 30 video ids (downloading 30 of them)
[download] Downloading video 1 of 30
[udemy] 22604176: Downloading webpage
ERROR: Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "/usr/lib/python3/dist-packages/youtube_dl/extractor/common.py", line 634, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/usr/lib/python3/dist-packages/youtube_dl/YoutubeDL.py", line 2288, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/usr/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)

dirkf commented 2 years ago

I suspect that the failing code is trying to construct a lesson URL like 'https://www.udemy.com/%s/learn/v4/t/lecture/%s' % (course_path, entry['id']), and that this format is no longer correct.
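
To make the suspicion concrete, here is a minimal sketch of how a lesson URL would come out of that format string. The template is the one quoted above; `build_lecture_url` and the course path `some-course` are hypothetical names used only for illustration, and the lecture id is the one shown in the log. This is not the extractor's actual code.

```python
# Template quoted in the comment above; whether the site still
# serves URLs of this shape is exactly what is in doubt.
LECTURE_URL_TEMPLATE = 'https://www.udemy.com/%s/learn/v4/t/lecture/%s'


def build_lecture_url(course_path, entry):
    """Fill the template with a course path and a curriculum entry's id.

    Hypothetical helper for illustration only.
    """
    return LECTURE_URL_TEMPLATE % (course_path, entry['id'])


# 'some-course' is a made-up course path; 22604176 is the id from the log.
url = build_lecture_url('some-course', {'id': 22604176})
print(url)
```

If the site now serves lessons under a different path, any URL built this way would return 404 even though the course and lecture ids are valid, which would match the log.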

As well as the unredacted course URL, please provide the lesson URL for the first lesson.

You could also run with --write-pages and post the zipped *.dump files (the HTML and JSON sent by the site). However, these may include data such as your course username that you might prefer not to publish, so either redact the files first or arrange another way to transfer the package.
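
One possible way to do a first-pass redaction before zipping, as a rough sketch: it assumes the *.dump files are in a single directory and that you already know the exact strings you want removed. `redact_and_zip` and its parameters are hypothetical names, not part of youtube-dl.

```python
import glob
import os
import zipfile


def redact_and_zip(dump_dir, secrets, zip_path, placeholder='REDACTED'):
    """Zip every *.dump file in dump_dir, replacing each secret string first.

    Returns the list of file names written to the archive.
    Hypothetical helper; plain string replacement only.
    """
    names = []
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(glob.glob(os.path.join(dump_dir, '*.dump'))):
            # Dumps are HTML/JSON text; undecodable bytes are replaced.
            with open(path, encoding='utf-8', errors='replace') as handle:
                text = handle.read()
            for secret in secrets:
                if secret:  # skip empty strings, which would match everywhere
                    text = text.replace(secret, placeholder)
            name = os.path.basename(path)
            archive.writestr(name, text)
            names.append(name)
    return names
```

For example, `redact_and_zip('.', ['my-username'], 'dumps.zip')` would archive the dumps from the current directory with that string replaced. File names are kept as-is and only exact matches are replaced, so the result should still be checked by hand before posting.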

Also this.