coursera-dl / edx-dl

A simple tool to download video lectures from edx.org (and other openedx sites)
GNU Lesser General Public License v3.0

HTTP Error 403: Forbidden #664

Open wellwellwellgithub opened 3 years ago

wellwellwellgithub commented 3 years ago

Subject of the issue

When I try to download a course, it fails with 'forbidden' and errors throughout the script. Everything before this step works perfectly: I can load the 'list courses' section with no problem. (I have a very limited understanding of code, so a detailed explanation of how to fix this would be much appreciated.)

My environment

Operating System (name/version): Windows 10 Pro 64-bit
Python version: Python 3.9.1
youtube-dl version: 2021.2.4.1
edx-dl version: 0.1.13

Steps to reproduce

The course is: https://courses.edx.org/courses/course-v1:UniversityofCambridge+2021edx001+1T2021/course/
The command is the usual one: edx-dl -u user@user.com COURSE_URL

Expected behaviour

For the course to download.

Actual behaviour

C:\edx-dl-master>edx-dl -u username@user.com https://courses.edx.org/courses/course-v1:UniversityofCambridge+2021edx001+1T2021/course/
edx_dl version 0.1.13
Password:
Building initial headers for future requests.
Getting initial CSRF token.
Found CSRF token.
Logging into Open edX site: https://courses.edx.org/login_ajax
Extracting course information from dashboard.
Traceback (most recent call last):
  File "c:\users\user\appdata\local\programs\python\python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\user\appdata\local\programs\python\python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\user\AppData\Local\Programs\Python\Python39\Scripts\edx-dl.exe_main.py", line 7, in
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\edx_dl\edx_dl.py", line 1020, in main
    all_selections = {selected_course:
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\edx_dl\edx_dl.py", line 1021, in
    get_available_sections(selected_course.url.replace('info', 'course'),
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\edx_dl\edx_dl.py", line 184, in get_available_sections
    page = get_page_contents(url, headers)
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\edx_dl\utils.py", line 58, in get_page_contents
    result = urlopen(Request(url, None, headers))
  File "c:\users\user\appdata\local\programs\python\python39\lib\urllib\request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "c:\users\user\appdata\local\programs\python\python39\lib\urllib\request.py", line 523, in open
    response = meth(req, response)
  File "c:\users\user\appdata\local\programs\python\python39\lib\urllib\request.py", line 632, in http_response
    response = self.parent.error(
  File "c:\users\user\appdata\local\programs\python\python39\lib\urllib\request.py", line 561, in error
    return self._call_chain(*args)
  File "c:\users\user\appdata\local\programs\python\python39\lib\urllib\request.py", line 494, in _call_chain
    result = func(*args)
  File "c:\users\user\appdata\local\programs\python\python39\lib\urllib\request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

walid-dotcom commented 3 years ago

same problem with this course https://courses.edx.org/courses/course-v1:StanfordOnline+GSE-YEDUC115-S+1T2020/course/

Nudin commented 3 years ago

Got the same issue. Here's the stack trace formatted nicely.

Traceback (most recent call last):
  File "/usr/bin/edx-dl", line 33, in <module>
    sys.exit(load_entry_point('edx-dl==0.1.13', 'console_scripts', 'edx-dl')())
  File "/usr/bin/edx-dl", line 22, in importlib_load_entry_point
    for entry_point in distribution(dist_name).entry_points
  File "/usr/lib/python3.9/importlib/metadata.py", line 524, in distribution
    return Distribution.from_name(distribution_name)
  File "/usr/lib/python3.9/importlib/metadata.py", line 187, in from_name
    raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: edx-dl

From the debugger at the following line: result = urlopen(Request(url, None, headers))

url = 'https://courses.edx.org/courses/ANUx/ANU-ASTRO2x/2T2014/course/'

headers = {'User-Agent': 'edX-downloader/0.01', 'Accept': 'application/json, text/javascript, */*; q=0.01', 'Content-Type': 'application/x-www-form-urlencoded;charset=utf-8', 'Referer': 'https://courses.edx.org/user_api/v1/account/login_session', 'X-Requested-With': 'XMLHttpRequest', 'X-CSRFToken': 'a0beoP[EDITED]pGv1O'}
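
For anyone who wants to see what the server actually sends back with the 403, here is a minimal sketch that wraps the same urlopen(Request(url, None, headers)) call and returns the error status and body instead of raising. It is not part of edx-dl: the helper name fetch_or_explain and the simulated opener are made up for illustration, and the response body used in the demo is invented. It does not fix the problem, it only makes the response visible.

```python
import io
from email.message import Message
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_or_explain(url, headers, opener=urlopen):
    """Return (status, text); on an HTTP error return the error's status and body.

    Hypothetical helper for debugging, not an edx-dl function.
    """
    try:
        with opener(Request(url, None, headers)) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except HTTPError as err:
        # HTTPError is also a response object, so its body can be read.
        return err.code, err.read().decode("utf-8", "replace")


def simulated_forbidden(request):
    """Stand-in opener that always answers 403, so the demo needs no network."""
    raise HTTPError(request.full_url, 403, "Forbidden", Message(),
                    io.BytesIO(b"simulated body"))


if __name__ == "__main__":
    url = "https://courses.edx.org/courses/ANUx/ANU-ASTRO2x/2T2014/course/"
    headers = {"User-Agent": "edX-downloader/0.01"}
    # Swap opener=simulated_forbidden for the default to hit the real site.
    print(fetch_or_explain(url, headers, opener=simulated_forbidden))
```

Passing the opener as a parameter is only there so the sketch can be tried offline; with the default urlopen it makes the same kind of request as the line in the debugger above.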

maxicuevas commented 3 years ago

same problem with this course https://courses.edx.org/courses/course-v1:RiceX+BIOC300.2x+1T2019/course/

rehmatworks commented 3 years ago

@maxicuevas I have just launched a downloader that initially I wrote for my own use. You can try it out at and I hope you will find it interesting and useful:

I hope it helps the community!

perevales commented 3 years ago

Same problem. @rehmatworks solution works like a charm

paulodaguero commented 3 years ago

@maxicuevas I have just launched a downloader that initially I wrote for my own use. You can try it out at and I hope you will find it interesting and useful:

I hope it helps the community!

Thanks a lot mate! You are a legend! One question: is it possible to download with subtitles as well? One remark: if you have downloaded videos on your phone previously, the downloader will know and avoid these videos. Just for info in case you did not know. Cheers!