coursera-dl / edx-dl

A simple tool to download video lectures from edx.org (and other openedx sites)
GNU Lesser General Public License v3.0

HTTP Error 403: Forbidden #636

Closed rallantan closed 3 years ago

rallantan commented 3 years ago


I am getting an HTTP Error 403: Forbidden when downloading from edX. I did make the change in the dll.

Your environment

Steps to reproduce


I am a management student trying to download the course below: https://courses.edx.org/courses/course-v1:BUx+QD501x+2T2020/course/

Expected behaviour

Download the course.

Actual behaviour

    C:\windows\system32>edx-dl -u username@gmail.com https://courses.edx.org/courses/course-v1:BUx+QD501x+2T2020/course/
    edx_dl version 0.1.13
    Password:
    Building initial headers for future requests.
    Getting initial CSRF token.
    Found CSRF token.
    Logging into Open edX site: https://courses.edx.org/login_ajax
    Extracting course information from dashboard.
    Traceback (most recent call last):
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\runpy.py", line 194, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "C:\Users\rajal\AppData\Local\Programs\Python\Python38-32\Scripts\edx-dl.exe\__main__.py", line 7, in <module>
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\site-packages\edx_dl\edx_dl.py", line 1020, in main
        all_selections = {selected_course:
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\site-packages\edx_dl\edx_dl.py", line 1021, in <dictcomp>
        get_available_sections(selected_course.url.replace('info', 'course'),
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\site-packages\edx_dl\edx_dl.py", line 184, in get_available_sections
        page = get_page_contents(url, headers)
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\site-packages\edx_dl\utils.py", line 58, in get_page_contents
        result = urlopen(Request(url, None, headers))
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 222, in urlopen
        return opener.open(url, data, timeout)
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 531, in open
        response = meth(req, response)
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 640, in http_response
        response = self.parent.error(
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 569, in error
        return self._call_chain(*args)
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 502, in _call_chain
        result = func(*args)
      File "c:\users\rajal\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 649, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 403: Forbidden

vishusgh commented 3 years ago

Hi, this is my first time using GitHub and I am basically a noob, but I think I can help with this. I faced the same problem, and I am pasting the solution given by another person.

You can fix it by changing line 425 of edx_dl.py, which specifies the User-Agent attribute of the HTTP request header. Change 'User-Agent': 'edX-downloader/0.01', to 'User-Agent': 'Mozilla/5.0', and it will work.

If you are using Chrome, you could change it to 'Chrome/84.0.4147.125' instead. Notes:

  1. Here 84.0.4147.125 is the Chrome version; find yours at chrome://settings/help.
  2. Search for "edx_dl" in your file directory and open the folder. Then open the file 'edx_dl.py' in a text editor and scroll to line 425. The line number is shown at the bottom right corner of the editor.
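
The header change described above can be tried outside edx-dl with a minimal sketch like the one below. It assumes only Python's standard urllib, which the traceback in this issue shows edx-dl using; `build_request` and `fetch_status` are hypothetical helper names for illustration, not functions from edx_dl, and a browser-like User-Agent may not be enough for every server.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Value suggested in this thread; the original header was 'edX-downloader/0.01'.
BROWSER_USER_AGENT = 'Mozilla/5.0'


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Hypothetical helper: build a GET request carrying a custom User-Agent."""
    headers = {'User-Agent': user_agent}
    return Request(url, None, headers)


def fetch_status(url, user_agent=BROWSER_USER_AGENT):
    """Hypothetical helper: return the HTTP status code instead of raising."""
    try:
        with urlopen(build_request(url, user_agent)) as response:
            return response.status
    except HTTPError as err:
        return err.code  # for example 403 when the server rejects the request


if __name__ == '__main__':
    req = build_request('https://courses.edx.org/')
    # urllib stores header names capitalized, so the lookup key is 'User-agent'.
    print(req.get_header('User-agent'))  # prints: Mozilla/5.0
```

Calling `fetch_status` with the old and the new User-Agent against the same URL is one way to check whether the header alone explains the 403 before editing the installed edx_dl.py.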
suvmpr commented 3 years ago

I'm having the same issue. I tried that method, but it didn't work.

MATRIX30 commented 3 years ago

I am facing the same issue with this course: https://courses.edx.org/courses/course-v1:IsraelX+infosec101+3T2019a/course/

maggielovedd commented 3 years ago

Thanks so much, it works for me. I am using Firefox 79.0.

suvmpr commented 3 years ago

What did you change exactly?

maggielovedd commented 3 years ago

'User-Agent': 'Mozilla/5.0'

rallantan commented 3 years ago

Thanks a ton.

It is working. Thank you all.

Laddo481 commented 3 years ago

Hi guys, I am having the same or a similar problem. Kindly help. Below is the output I get when I run the command from cmd. Thank you.


    Building initial headers for future requests.
    Getting initial CSRF token.
    Found CSRF token.
    Logging into Open edX site: https://courses.edx.org/login_ajax
    Extracting course information from dashboard.
    Traceback (most recent call last):
      File "c:\python3.9\lib\runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "c:\python3.9\lib\runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "C:\Python3.9\Scripts\edx-dl.exe\__main__.py", line 7, in <module>
      File "c:\python3.9\lib\site-packages\edx_dl\edx_dl.py", line 1020, in main
        all_selections = {selected_course:
      File "c:\python3.9\lib\site-packages\edx_dl\edx_dl.py", line 1021, in <dictcomp>
        get_available_sections(selected_course.url.replace('info', 'course'),
      File "c:\python3.9\lib\site-packages\edx_dl\edx_dl.py", line 184, in get_available_sections
        page = get_page_contents(url, headers)
      File "c:\python3.9\lib\site-packages\edx_dl\utils.py", line 58, in get_page_contents
        result = urlopen(Request(url, None, headers))
      File "c:\python3.9\lib\urllib\request.py", line 214, in urlopen
        return opener.open(url, data, timeout)
      File "c:\python3.9\lib\urllib\request.py", line 523, in open
        response = meth(req, response)
      File "c:\python3.9\lib\urllib\request.py", line 632, in http_response
        response = self.parent.error(
      File "c:\python3.9\lib\urllib\request.py", line 561, in error
        return self._call_chain(*args)
      File "c:\python3.9\lib\urllib\request.py", line 494, in _call_chain
        result = func(*args)
      File "c:\python3.9\lib\urllib\request.py", line 641, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 403: Forbidden