coursera-dl / edx-dl

A simple tool to download video lectures from edx.org (and other openedx sites)
GNU Lesser General Public License v3.0

Edx http error 403: Forbidden #669

Open mugeshk97 opened 3 years ago

mugeshk97 commented 3 years ago

🚨Please review the Troubleshooting section before reporting any issue. Don't forget also to check the current issues to avoid duplicates.

Subject of the issue

Describe your issue here.

Your environment

Steps to reproduce

Tell us how to reproduce this issue. Please provide us the course URL, and the specific subsection or unit if possible.

https://courses.edx.org/courses/course-v1:MITx+6.00.1x+1T2021/course/

Expected behaviour

Tell us what should happen.

raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

Actual behaviour

Tell us what happens instead. If the script fails, please copy the entire output of the command or the stacktrace (don't forget to obfuscate your username and password). If you cannot copy the exception, attach a screenshot.

[screenshot attached]

mugeshk97 commented 3 years ago

It didn't work.

felixoldsoul commented 3 years ago

Similar issue on my side: urllib.error.HTTPError: HTTP Error 403: Forbidden. Likely an HTML change?

pazalinio commented 3 years ago

I'm getting the same issue with the changed URL. I'm trying to download:

https://learning.edx.org/course/course-v1:USMx+ENCE607.4x+1T2021/home

Onimir89 commented 3 years ago

I can't download the course because I get HTTP Error 403: Forbidden.

Environment:

Windows 10 64-bit (Version 10.0.19042.867)
Python version: 3.9
youtube-dl version: latest version as of 3/21/2021
edx-dl version: 0.1.13

Steps to reproduce:

[screenshots attached: edx-1, edx-2]

Terr commented 3 years ago

It seems that the 403 is caused by the user agent being sent with the HTTP requests.

If I change it to something realistic (like Mozilla/5.0 (X11; Linux x86_64; rv:83.0) Gecko/20100101 Firefox/83.0) the error goes away. So edX is actively trying to block this program.

However I'm running into #670 after the change, so downloading videos doesn't work yet.
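For anyone who wants to try this outside of edx-dl, here is a minimal sketch using Python's standard urllib.request (the module that appears in the traceback). The helper name build_request and the example URL are assumptions made for illustration, not edx-dl code, and the snippet only builds the request without sending anything.

```python
import urllib.request

# Browser-like User-Agent string quoted in the comment above.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:83.0) Gecko/20100101 Firefox/83.0"


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request carrying the given User-Agent (hypothetical helper)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Example URL chosen for illustration; no network call is made here.
req = build_request("https://courses.edx.org/")

# urllib normalizes header names with str.capitalize(), hence "User-agent".
print(req.get_header("User-agent"))
```

Passing such a request to urllib.request.urlopen would send the browser-like header. Whether the server accepts it is up to edX and may change at any time.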

ribvl commented 3 years ago

I managed to get rid of this error by replacing 'User-Agent': 'edX-downloader/0.01' with 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36' in the headers in edx_dl.py, located at Python37\Lib\site-packages\edx_dl\edx_dl.py (line 425).

The script created the folders as expected, but there are no files inside any of them. It appears to be the same error as in issue #670. Did anyone find a solution?
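The substitution itself is a one-key change in a headers dictionary. The sketch below shows the idea on a stand-alone dict, assuming the two User-Agent strings quoted above. The helper name with_browser_user_agent and the extra "Accept" entry are made up for illustration and are not taken from edx_dl.py.

```python
# Value reported above as the one edx-dl sends by default.
OLD_UA = "edX-downloader/0.01"

# Replacement value quoted in the comment above.
NEW_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36"
)


def with_browser_user_agent(headers, user_agent=NEW_UA):
    """Return a copy of headers with User-Agent replaced (hypothetical helper)."""
    patched = dict(headers)  # copy, so the caller's dict is left untouched
    patched["User-Agent"] = user_agent
    return patched


# The "Accept" entry is an assumption, included only to show other keys survive.
original = {"User-Agent": OLD_UA, "Accept": "application/json"}
patched = with_browser_user_agent(original)
print(patched["User-Agent"])
```

Working on a copy keeps the original headers available for comparison, which helps when checking whether the 403 really depends on the User-Agent value.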

rowatc commented 3 years ago

Following. Same error for MITx's 6.86x.

nerobianchi commented 3 years ago

Same error

floviolleau commented 3 years ago

Check out this project: https://github.com/rehmatworks/edx-downloader

The author added a nice feature: the tool picks a random user agent from the list at https://fake-useragent.herokuapp.com/browsers/0.1.11
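The general idea of picking a random user agent from a list can be sketched in a few lines of standard-library Python. This is not how that project implements it. The two-entry list and the helper name random_headers are assumptions for illustration; a real tool would load a larger, maintained list.

```python
import random

# Small hand-written sample list, reusing the two strings quoted in this thread.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64; rv:83.0) Gecko/20100101 Firefox/83.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36",
]


def random_headers(rng=random):
    """Return headers with a randomly chosen User-Agent (hypothetical helper)."""
    return {"User-Agent": rng.choice(USER_AGENTS)}


headers = random_headers()
print(headers["User-Agent"])
```

Passing a seeded random.Random instance as rng makes the choice reproducible, which is handy when debugging.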

Kind regards