coursera-dl / edx-dl

A simple tool to download video lectures from edx.org (and other openedx sites)
GNU Lesser General Public License v3.0

No sections/videos found #494

Open illuzioner opened 6 years ago

illuzioner commented 6 years ago

🚨Please review the Troubleshooting section before reporting any issue. Don't forget also to check the current issues to avoid duplicates.

Subject of the issue

edx-dl cannot find any downloadable content for the course "Online Marketing Strategies" (url: https://courses.edx.org/courses/course-v1:CurtinX+MKT5x+2T2018/course/)

It gives the error:

Downloading Online Marketing Strategies [course-v1:CurtinX+MKT5x+2T2018/co]
Downloading 0 section(s)
Extracting all units information in parallel.
No downloadable video found.

Content does exist, however, but it seems to be in an unusual format compared to other courses. For example, video content is on this page:

https://courses.edx.org/courses/course-v1:CurtinX+MKT5x+2T2018/courseware/8b27b66e2ec546268b2083fccc2353e6/e2ec0da9a9c6487aafdd16016d4cf449/?child=first

Your environment

Steps to reproduce

download the course: https://courses.edx.org/courses/course-v1:CurtinX+MKT5x+2T2018/course/

Expected behaviour

It should download all the videos etc.

Actual behaviour

It gives the above error.

monicamelchor commented 6 years ago

Hi!

I get the same error--double checked earlier threads and the parsing.py file is the most updated one. A bit of a newbie so not sure how to correct this--it seems like it's related to the updated edX structure. Would really appreciate any help!

More details:

Building initial headers for future requests.
Getting initial CSRF token.
Found CSRF token.
Logging into Open edX site: https://courses.edx.org/login_ajax
Email or password is incorrect.

: edx-dl -u https://courses.edx.org/courses/course-v1:ASUx+HST102x+2181B/course/
edx_dl version 0.1.7
Password:
Building initial headers for future requests.
Getting initial CSRF token.
Found CSRF token.
Logging into Open edX site: https://courses.edx.org/login_ajax
Extracting course information from dashboard.
Downloading Western Civilization: Ancient and Medieval Europe [course-v1:ASUx+HST102x+2181B/co]
Downloading 0 section(s)
Extracting all units information in parallel.
No downloadable video found.

Operating system: macOS High Sierra version 10.13.1
edx-dl version: 0.1.7

balta2ar commented 6 years ago

Could you try using edx from the master branch, not from pip? Maybe this pull request helps: https://github.com/coursera-dl/edx-dl/pull/489

monicamelchor commented 6 years ago

Hi @balta2ar, thanks a lot for replying! Very new to this--what do you mean by using edX from the master branch instead of from pip?

Does that mean running the following?: python edx-dl -u user@user.com -p https://courses.edx.org/courses/course-v1:ASUx+HST102x+2181B/course/

balta2ar commented 6 years ago

No, I mean grabbing the source code from this repository directly instead of installing a package from Pip, i.e. manually cloning the repo and running the script from there, something along these lines:

git clone https://github.com/coursera-dl/edx-dl
cd edx-dl
./edx-dl ...

satishdesh commented 6 years ago

Hello balta2ar, I am also sometimes having the same problem as Monica. I tried your guideline but am still getting this error:

TypeError 'NoneType' object is not subscriptable

This happens even though I have registered for the course and have gone through some videos. So what is the solution for this now?

balta2ar commented 6 years ago

@satishdesh Sorry to disappoint you guys, but there is no solution at the moment. Gotta wait until me or rbrito or someone else finds spare time to fix the issue. Meanwhile, please report what course you're trying to download (if it's a different course).

satishdesh commented 6 years ago

Hello balta2ar, thanks for the quick reply. I tried a few free courses, but every time I get the same error. It seems there are some changes in the edX site, so we will just wait for the fix. I raised issue #493 for this course: https://courses.edx.org/courses/course-v1:GalileoX+CAAD004X+1T2018a/course/ But it seems the same issue occurs for any registered or subscribed course.

monicamelchor commented 6 years ago

Hi balta2ar, thanks a lot. Tried to use edx-dl from the master branch but couldn’t get it to work either. Does it make a difference if youtube-dl or other requirements were downloaded via pip but edx-dl runs from the master branch?

I’m trying to download this course in particular but it’s archived: https://courses.edx.org/courses/UTAustinX/UT.2.02x/3T2014/course/

llpj commented 6 years ago

@monicamelchor Verify if parsing.py in your python dir > Lib\site-packages\edx-dl\ has following changes: https://github.com/coursera-dl/edx-dl/commit/6b04c1a08bd8e451c2ac06b0e4cf2719a00067c9
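
One way to find which parsing.py Python actually loads is to ask the import system. This is a minimal sketch, assuming the package is importable as edx_dl (the name shown in the log output earlier in this thread) and that it contains a parsing submodule. The helper locate_module is hypothetical and not part of edx-dl.

```python
import importlib.util


def locate_module(name):
    """Return the file path Python would load for `name`, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package (for example edx_dl) is not installed.
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Assumption: the installed package is importable as edx_dl.
    print(locate_module("edx_dl.parsing"))
```

If the printed path points into site-packages, that is the pip-installed copy to compare against the commit; if it prints None, the package is not visible to this interpreter.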

balta2ar commented 6 years ago

Does it make a difference if youtube-dl or other requirements were downloaded via pip but edx-dl runs from the master branch?

No, installing youtube-dl from pip should be OK and shouldn't cause problems. I think the cause is edx-dl and the new layout.

netship02 commented 6 years ago

@balta2ar works for me with master branch

monicamelchor commented 6 years ago

Hi @netship02 @balta2ar, it worked with the master branch (sorry, new to this). I figured out the problem was I wasn't entering my password properly in the code. Thanks a lot!! :)

monicamelchor commented 6 years ago

Just in case this is helpful in the future for anyone else who is also very new to this, this is what worked for me:

git clone https://github.com/coursera-dl/edx-dl
cd edx-dl
pip install -r requirements.txt
./edx-dl.py -u -p

satishdesh commented 6 years ago

Hello @netship02 @balta2ar @monicamelchor, nothing worked for me. @llpj Yes, that file is not updated to 6b04c1a. After updating edx-dl the change still does not appear, so is this commit not yet included in the update, or is something missing at my end?

netship02 commented 6 years ago

Seems like a bug: all files are downloaded multiple times, around 4 or 5 times.

llpj commented 6 years ago

@satishdesh Try uninstalling and then reinstalling using pip. It will work, but videos are downloaded 4 times, as reported above.

anmolsahoo25 commented 6 years ago

Could you guys clone my repo and try it out?

I am actually working on a fix and would require some feedback.

git clone https://github.com/anmolsahoo25/edx-dl
cd edx-dl
git checkout fix-0-courses-available
./edx-dl.py

And let me know if it is working or not.

vattybear commented 6 years ago

@anmolsahoo25 - the version from that repo seems to work, but the version reported is now 0.1.16? The latest version seems to be 0.1.17, which doesn't work (I get "No downloadable video found" with 0.1.17 and this course: course-v1:AWS+OTP-AWSD1+1T2018/course/). I was able to download that course with your version.

ithenis commented 6 years ago

Hi, I downloaded this project today and installed it. I can't get it to download anything for my EDX courses

dvrny commented 6 years ago

Hello - I'm getting the Downloading 0 section(s) issue on the Microsoft Azure App Service course: https://courses.edx.org/courses/course-v1:Microsoft+AZURE206x+1T2018/course/

I uninstalled and reinstalled after cloning master. I'm a Python and git newb, so please let me know if this should be working on this course and, if so, I'll investigate my install.

Windows 10 1709
Python 3.6.4
edx-dl 0.1.7
youtube-dl 2018.03.20

rbrito commented 6 years ago

I uploaded a new version to PyPI. I will be bold and remove any installation instructions that mention pip but don't include a virtual environment, BTW.

Oh, while I am on it, I will also become less involved with this project, precisely because of things like https://github.com/coursera-dl/edx-dl/issues/490#issuecomment-374038665.

I guess that @iemejia and @balta2ar will understand...

rbrito commented 6 years ago

It seems that people are discussing two different things here:

1 - No videos being found. Hopefully, version 0.1.8 fixes this (perhaps with the pull request that @balta2ar mentioned).
2 - Videos being downloaded multiple times. This is being tracked at https://github.com/coursera-dl/edx-dl/issues/487

Please, let's focus here on videos not being found.

illuzioner commented 6 years ago

@rbrito,

I understand your frustration with comments like that, which come out of nowhere. Please don't let them get to you. All of you who contribute to this project are doing a valuable service. It's not for the ingrates, quick-to-judge or ill-informed that you have done this work, but for those who are needful and grateful, like me and the many other learners who find this project very valuable.

While I am in the US, there are learners around the world who have spotty internet connections who rely on the work that you've done.

So, while you may need to step down for other reasons, please don't do it because you think nobody appreciates what you do. Sometimes we get frustrated with software. We all do. However, I do appreciate all the great work you and your cohorts have put into this very valuable project.

Thank you for all your great work!

iceberg53 commented 6 years ago

I completely agree with @illuzioner. Thank you for such great work.

iceberg53 commented 6 years ago

Thank you very much @rbrito for everything you did and are still doing for people all around the world.

rbrito commented 6 years ago

I will release a new version in a few hours and you can update the program to the newest version.

rbrito commented 6 years ago

Is this bug fixed with the new version? I would like to close this bug once it is confirmed to be fixed in the new version.

To update to the latest version, use pip install -U edx-dl.
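
Since several reports in this thread mention different version numbers, it may help to confirm which version is actually installed after updating. This is a minimal sketch using the standard importlib.metadata module (Python 3.8 or newer). The distribution names edx-dl and youtube-dl are taken from this thread, and installed_version is a hypothetical helper, not part of edx-dl.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run this with the same interpreter that runs edx-dl.
    for name in ("edx-dl", "youtube-dl"):
        print(name, installed_version(name))
```

A result of None means the package is not installed for this interpreter, which can happen when pip and the script use different Python installations.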

ithenis commented 6 years ago

hello @rbrito,

I've installed the updated version and tried to download 2 of my courses: the first course (Synchrotrons and X-Ray Free Electron Lasers) seems to have downloaded the material (although some of the pdf files don't open), and at the end it throws this error: urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:777)

second course (Mastering Quantum Mechanics Part I: Wave Mechanics) only downloaded the introductory video, nothing more. It says: urllib.error.HTTPError: HTTP Error 403: Forbidden

I'm already happy with what I've got (I needed the first course material for work), so thank you for the update. If it's possible to make it work with the other courses too, it would be fantastic.
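
A CERTIFICATE_VERIFY_FAILED error like the one above is often a sign that Python cannot find a suitable CA certificate bundle, although that is only one possible cause here. This is a minimal diagnostic sketch, using only the standard ssl module, that reports where this interpreter looks for CA certificates. It changes nothing, and ca_summary is a hypothetical helper, not part of edx-dl.

```python
import ssl


def ca_summary():
    """Describe where this Python looks for CA certificates."""
    paths = ssl.get_default_verify_paths()
    context = ssl.create_default_context()
    return {
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Certificates in a capath directory are loaded lazily, so this
        # count can be 0 even on a correctly configured system.
        "loaded_ca_certs": context.cert_store_stats()["x509_ca"],
    }


if __name__ == "__main__":
    for key, value in ca_summary().items():
        print(key, value)
```

If both cafile and capath print as None, the interpreter has no default certificate location configured, which would be worth checking before looking elsewhere.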

balta2ar commented 6 years ago

although some of the pdf files don't open

urllib.error.HTTPError: HTTP Error 403: Forbidden

I've seen similar symptoms when I tried downloading anything using edx-dl last time. Of course I can't know for sure, but to me it looks like edx is fed up with our scripts and just blocks by user agent (maybe depending on the course or on the exact CDN instance, I don't know). I'm afraid faking user agent may be the only option...
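
For readers unfamiliar with the idea, sending a different user agent with the standard library can look like the sketch below. This only illustrates the technique with urllib.request and is not how edx-dl builds its requests. The user agent string is an arbitrary example, build_request is a hypothetical helper, and whether edX actually filters on this header is the guess stated above, not a confirmed fact.

```python
import urllib.request

# Assumption: an example browser-like string; any value could be substituted.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"
)


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Create a request that sends `user_agent` instead of urllib's default."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    req = build_request("https://example.com/")
    # urllib stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
```

By default urllib identifies itself as Python-urllib followed by the Python version, which is easy for a server to recognize. Passing the request object to urllib.request.urlopen would send the replaced header.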

ithenis commented 6 years ago

I'm afraid I may not possess the necessary savviness to do that; I had to google what a CDN and a user agent are ... that's never a good sign :))

zkazsi commented 6 years ago

Unfortunately, the new release still doesn't fix the downloading issue for me on mitxpro. The courses are listed correctly with the '--list-courses' switch, but the download does not happen.

Complete message:

Building initial headers for future requests.
Getting initial CSRF token.
Found CSRF token.
Logging into Open edX site: https://mitxpro.mit.edu/login_ajax
Extracting course information from dashboard.
Downloading Quantitative Methods in Systems Engineering [course-v1:MITxPRO+SysEngx4+3T2017/co]
Downloading 0 section(s)
Extracting all units information in parallel.
No downloadable video found.

odyslam commented 6 years ago

Same problem with edx edge

Zhijun166 commented 6 years ago

urllib.error.URLError: <urlopen error Remote end closed connection without respo

Zhijun166 commented 6 years ago

I'm from China and I use a VPN agent. I can retrieve the video catalog, but it shows "urllib.error.URLError: <urlopen error Remote end closed connection without response" at the bottom.

sqageek commented 5 years ago

Not able to download any videos; I get the message below. I installed edx-dl just today.

edx_dl version 0.1.5
Building initial headers for future requests.
Getting initial CSRF token.
Found CSRF token.
Logging into Open edX site: https://courses.edx.org/login_ajax
Extracting course information from dashboard.
Downloading Computing in Python I: Fundamentals and Procedural Programming [course-v1:GTx+CS1301xI+1T2019/co]
Downloading 0 section(s)
Extracting all units information in parallel.
No downloadable video found.