coursera-dl / edx-dl

A simple tool to download video lectures from edx.org (and other openedx sites)
GNU Lesser General Public License v3.0
1.93k stars, 638 forks

(Solved) "No downloadable video found." #486

Closed. Bigemul closed this issue 6 years ago

Bigemul commented 6 years ago

🚨Please review the Troubleshooting section before reporting any issue. Don't forget also to check the current issues to avoid duplicates.

Subject of the issue

Hi

When I try to download videos, I get:

Downloading CS50's Introduction to Computer Science [course-v1:HarvardX+CS50+X/co]
Downloading 0 section(s)
Extracting all units information in parallel.
No downloadable video found.

Your environment

Steps to reproduce

edx-dl -u myadress@gmail.com https://courses.edx.org/courses/course-v1:HarvardX+CS50+X/course/ -o /Volumes/Test/e-learning/Edx

Expected behaviour

I'd like something to download

Actual behaviour

I get:

Building initial headers for future requests.
Getting initial CSRF token.
Found CSRF token.
Logging into Open edX site: https://courses.edx.org/login_ajax
Extracting course information from dashboard.
Downloading CS50's Introduction to Computer Science [course-v1:HarvardX+CS50+X/co]
Downloading 0 section(s)
Extracting all units information in parallel.
No downloadable video found.

abadfr commented 6 years ago

I have been experiencing the same for two days now. I initially thought it was a problem on my machine.

Bigemul commented 6 years ago

I tried with the DemoX and two other courses and got the same error. I don't even know where to look.

damilareisaac commented 6 years ago

I am also getting this error. It was working perfectly until now. I think edx.org changed something in their web layout.

balta2ar commented 6 years ago

@Bigemul I would like to remind you that this is an Open Source project, which means the authors work on it for free in their spare time. There is no 24/7 support, and you use this software at your own risk. On the good side, if you're feeling constructive, instead of pointless ranting you have a unique opportunity to help the project and its users by digging into the problem and suggesting a solution yourself. If you feel like actually helping yourself and others, a good place to start is https://github.com/coursera-dl/edx-dl/blob/master/edx_dl/parsing.py, where the code that parses EDX pages is located. Usually, something needs to be changed there. You may also want to look at the closed pull requests that modified parsing.py to see what kind of changes are usually required to keep up with changes on the EDX side.

Bigemul commented 6 years ago

balta2ar: I wasn't trying to offend you, and I apologize if you were offended.

I just read through the entire source code, and I'm not good enough to see what's wrong. At the same time, I was looking for a way to reduce the number of files.

Further in your comment, you say that I was ranting. I wasn't, but your comment clearly shows that you are. If you just want to rant, avoid replying to people.

I was just pointing out that a lot of people are interested in Udemy courses and fewer people are interested in MOOCs like Edx.

I'll take a look at it and tell if I find something useful in order to keep it working. Thanks.

xunilrj commented 6 years ago

I can't guarantee that this works for every course, but my changes worked for me:

See changes here: https://github.com/coursera-dl/edx-dl/compare/master...xunilrj:master

cksajil commented 6 years ago

Thanks @xunilrj. It worked for me.

brucelyu commented 6 years ago

@xunilrj Your solution worked for me too. Thanks!

Bigemul commented 6 years ago

Traceback (most recent call last):
  File "/usr/local/bin/edx-dl", line 11, in <module>
    sys.exit(main())
  File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 1038, in main
    all_units = extractor(all_urls, headers, file_formats)
  File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 469, in extract_all_units_in_parallel
    units = pool.map(mapfunc, urls)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
httplib.IncompleteRead: IncompleteRead(7483 bytes read, 709 more expected)
Exception in thread Thread-5 (most likely raised during interpreter shutdown):Exception in thread Thread-10 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
  File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
  File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 64, in get_page_contents
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 355, in read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 588, in read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 648, in _read_chunkedException in thread Thread-6 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
  File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
  File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 58, in get_page_contents
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 431, in open
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 449, in _open
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1240, in https_open
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1200, in do_open
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1132, in getresponse
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 453, in begin
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 410, in _read_status
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
Exception in thread Thread-2 (most likely raised during interpreter shutdown):Exception in thread Thread-11 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
Traceback (most recent call last):

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
  File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
  File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 64, in get_page_contents
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 355, in read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 588, in read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 703, in _safe_read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 391, in read
<type 'exceptions.TypeError'>: 'NoneType' object is not callable

Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
  File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
  File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 64, in get_page_contents
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 355, in read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 588, in read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 648, in _read_chunkedException in thread Thread-13 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
  File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
  File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 64, in get_page_contents
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 355, in read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 588, in read  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
  File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
  File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 64, in get_page_contents
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 355, in read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 588, in read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 648, in _read_chunked
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 703, in _safe_read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 648, in _read_chunked
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 703, in _safe_read

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 703, in _safe_read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 391, in read
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 391, in read

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 648, in _read_chunked
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 703, in _safe_read
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 391, in read

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 391, in read<type 'exceptions.TypeError'>: 'NoneType' object is not callable

That's what the terminal gives me.
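
The first traceback above ends in `httplib.IncompleteRead`, raised while `get_page_contents` was reading a page inside the worker pool. The later `'NoneType' object is not callable` errors are each marked "most likely raised during interpreter shutdown", so they look like fallout from that first failure rather than a separate cause. Below is a minimal sketch of retrying a truncated read, written for Python 3 (where `httplib` became `http.client`). `read_with_retry` and its parameters are hypothetical names for illustration and are not part of edx-dl:

```python
import time
from http.client import IncompleteRead
from urllib.request import urlopen


def read_with_retry(url, opener=urlopen, attempts=3, delay=1.0):
    """Return the body of `url`, retrying when the read is cut short.

    Hypothetical helper for illustration only; edx-dl's own
    get_page_contents may be structured differently.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            response = opener(url)
            try:
                return response.read()
            finally:
                response.close()
        except IncompleteRead as error:
            # The server closed the connection before sending the full body.
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay)
    # Every attempt was truncated: surface the last error to the caller.
    raise last_error
```

The `opener` parameter exists only so the function can be exercised without network access. Whether retrying is enough depends on why the server truncated the response, which the traceback does not show.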

rafaelmendy commented 6 years ago

@xunilrj, it worked for me too. Thanks.

Bigemul commented 6 years ago

I uninstalled edx-dl, reinstalled it and it worked. I compiled parsing.py to see if parsing.pyc would really improve speed.
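
For reference, a module can be byte-compiled explicitly with the standard library's `py_compile`. The sketch below assumes Python 3, and `compile_module` is a hypothetical wrapper name. Note that a `.pyc` file only shortens the time needed to load a module; the code does not run any faster once it is loaded.

```python
import py_compile


def compile_module(source_path):
    """Byte-compile one source file and return the path of the .pyc written."""
    # doraise=True raises py_compile.PyCompileError on a syntax error
    # instead of only printing the error to stderr.
    return py_compile.compile(source_path, doraise=True)
```

For example, `compile_module("edx_dl/parsing.py")` would write the compiled file under `__pycache__`, assuming that relative path matches your installation.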

rbrito commented 6 years ago

The issue is not solved yet in the repository. You may have found a solution, @Bigemul, but it is not fixed in the project.

@xunilrj, I just took the liberty of creating a pull request with your change, so that I can merge it.

Thanks to all who tested @xunilrj's changes and confirmed that they fix the issue.

autoking77 commented 2 years ago

@rbrito I still have this issue even though I downloaded the zip file. I changed some parts of the parsing file. I changed this:

    course_url = BASE_URL + course_soup.a['href']

to:

    if course_soup.a['href'].endswith('home') or course_soup.a['href'].endswith('home/'):
        course_url = course_soup.a['href']
    else:
        course_url = BASE_URL + course_soup.a['href']

and this:

    if course_url.endswith('info') or course_url.endswith('info/') or course_url.endswith('course') or course_url.endswith('course/'):

to:

    if course_url.endswith('info') or course_url.endswith('info/') or course_url.endswith('course') or course_url.endswith('course/') or course_url.endswith('home') or course_url.endswith('home/'):

With these changes I got the URL of the course, but I still have this issue:

Downloading 0 section(s)
Extracting all units information in parallel.
No downloadable video found.

Please help me if you can.
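
The two edits described in the comment above repeat the same suffix test several times. Here is a hedged sketch of how they could be folded into small helpers. The value of `BASE_URL` and both function names are assumptions for illustration, not the code in `edx_dl/parsing.py`. It only covers building and recognising the course URL, and the comment reports that `Downloading 0 section(s)` persists after that step, so this alone is not a fix.

```python
BASE_URL = "https://courses.edx.org"  # assumed value, for illustration only

# Suffixes taken from the conditions quoted in the comment above.
COURSE_PAGE_SUFFIXES = ("info", "info/", "course", "course/", "home", "home/")


def build_course_url(href, base_url=BASE_URL):
    """Mirror the first edit: an href ending in 'home' is used unchanged."""
    if href.endswith(("home", "home/")):
        return href
    return base_url + href


def is_course_page(course_url):
    """Mirror the second edit; str.endswith accepts a tuple of suffixes."""
    return course_url.endswith(COURSE_PAGE_SUFFIXES)
```

Passing a tuple to `str.endswith` keeps the list of accepted suffixes in one place, so a future layout change would need only one line edited.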