Bigemul closed this issue 6 years ago
I have been experiencing the same problem for two days now. Initially I thought it was a problem on my machine.
I tried with DemoX and two other courses and got the same error. I don't even know where I should look.
I am also getting this error. It was working perfectly until now. I think edx.org changed something in their web layout.
@Bigemul I would like to remind you that this is an open source project, which means the authors work on it for free in their spare time. There is no 24/7 support, and you use this software at your own risk. On the bright side, instead of ranting pointlessly, you have a unique opportunity to help the project and its users by digging into the problem and suggesting a solution yourself. If you feel like helping yourself and others, a good place to start is https://github.com/coursera-dl/edx-dl/blob/master/edx_dl/parsing.py, where the code that parses edX pages is located. Usually, something needs to be changed there. You may also want to look at the closed pull requests that modified parsing.py to see what kind of changes are usually required to keep up with changes on the edX side.
balta2ar: I wasn't trying to offend you, and I apologize if I did.
I just read through the entire source code and I'm not good enough to see what's wrong. At the same time, I was looking for a way to reduce the number of files.
Further on in your comment, you say that I was ranting. I wasn't, but your comment clearly shows that you are. If you just want to rant, avoid replying to people.
I was just pointing out that a lot of people are interested in Udemy courses and fewer people are interested in MOOCs like edX.
I'll take a look at it and let you know if I find something useful to keep it working. Thanks.
I can't guarantee that this works for every course, but my changes worked for me:
See changes here: https://github.com/coursera-dl/edx-dl/compare/master...xunilrj:master
Thanks @xunilrj. It worked for me.
@xunilrj Your solution worked for me too. Thanks!
Traceback (most recent call last):
File "/usr/local/bin/edx-dl", line 11, in <module>
sys.exit(main())
File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 1038, in main
all_units = extractor(all_urls, headers, file_formats)
File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 469, in extract_all_units_in_parallel
units = pool.map(mapfunc, urls)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 251, in map
return self.map_async(func, iterable, chunksize).get()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 567, in get
raise self._value
httplib.IncompleteRead: IncompleteRead(7483 bytes read, 709 more expected)
Exception in thread Thread-5 (most likely raised during interpreter shutdown):Exception in thread Thread-10 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 64, in get_page_contents
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 355, in read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 588, in read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 648, in _read_chunkedException in thread Thread-6 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 58, in get_page_contents
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 431, in open
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 449, in _open
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1240, in https_open
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1200, in do_open
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1132, in getresponse
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 453, in begin
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 410, in _read_status
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
Exception in thread Thread-2 (most likely raised during interpreter shutdown):Exception in thread Thread-11 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 64, in get_page_contents
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 355, in read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 588, in read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 703, in _safe_read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 391, in read
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 64, in get_page_contents
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 355, in read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 588, in read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 648, in _read_chunkedException in thread Thread-13 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 64, in get_page_contents
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 355, in read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 588, in read File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 113, in worker
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 65, in mapstar
File "/Library/Python/2.7/site-packages/edx_dl/edx_dl.py", line 438, in extract_units
File "/Library/Python/2.7/site-packages/edx_dl/utils.py", line 64, in get_page_contents
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 355, in read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 588, in read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 648, in _read_chunked
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 703, in _safe_read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 648, in _read_chunked
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 703, in _safe_read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 703, in _safe_read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 391, in read
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 391, in read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 648, in _read_chunked
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 703, in _safe_read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 391, in read
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 391, in read<type 'exceptions.TypeError'>: 'NoneType' object is not callable
That's what the terminal gives me.
@xunilrj, it worked for me too. Thanks.
I uninstalled edx-dl, reinstalled it and it worked. I compiled parsing.py to see if parsing.pyc would really improve speed.
The issue is not solved yet in the repository. You may have found a solution, @Bigemul, but it is not fixed in the project.
@xunilrj, I just took the liberty of creating a pull request with your change so that I can merge it.
Thanks to everyone who tested @xunilrj's changes and confirmed that they fix the issue.
@rbrito I still have this issue even though I downloaded the zip file. I changed some parts of the parsing file. I changed this:
    course_url = BASE_URL + course_soup.a['href']
to:
    if course_soup.a['href'].endswith('home') or course_soup.a['href'].endswith('home/'):
        course_url = course_soup.a['href']
    else:
        course_url = BASE_URL + course_soup.a['href']
and this:
    if course_url.endswith('info') or course_url.endswith('info/') or course_url.endswith('course') or course_url.endswith('course/'):
to:
    if course_url.endswith('info') or course_url.endswith('info/') or course_url.endswith('course') or course_url.endswith('course/') or course_url.endswith('home') or course_url.endswith('home/'):
With these changes I got the URL of the course, but I still have this issue:
    Downloading 0 section(s)
    Extracting all units information in parallel.
    No downloadable video found.
Please help me if you can.
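For anyone trying the same edit, here is a minimal sketch of the two URL checks described in the comment above, pulled out into standalone helpers so they can be tested without logging in. The helper names `resolve_course_url` and `is_course_url` are hypothetical and are not part of edx-dl. The `BASE_URL` value is an assumption based on the login URL shown in the logs. The sketch only reproduces the URL handling and does not address the "Downloading 0 section(s)" result.

```python
# Assumed value for illustration; edx-dl defines its own BASE_URL.
BASE_URL = "https://courses.edx.org"

# URL endings treated as a course page, per the modified check in the comment.
COURSE_SUFFIXES = ("info", "info/", "course", "course/", "home", "home/")


def resolve_course_url(href, base_url=BASE_URL):
    """Build the course URL from a dashboard link's href (hypothetical helper).

    Links ending in 'home' or 'home/' are used as-is, mirroring the
    commenter's change; every other href is treated as relative and
    prefixed with base_url.
    """
    if href.endswith(("home", "home/")):
        return href
    return base_url + href


def is_course_url(course_url):
    """Return True if the URL ends with one of the accepted suffixes."""
    # str.endswith accepts a tuple, which replaces the chain of `or` tests.
    return course_url.endswith(COURSE_SUFFIXES)


if __name__ == "__main__":
    for href in ("/courses/course-v1:HarvardX+CS50+X/course/",
                 "https://example.invalid/course/course-v1:HarvardX+CS50+X/home"):
        url = resolve_course_url(href)
        print(url, is_course_url(url))
```

Passing a tuple to `str.endswith` keeps the behaviour of the original `or` chain while making it easier to add another suffix if the dashboard links change again.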
Subject of the issue
Hi
When I try to download videos, I get:
"Downloading CS50's Introduction to Computer Science [course-v1:HarvardX+CS50+X/co]
Downloading 0 section(s)
Extracting all units information in parallel.
No downloadable video found."
Your environment
Steps to reproduce
edx-dl -u myadress@gmail.com https://courses.edx.org/courses/course-v1:HarvardX+CS50+X/course/ -o /Volumes/Test/e-learning/Edx
Expected behaviour
I'd like something to download
Actual behaviour
I get:
"Building initial headers for future requests.
Getting initial CSRF token.
Found CSRF token.
Logging into Open edX site: https://courses.edx.org/login_ajax
Extracting course information from dashboard.
Downloading CS50's Introduction to Computer Science [course-v1:HarvardX+CS50+X/co]
Downloading 0 section(s)
Extracting all units information in parallel.
No downloadable video found."