Open MissGorgeousTech opened 4 years ago
This also happens with https://courses.edx.org/courses/course-v1:GTx+HI2018xII+1T2019/course/
I fixed this by changing line 372 of parsing.py from `return section_soup.a['href']` to `return section_soup.ol`.
aprilchew: Tried it, doesn't work.
Try section_soup.ol, remove the ['href']
Hi,
A PR would be appreciated :)
Kind regards
Tigerjoy's solution worked for me. However, be careful not to add another line; I just replaced the original code.
Hello smart guys. Is there no one on GitHub who is able to fix the problems with downloading tutorials from the edX website? I have been trying since 2019 to use this script to download my tutorials from edX, and it just stops after displaying my course contents. It's really a pain for me, because I have courses I desperately need offline that have expired, and I am still learning to code and not experienced enough to help solve the download problems. Thanks
I can work on it and will send a PR soon. @ichit this is not the way you should be asking people to contribute. Be kind and respectful.
@Ankk98, a pull request that closes this would be welcome. Again: the simpler (and cleaner) the code, the better (since it will ease maintenance in the future when things break again--and they will).
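One possible shape for such a fix, as a sketch only: instead of indexing the first link unconditionally, check whether the section contains a link at all. The example below deliberately uses only the standard library's `html.parser` rather than BeautifulSoup, and `FirstLinkParser`, `first_href`, and `section_urls` are hypothetical names, not edx-dl functions. Whether skipping link-less sections is the right behavior for edx-dl is an assumption.

```python
from html.parser import HTMLParser


class FirstLinkParser(HTMLParser):
    """Remember the href of the first <a href=...> tag seen, if any."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        # Only record the first anchor that actually carries an href.
        if tag == "a" and self.href is None:
            self.href = dict(attrs).get("href")


def first_href(section_html):
    """Return the first link in a section's HTML, or None if there is none."""
    parser = FirstLinkParser()
    parser.feed(section_html)
    parser.close()
    return parser.href


def section_urls(sections_html):
    """Collect URLs, skipping sections without a link instead of crashing."""
    urls = []
    for html in sections_html:
        href = first_href(html)
        if href is not None:
            urls.append(href)
    return urls
```

The point of the guard is that a section without an `<a>` tag yields `None` rather than raising, so the caller can decide what to do with it.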
@Ankk98 I did not mean to disrespect anyone or to speak rudely. I understand perfectly well that no one gets paid for their work on this platform. I let my frustration show because I have been unable to download a course I desperately need for my thesis. I apologize to anyone who felt offended, and I thank everyone who helps make life easier for others.
Hi,
I ran into this issue on a course today, so I decided to open a PR...
Here is the PR on the table.
Does anyone know who the owner(s) of this project are? I see lots of PRs pending merge to master.
If anyone can help here, that would be nice :) Maybe @rbrito?
Thanks
@aprilchew Thank you very much. It does indeed fix the issue.
For those who are still having trouble, here are the steps that you can follow.
1. Clone or download as .zip **https://github.com/coursera-dl/edx-dl**
2. Extract the .zip using the **"Extract Here"** option.
3. Navigate to the folder **edx-dl-master/edx_dl**
4. Open **parsing.py** with your favorite text editor that displays line numbers.
5. Scroll down to line 372, and change **return section_soup.a['href']** to **return section_soup.ol**
Here is the before and after image for reference. The commented-out line 372 shows the original code, and line 373 is the change.
6. Go up a directory, into **edx-dl-master**
7. To download courses now, you must use the following: `python edx-dl.py -u user@user.com COURSE_URL`
NOTE: If you installed edx-dl using pip, the steps above won't work as written. Instead, navigate to your site-packages or dist-packages folder, find the edx_dl folder, open parsing.py, and make the same change as above.
EDIT: I've downloaded a few other courses as well, and this change has not yet broken any other downloads so far.
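For pip installs, the file to edit can be located from Python instead of searching site-packages by hand. This is a minimal sketch: `locate_module_file` is a hypothetical helper, and the package name `edx_dl` is taken from the traceback paths in this issue (`.../dist-packages/edx_dl/parsing.py`).

```python
import importlib.util
from pathlib import Path


def locate_module_file(package, filename):
    """Return the path of `filename` inside an installed package, or None."""
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    # A missing package gives None; a plain module has no search locations.
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / filename
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Prints the path of the installed parsing.py, or None if edx_dl is not installed.
    print(locate_module_file("edx_dl", "parsing.py"))
```

Run it with the same Python interpreter that pip installed edx-dl into, otherwise the package may not be found.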
@tigerjoy Hi, I followed the whole procedure as you described, but only an empty folder was created. What should I do next? Please advise.
Subject of the issue
When trying to download the videos of one specific course (listed below), edx-dl fails with the error TypeError: 'NoneType' object is not subscriptable. Other courses download fine without errors.
Traceback (most recent call last):
  File "/usr/local/bin/edx-dl", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/edx_dl.py", line 1023, in main
    for selected_course in selected_courses}
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/edx_dl.py", line 1023, in <dictcomp>
    for selected_course in selected_courses}
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/edx_dl.py", line 186, in get_available_sections
    sections = page_extractor.extract_sections_from_html(page, BASE_URL)
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/parsing.py", line 403, in extract_sections_from_html
    for i, section_soup in enumerate(sections_soup, 1)]
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/parsing.py", line 403, in <listcomp>
    for i, section_soup in enumerate(sections_soup, 1)]
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/parsing.py", line 372, in _make_url
    return section_soup.a['href']
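The last frame fails because `section_soup.a` is `None` when a section has no `<a>` tag (BeautifulSoup's attribute-style lookup returns `None` for a missing tag), and indexing `None` raises the reported `TypeError`. A minimal sketch that reproduces the message without BeautifulSoup, using `SimpleNamespace` objects as hypothetical stand-ins for the parsed sections:

```python
from types import SimpleNamespace

# Hypothetical stand-ins for parsed sections: `.a` mimics the first <a> tag,
# or None when the section contains no link.
section_with_link = SimpleNamespace(a={"href": "/courses/demo/section-1"})
section_without_link = SimpleNamespace(a=None)


def make_url(section_soup):
    """Same expression as the failing line 372 of parsing.py."""
    return section_soup.a["href"]


def try_make_url(section_soup):
    """Return the URL, or the error message if the lookup fails."""
    try:
        return make_url(section_soup)
    except TypeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_make_url(section_with_link))
    print(try_make_url(section_without_link))
```

This would explain why only some courses trigger the error: the failure depends on the section markup of the particular course page.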
environment
Steps to reproduce
https://courses.edx.org/courses/course-v1:IBM+PY0101EN+1T2020/course/