coursera-dl / edx-dl

A simple tool to download video lectures from edx.org (and other openedx sites)
GNU Lesser General Public License v3.0

AttributeError: 'HTMLParser' object has no attribute 'unescape' #688

Open FrankWorldview opened 1 year ago

FrankWorldview commented 1 year ago

Subject of the issue

Output directory: Downloaded
Traceback (most recent call last):
  File "D:\edx-dl\edx-dl.py", line 8, in <module>
    edx_dl.main()
  File "D:\edx-dl\edx_dl\edx_dl.py", line 1261, in main
    download(args, selections, all_units, headers)
  File "D:\edx-dl\edx_dl\edx_dl.py", line 993, in download
    course_name = directory_name(selected_course.name)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\edx-dl\edx_dl\utils.py", line 52, in directory_name
    result = clean_filename(initial_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\edx-dl\edx_dl\utils.py", line 126, in clean_filename
    s = h.unescape(s)
        ^^^^^^^^^^
AttributeError: 'HTMLParser' object has no attribute 'unescape'

csyezheng commented 1 year ago

Fixed: Fix HTMLParser.unescape error in Python 3.9 and above
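
For anyone hitting the same traceback in another script: `HTMLParser.unescape` was deprecated in Python 3.4 and removed in Python 3.9, and the standard-library replacement is `html.unescape`. The snippet below is only a minimal sketch of that substitution, not necessarily the exact change made in the fix; the helper name `unescape_entities` is hypothetical and does not come from edx-dl.

```python
import html


def unescape_entities(s: str) -> str:
    """Convert HTML character references (e.g. &amp;) to plain characters.

    Hypothetical helper for illustration. html.unescape() is the
    standard-library replacement for the removed HTMLParser().unescape().
    """
    return html.unescape(s)


if __name__ == "__main__":
    print(unescape_entities("Tom &amp; Jerry"))
```

In code shaped like the failing line in the traceback, this would amount to replacing `s = h.unescape(s)` with `s = html.unescape(s)` and adding `import html` at the top of the module, assuming `h` is used for nothing else.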

FrankWorldview commented 1 year ago

Thank you very much. The script can run now.

However, the "Downloaded" folder contains only empty subfolders. No HTML files were downloaded. Is this normal?

csyezheng commented 1 year ago

Sorry for taking so long to reply. Added support for downloading html files when the unit type is html, problem, discussion, survey, etc.