r0oth3x49 / udemy-dl

A cross-platform python based utility to download courses from udemy for personal offline use.
MIT License

HtmlParser not having unescape attribute error while downloading #571

Closed alokverma75 closed 3 years ago

alokverma75 commented 3 years ago

[i] : Downloading subtitle(s)
[i] : Downloading (001 Course Agenda.id)
[*] : 13.46KB/13.46KB 100.00% |##############################| 10.62kB/s
[i] : Downloaded (001 Course Agenda.id)
Traceback (most recent call last):
File "D:\Downloads\udemy-dl-1.0\udemy-dl-1.0\udemy-dl.py", line 606, in <module>
main()
File "D:\Downloads\udemy-dl-1.0\udemy-dl-1.0\udemy-dl.py", line 576, in main
udemy_obj.course_download(
File "D:\Downloads\udemy-dl-1.0\udemy-dl-1.0\udemy-dl.py", line 361, in course_download
self.downalod_subtitles(
File "D:\Downloads\udemy-dl-1.0\udemy-dl-1.0\udemy-dl.py", line 123, in downalod_subtitles
self.convert(filename=filename, keep_vtt=keep_vtt)
File "D:\Downloads\udemy-dl-1.0\udemy-dl-1.0\udemy\vtt2srt.py", line 105, in convert
timecode = self._generate_timecode(seq, unescapeHTML(line))
File "D:\Downloads\udemy-dl-1.0\udemy-dl-1.0\udemy\utils.py", line 196, in unescapeHTML
data = clean.unescape(s)
AttributeError: 'HTMLParser' object has no attribute 'unescape'

I am using Python 3.9. Is a specific version of Python required for this, such as 2.7?

softsmith commented 3 years ago

If it is of any help, I have the same problem. The output is as follows:

[i] : Downloading chapter : (1 of 14)
[i] : Chapter (01 Intro_ The 10 Days of React)
[i] : Found (10) lecture(s).

[i] : Lecture(s) : (1 of 10)
[i] : Downloading (001 What Problem Does React Solve)
[*] : 136.39MB/136.39MB 100.00% |##############################| 3.99MB/s
[i] : Downloaded (001 What Problem Does React Solve
)

[i] : Downloading asset(s)
[i] : Downloading (001 Approach-1-Reference-Code)
[i] : Downloaded (001 Approach-1-Reference-Code)

[i] : Downloading asset(s)
[i] : Downloading (001 Approach-2-Reference-Code)
[i] : Downloaded (001 Approach-2-Reference-Code)

[i] : Downloading subtitle(s)
[i] : Downloading (001 What Problem Does React Solve.en)
[*] : 17.69KB/17.69KB 100.00% |##############################| 22.48kB/s
[i] : Downloaded (001 What Problem Does React Solve
.en)
Traceback (most recent call last):
File "E:\Training\MyUdemy\Udemy-dl\udemy-dl-master\udemy-dl.py", line 611, in <module>
main()
File "E:\Training\MyUdemy\Udemy-dl\udemy-dl-master\udemy-dl.py", line 581, in main
udemy_obj.course_download(
File "E:\Training\MyUdemy\Udemy-dl\udemy-dl-master\udemy-dl.py", line 363, in course_download
self.download_subtitles(
File "E:\Training\MyUdemy\Udemy-dl\udemy-dl-master\udemy-dl.py", line 123, in download_subtitles
self.convert(filename=filename, keep_vtt=keep_vtt)
File "E:\Training\MyUdemy\Udemy-dl\udemy-dl-master\udemy\vtt2srt.py", line 108, in convert
timecode = self._generate_timecode(seq, unescapeHTML(line))
File "E:\Training\MyUdemy\Udemy-dl\udemy-dl-master\udemy\utils.py", line 202, in unescapeHTML
data = clean.unescape(s)
AttributeError: 'HTMLParser' object has no attribute 'unescape'

r0oth3x49 commented 3 years ago

@softsmith and @alokverma75, can you stick to Python 3.8? I haven't tested with Python 3.9 yet; I will test and fix it once I get some free time. For now it should work fine with Python 3.8.x.
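
For context on the traceback above: `HTMLParser.unescape` was deprecated in Python 3.4 and removed in Python 3.9, which is why the same code can run on 3.8 and fail on 3.9. The standard-library replacement is `html.unescape`. Below is a minimal sketch of a helper built on it; the name `unescape_html` is hypothetical, and this is not necessarily how the project itself fixed the issue.

```python
import html


def unescape_html(s):
    """Hypothetical stand-in for a helper like unescapeHTML in udemy/utils.py.

    html.unescape (available since Python 3.4) converts named and numeric
    character references back to the characters they represent.
    """
    return html.unescape(s)


if __name__ == "__main__":
    # A subtitle line containing escaped characters.
    print(unescape_html("Tom &amp; Jerry &lt;3"))
```

Because `html.unescape` is a plain function, no `HTMLParser` instance is needed, so the call does not depend on a method that newer Python versions no longer provide.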

alokverma75 commented 3 years ago

Hi Nasir,

Thanks a lot for the reply.

I moved to Python 3.8.6, and now, after installing two more required packages (m3u8 and cloudscraper), it gives the error below after logging in successfully with a cookie:

Traceback (most recent call last):
File "udemy-dl.py", line 606, in <module>
main()
File "udemy-dl.py", line 576, in main
udemy_obj.course_download(
File "udemy-dl.py", line 283, in course_download
course = udemy.course(
File "E:\Udemy Courses\udemy-dl-1.0\udemy\udemy.py", line 40, in course
return Udemy(url, username, password, cookies, basic, skip_hls_stream, callback)
File "E:\Udemy Courses\udemy-dl-1.0\udemy\internal.py", line 65, in __init__
super(InternUdemyCourse, self).__init__(*args, **kwargs)
File "E:\Udemy Courses\udemy-dl-1.0\udemy\shared.py", line 300, in __init__
self._fetch_course()
File "E:\Udemy Courses\udemy-dl-1.0\udemy\internal.py", line 78, in _fetch_course
self._info = self._real_extract(self._url, skip_hls_stream=self._skip_hls_stream)
File "E:\Udemy Courses\udemy-dl-1.0\udemy\extract.py", line 603, in _real_extract
course_id, course_info = self._extract_course_info(url)
File "E:\Udemy Courses\udemy-dl-1.0\udemy\extract.py", line 274, in _extract_course_info
portal_name, course_name = self._course_name(url)
TypeError: cannot unpack non-iterable NoneType object

Is a specific Python version needed for this?

Regards Alok

On Sun, Oct 11, 2020 at 3:11 PM Nasir Khan notifications@github.com wrote:

Closed #571 https://github.com/r0oth3x49/udemy-dl/issues/571 via 4608ec3 https://github.com/r0oth3x49/udemy-dl/commit/4608ec3e29e94e74ef7146b894e97002498acd2b .


rudluff commented 3 years ago

@alokverma75 to fix the "cannot unpack non-iterable NoneType object" error, you have to include "www" in the course URL. It can't be "https://udemy.com/course-name"; it has to be "https://www.udemy.com/course-name".
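
The workaround above can also be applied in code before the URL is handed to the downloader. The sketch below is only an illustration using the standard library: `ensure_www` is a hypothetical helper, not part of udemy-dl. It rests on an assumption drawn from the traceback, namely that `_course_name` returned `None` for the bare "udemy.com" host, which is what made the tuple unpacking fail.

```python
from urllib.parse import urlsplit, urlunsplit


def ensure_www(url):
    """Return the URL with a "www." prefix if its host is the bare udemy.com.

    Hypothetical helper. Any other host, including one that already starts
    with "www." or uses a different subdomain, is returned unchanged.
    """
    parts = urlsplit(url)
    host = parts.netloc
    if host.lower() == "udemy.com":
        host = "www." + host
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(ensure_www("https://udemy.com/course-name"))
```

Only the exact bare host is rewritten, so URLs on other subdomains pass through untouched.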