r0oth3x49 / udemy-dl

A cross-platform python based utility to download courses from udemy for personal offline use.
MIT License
4.85k stars, 1.2k forks

udemy-dl memory leak when skipping downloads #487

Closed dragetd closed 4 years ago

dragetd commented 4 years ago

When the download starts skipping courses, something odd happens: RAM usage keeps increasing over time until the kernel's OOM killer terminates the process. It looks like some sort of memory leak:

image

r0oth3x49 commented 4 years ago

@dragetd How? Can you spot the part of the code that is leaking memory? Because I don't see any.

dragetd commented 4 years ago

No idea. Skipping because of 'file already exists' does not seem to be the cause. The skips I had were because of 403s from udemy (I wonder how I got these xD).

I noticed the system swapping after some time and eventually killing the process. I then restarted and it worked at first, until I noticed the system swapping again some time later, which is when I took this screenshot.

Maybe it only gets triggered by 4xx server responses. Or maybe it was just a glitch and we can simply ignore it, up to you. Just thought I'd post the finding.

r0oth3x49 commented 4 years ago

Great, I'm closing the issue as invalid. I tried locally and don't see anything related.

irshsheik commented 4 years ago

I faced the same problem, although not many downloads were skipped. The memory is not released after a file is downloaded: the longer the course takes to download, the higher the RAM utilization. Ideally, the program's memory consumption should equal the memory required for running (which is pretty low) plus the memory required to download the current file (at 1080p, that won't be more than 1 GB).
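The expectation above, that memory should scale with the current file rather than with the whole course, is what a chunked (streaming) copy provides. The following is a minimal sketch under stated assumptions: `copy_in_chunks` and the 64 KiB chunk size are illustrative names and values, not taken from udemy-dl, and an in-memory buffer stands in for the network response.

```python
import io

CHUNK_SIZE = 64 * 1024  # illustrative chunk size, not a udemy-dl setting


def copy_in_chunks(source, destination, chunk_size=CHUNK_SIZE):
    """Copy source to destination one chunk at a time.

    Only one chunk is read into memory per iteration, so the copy loop's
    own memory use does not depend on the total size of the file.
    Returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# Stand-in for a network response body: 1 MiB of zero bytes.
fake_response = io.BytesIO(bytes(1024 * 1024))
# Stand-in for the output file; a real download would write to disk.
output = io.BytesIO()

copied = copy_in_chunks(fake_response, output)
print(copied)
```

If memory still grows with a loop like this, the growth is more likely coming from state kept between files than from the copy itself.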

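Here is the timeline:

1. This was after 5 minutes
   ![task after](https://user-images.githubusercontent.com/6179346/93511689-d2ea1200-f940-11ea-8824-0012862f4021.PNG)

2. After a few minutes
   ![processes](https://user-images.githubusercontent.com/6179346/93511701-d7162f80-f940-11ea-9bc0-0938ad447400.PNG)

3. This was when the program crashed. I have also attached the logs.
   ![task manager](https://user-images.githubusercontent.com/6179346/93511710-d8dff300-f940-11ea-809d-cccb84ccc46c.png)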

The logs when it crashed completely.

[2020-09-17 23:49:19,665][udemy-dl] WARNI Subtitle : '137 Shift.ro'  (download skipped)
[2020-09-17 23:49:19,665][udemy-dl] ERROR Reason : [('system library', 'fopen', 'Too many open files'), ('BIO routines', 'BIO_new_file', 'system lib'), ('x509 certificate routines', 'X509_load_cert_crl_file', 'system lib')]
[2020-09-17 23:49:19,724][udemy-dl] WARNI Lecture : '138 Shift Solution'  (download skipped)
[2020-09-17 23:49:19,725][udemy-dl] ERROR Reason : [('system library', 'fopen', 'Too many open files'), ('BIO routines', 'BIO_new_file', 'system lib'), ('x509 certificate routines', 'X509_load_cert_crl_file', 'system lib')]
[2020-09-17 23:49:19,869][udemy-dl] WARNI Asset : '138 DLL-Shift.js'  (download skipped)
[2020-09-17 23:49:19,870][udemy-dl] ERROR Reason : [('system library', 'fopen', 'Too many open files'), ('BIO routines', 'BIO_new_file', 'system lib'), ('x509 certificate routines', 'X509_load_cert_crl_file', 'system lib')]
[2020-09-17 23:49:19,911][udemy-dl] WARNI Subtitle : '138 Shift Solution.pt'  (download skipped)
[2020-09-17 23:49:19,911][udemy-dl] ERROR Reason : [('system library', 'fopen', 'Too many open files'), ('BIO routines', 'BIO_new_file', 'system lib'), ('x509 certificate routines', 'X509_load_cert_crl_file', 'system lib')]
[2020-09-17 23:49:19,946][udemy-dl] WARNI Subtitle : '138 Shift Solution.fr'  (download skipped)
[2020-09-17 23:49:19,947][udemy-dl] ERROR Reason : [('system library', 'fopen', 'Too many open files'), ('BIO routines', 'BIO_new_file', 'system lib'), ('x509 certificate routines', 'X509_load_cert_crl_file', 'system lib')]
[2020-09-17 23:49:19,987][udemy-dl] WARNI Subtitle : '138 Shift Solution.es'  (download skipped)
[2020-09-17 23:49:19,988][udemy-dl] ERROR Reason : [('system library', 'fopen', 'Too many open files'), ('BIO routines', 'BIO_new_file', 'system lib'), ('x509 certificate routines', 'X509_load_cert_crl_file', 'system lib')]
[2020-09-17 23:49:20,028][udemy-dl] WARNI Subtitle : '138 Shift Solution.de'  (download skipped)
[2020-09-17 23:49:20,028][udemy-dl] ERROR Reason : [('system library', 'fopen', 'Too many open files'), ('BIO routines', 'BIO_new_file', 'system lib'), ('x509 certificate routines', 'X509_load_cert_crl_file', 'system lib')]
[2020-09-17 23:49:20,060][udemy-dl] WARNI Subtitle : '138 Shift Solution.id'  (download skipped)
[2020-09-17 23:49:20,060][udemy-dl] ERROR Reason : [('system library', 'fopen', 'Too many open files'), ('BIO routines', 'BIO_new_file', 'system lib'), ('x509 certificate routines', 'X509_load_cert_crl_file', 'system lib')]
[2020-09-17 23:49:20,089][udemy-dl] WARNI Subtitle : '138 Shift Solution.it'  (download skipped)
[2020-09-17 23:49:20,090][udemy-dl] ERROR Reason : [('system library', 'fopen', 'Too many open files'), ('BIO routines', 'BIO_new_file', 'system lib'), ('x509 certificate routines', 'X509_load_cert_crl_file', 'system lib')]
[2020-09-17 23:49:20,120][udemy-dl] WARNI Subtitle : '138 Shift Solution.pl'  (download skipped)
[2020-09-17 23:49:20,120][udemy-dl] ERROR Reason : [('system library', 'fopen', 'Too many open files'), ('BIO routines', 'BIO_new_file', 'system lib'), ('x509 certificate routines', 'X509_load_cert_crl_file', 'system lib')]
[2020-09-17 23:49:20,150][udemy-dl] WARNI Subtitle : '138 Shift Solution.ro'  (download skipped)

r0oth3x49 commented 4 years ago

Thanks for the logs, I will check. It seems the issue only occurs when downloading subtitles.
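The `Too many open files` reason in the logs is raised at `fopen` while a certificate file is being loaded, which is consistent with the process having run out of file descriptors by that point. The sketch below does not claim to show where udemy-dl leaks; it only illustrates, on Linux (it reads `/proc/self/fd`), how handles that are kept open accumulate while a `with` block releases them. The helper names `open_fd_count`, `write_leaky`, and `write_safely` are hypothetical.

```python
import os
import resource
import tempfile


def open_fd_count():
    """Count this process's open file descriptors (Linux only)."""
    return len(os.listdir("/proc/self/fd"))


def write_leaky(paths):
    """Open each file without closing it; the handles are kept alive."""
    handles = []
    for path in paths:
        f = open(path, "w")
        f.write("data")
        handles.append(f)  # reference kept, so the descriptor stays open
    return handles


def write_safely(paths):
    """Open each file in a with-block so it is closed right away."""
    for path in paths:
        with open(path, "w") as f:
            f.write("data")


# The per-process limit that "Too many open files" refers to.
soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print("descriptor limit (soft, hard):", soft_limit, hard_limit)

with tempfile.TemporaryDirectory() as tmp:
    paths = [os.path.join(tmp, f"file{i}.txt") for i in range(20)]

    before = open_fd_count()

    write_safely(paths)
    safe_growth = open_fd_count() - before

    leaked = write_leaky(paths)
    leaky_growth = open_fd_count() - before

    for f in leaked:  # clean up so the example itself does not leak
        f.close()
    after_cleanup = open_fd_count() - before

print(safe_growth, leaky_growth, after_cleanup)
```

Watching the descriptor count of the running downloader in the same way (for example by listing `/proc/<pid>/fd`) would show whether it climbs with every file, subtitle or not.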

r0oth3x49 commented 4 years ago

@igagrock hopefully the current commit will resolve the issue.

irshsheik commented 4 years ago

@r0oth3x49 - I am gonna test right away. thanks

irshsheik commented 4 years ago

It is not fixed. The issue still remains.

1. The memory still stacks up for as long as the program keeps running.
2. When a file is not a video or a subtitle, the download simply ignores it and throws a NoneType warning.
3. When the program exits successfully or is interrupted with Ctrl+C, the memory is never released and the process is still running.
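One way to narrow down which code keeps memory alive is Python's standard `tracemalloc` module, which can compare two snapshots and rank source lines by growth. This is a minimal sketch with a deliberately leaky, hypothetical `process_item` function and `retained` list; nothing here is udemy-dl code.

```python
import tracemalloc

retained = []  # hypothetical stand-in for state that is never cleared


def process_item(index):
    """Simulate handling one download and accidentally keeping its data."""
    payload = bytearray(100_000)  # pretend this is downloaded content
    retained.append(payload)      # the "leak": the reference is never dropped


tracemalloc.start()
snapshot_before = tracemalloc.take_snapshot()

for i in range(50):
    process_item(i)

snapshot_after = tracemalloc.take_snapshot()

# Rank source lines by how much more memory they hold than before.
top_stats = snapshot_after.compare_to(snapshot_before, "lineno")
tracemalloc.stop()

for stat in top_stats[:3]:
    print(stat)

largest_growth = top_stats[0].size_diff
```

The top entry points at the line that allocated the retained data, which is the kind of evidence that helps locate a leak in a long-running download loop.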

Also, a command like the one below should download only English subtitles, but that does not happen.

python3 udemy-dl/udemy-dl.py https://www.udemy.com/course/js-algorithms-and-data-structures-masterclass --chapter-start 20 --lecture-start 138 --sub-lang en 

In my case, I am using WSL (Ubuntu 20.04) on Windows 10. I am not sure whether this happens only with this configuration or on any Windows/Linux system.