Gaarv / kadenze-dl

Small application to download Kadenze (https://www.kadenze.com) videos for courses you enrolled in
MIT License

Can't download archived courses #19

Closed fatimaj786 closed 3 years ago

fatimaj786 commented 3 years ago

error can't start download ...

Traceback (most recent call last):
  File "kadenze-dl.py", line 10, in <module>
    main()
  File "kadenze-dl.py", line 5, in main
    client = KadenzeClient()..........

Gaarv commented 3 years ago

The traceback is incomplete, but this kind of error usually means the required dependencies are not installed in your environment (pip install -r requirements.txt). Check the README for more information.

No issue on my side with a fresh install.

fatimaj786 commented 3 years ago

Oh OK, yeah, I tried multiple times. Requirements already satisfied. Also tried a fresh install... I don't know, thanks.

fatimaj786 commented 3 years ago

I can sign in but not download.

Gaarv commented 3 years ago

OK, thanks for the added info. Does the download directory set in the configuration exist? It is not created at runtime. Also, check that you don't get an exception while installing the requirements, particularly on lxml, which needs to be installed with binaries on some systems.
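Both checks can be done quickly from a Python prompt. The snippet below is only a minimal sketch: the function names and the example path are made up for illustration and are not part of kadenze-dl. Importing lxml.etree is used because it is the compiled part of lxml, so a failed binary install should show up there.

```python
from pathlib import Path


def download_dir_exists(path: str) -> bool:
    """Return True if the configured download directory already exists."""
    # The tool does not create this directory at runtime, so it must be there beforehand.
    return Path(path).expanduser().is_dir()


def lxml_is_usable() -> bool:
    """Return True if lxml and its compiled extension can be imported."""
    try:
        import lxml.etree  # noqa: F401  (compiled module; fails if binaries are missing)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Hypothetical path; replace it with the directory from your own configuration.
    print("download directory exists:", download_dir_exists("~/kadenze-videos"))
    print("lxml usable:", lxml_is_usable())
```

If the first line prints False, create the directory by hand before running the downloader again.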

fatimaj786 commented 3 years ago

OK, thanks. I'll check.

fatimaj786 commented 3 years ago

Ok, the problem is with the TensorFlow course. It has been archived and can't be downloaded. Others can be downloaded easily. Is there a way to download archived courses? Thanks.

Gaarv commented 3 years ago

I see. Indeed, the landing page for "courses" lists only enrolled courses. Archived ones are in another tab, which is triggered from JavaScript.

It should be possible, just not with the current version. I might work on this when I have the chance.

ilmarilahti commented 3 years ago

Hi!

I had the same problem with downloading a couple of permanently archived courses that I really liked. A massive hack to get them working is to add the following kind of array after line 27 in kadenzeclient.py: courses = ["loop-repetition-and-variation-in-music-i"]

You get the course names from the URL of archived courses the same way you'd get them from the other courses. Otherwise the script seems to work perfectly with the archived courses :)
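For anyone unsure which part of the URL is the course name, here is a small sketch that pulls it out. It assumes the course page URL has the form https://www.kadenze.com/courses/<name>/..., which is my reading of the site and not something confirmed in this thread; course_slug is a made-up helper, not a function from kadenze-dl.

```python
from urllib.parse import urlparse


def course_slug(url: str) -> str:
    """Return the path segment that follows 'courses' in a course URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "courses" not in parts or parts.index("courses") + 1 >= len(parts):
        raise ValueError(f"no course name found in URL: {url}")
    return parts[parts.index("courses") + 1]


# Assumed URL shape; only the course name itself comes from the comment above.
url = "https://www.kadenze.com/courses/loop-repetition-and-variation-in-music-i/info"
courses = [course_slug(url)]
print(courses)
```

The resulting list is the kind of value to paste into the hack described above.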

Gaarv commented 3 years ago

Nice workaround ;) Indeed, the actual challenge is getting the list of archived courses, which is only accessible from a "real browser" since the site uses JavaScript for that.

I might have to move to browser emulation for that, which will take some time.

Gaarv commented 3 years ago

I have a working version in branch "playwright": https://github.com/Gaarv/kadenze-dl/tree/playwright if you need it.

Dependencies have changed so a pip install -r requirements.txt is needed after cloning the branch.

I still have a few tweaks to do (mostly logging, some tests and possible errors) but I was able to download archived courses just with the "all" keyword in configuration.

fatimaj786 commented 3 years ago

Thank you both...

ilmarilahti commented 3 years ago

That was fast! Thank you :D

Gaarv commented 3 years ago

There's still some work to do, like checking the correct order of videos and such, but it should be OK by the end of the week I think :)

Gaarv commented 3 years ago

A new version has been merged. Since the data inside the website includes much more information, such as ordering, session and video names, previously downloaded courses will have different names.

On the other hand, it will be much more reliable than before, when I had to rely on the video filename and not every course followed the naming convention.
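To illustrate why names built from metadata are more predictable than names taken from the original video file, here is a rough sketch. The naming scheme, the function and the sample session and video titles are all hypothetical; this is not the format kadenze-dl actually writes.

```python
import re


def video_filename(session_index: int, session_name: str,
                   video_index: int, video_title: str, ext: str = "mp4") -> str:
    """Build a sortable relative path from ordering, session name and video title."""

    def clean(text: str) -> str:
        # Lower-case, then collapse every run of non-alphanumeric characters to one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # Zero-padded indices keep sessions and videos in course order when sorted by name.
    return (f"{session_index:02d}-{clean(session_name)}/"
            f"{video_index:02d}-{clean(video_title)}.{ext}")


# Hypothetical session and video titles.
print(video_filename(1, "Introduction", 3, "What is a Loop?"))
# prints "01-introduction/03-what-is-a-loop.mp4"
```

With a scheme of this kind, the order and titles come from the site's data, so a course whose uploaded files were named inconsistently still ends up sorted correctly.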

I advise either downloading only what's needed or redownloading everything.

Anyway, it was a much needed update :)

fatimaj786 commented 3 years ago

Can download half the course and then get this error...

(node:18940) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag --unhandled-rejections=strict (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 45)
(node:18940) UnhandledPromiseRejectionWarning: Error: EPIPE: broken pipe, write
    at Socket._write (internal/net.js:54:25)
    at doWrite (_stream_writable.js:403:12)
    at writeOrBuffer (_stream_writable.js:387:5)
    at Socket.Writable.write (_stream_writable.js:318:11)
    at Transport.send (C:\Python39\Lib\site-packages\playwright\driver\package\lib\protocol\transport.js:47:25)
    at DispatcherConnection.dispatcherConnection.onmessage (C:\Python39\Lib\site-packages\playwright\driver\package\lib\cli\driver.js:43:59)
    at DispatcherConnection.sendMessageToClient (C:\Python39\Lib\site-packages\playwright\driver\package\lib\dispatchers\dispatcher.js:136:14)
    at new Dispatcher (C:\Python39\Lib\site-packages\playwright\driver\package\lib\dispatchers\dispatcher.js:63:30)
    at new JSHandleDispatcher (C:\Python39\Lib\site-packages\playwright\driver\package\lib\dispatchers\jsHandleDispatcher.js:25:9)
    at Object.createHandle (C:\Python39\Lib\site-packages\playwright\driver\package\lib\dispatchers\elementHandlerDispatcher.js:22:90)
(node:18940) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag --unhandled-rejections=strict (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 46)
(node:18940) UnhandledPromiseRejectionWarning: Error: EPIPE: broken pipe, write
    at Socket._write (internal/net.js:54:25)
    at doWrite (_stream_writable.js:403:12)
    at writeOrBuffer (_stream_writable.js:387:5)
    at Socket.Writable.write (_stream_writable.js:318:11)
    at Transport.send (C:\Python39\Lib\site-packages\playwright\driver\package\lib\protocol\transport.js:47:25)
    at DispatcherConnection.dispatcherConnection.onmessage (C:\Python39\Lib\site-packages\playwright\driver\package\lib\cli\driver.js:43:59)
    at DispatcherConnection.sendMessageToClient (C:\Python39\Lib\site-packages\playwright\driver\package\lib\dispatchers\dispatcher.js:136:14)
    at new Dispatcher (C:\Python39\Lib\site-packages\playwright\driver\package\lib\dispatchers\dispatcher.js:63:30)
    at new JSHandleDispatcher (C:\Python39\Lib\site-packages\playwright\driver\package\lib\dispatchers\jsHandleDispatcher.js:25:9)
    at Object.createHandle (C:\Python39\Lib\site-packages\playwright\driver\package\lib\dispatchers\elementHandlerDispatcher.js:22:90)
(node:18940) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag --unhandled-rejections=strict (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 47)