scdl-org / scdl

Soundcloud Music Downloader
GNU General Public License v2.0
3.29k stars 331 forks source link

Track downloading refactoring, reducing amount of I/O operations, #493 #494

Closed es3n1n closed 2 months ago

es3n1n commented 2 months ago

This PR addresses #493.

~Finally no more temporary files! Everything is being processed in memory now.~ (tempfile is used for m4a encoding as ffmpeg doesn't support streaming of the ipod muxer :sadge:)

While trying to add that feature(outputting to stdout) I was initially planning on offloading all of the metadata work on ffmpeg. Still, it didn't really work because since we're now doing everything in memory(no tempfiles) we have to tell ffmpeg to encode stuff to stdout and we'll read that encoded stuff without spoiling any of our data on disk.

Because we have to pipe everything ffmpeg does to the stdout, i have discovered that ffmpeg in fact doesn't add the metadata for mp3 tracks if you're piping the result to stdout. Also, it doesn't support wav metadata at all(apparently because the way how everyone are doing that is not intended by the wav format :shrug:) and the list goes on. This is why i refactored the metadata stuff too and left it using the mutagen library, while working on the metadata things I also discovered a bug in the used library(this is why we aren't re-encoding original wavs and directly using the downloaded output. if you reencode wav with ffmpeg this library will crash) - https://github.com/quodlibet/mutagen/issues/654

Also because of these ffmpeg changes I somewhat rewrote some stuff related to the ffmpeg progress bars, from now on we have to have 2 threads that will be reading stdout and stderr simultaneously, a bit more detailed explanation can be found in the scdl.py file.

While working with the code, I refactored a bunch of stuff and it looks far better now IMO. I have also added a few more changes to make the work on this project a bit easier for the other contributors (added .idea to gitignore, added dev requirements list), hopefully we can introduce a flake8/mypy configuration for this project some day.

I have added a few more tests to test the downloading to stdout but this might be not enough, also please note that I marked test_m4a test to be skipped as I don't have a GO+ subscription.

I think that's it, sorry that it took me so long, I think I spent too much time trying to offload everything to ffmpeg and things were breaking here and there.