jiaaro / pydub

Manipulate audio with a simple and easy high level interface
http://pydub.com
MIT License

How does pydub interact with FFmpeg (Ctypes, wrapper, etc)? Can we speed up performance? #405

Open aspen1135 opened 5 years ago

aspen1135 commented 5 years ago

First off, I want to say this library is pretty great and easy to work with, and I want to pass my kudos on to the author who has been working hard on it. For the last 6 months or so, I have been working on a project that's oriented mostly around performing batch audio commands, with a high degree of user customization and added organizational tools. The idea was to create a model-based program that could automatically organize producers' and content creators' environments and help simplify any large repositories while running in the background tray.

However, I am not sure if I will be relying on PyDub for the development of the project since I am running into performance issues. For the program to be practical, it needs to convert thousands of files within a reasonable time frame so the user can drag and drop new libraries into their work environment and follow the output model. After much testing and even playing around with threading a bit, PyDub performs admirably but it does not seem to be the best performer.

There is another library called 'PyAV', which provides Pythonic bindings for FFmpeg's libraries. In 1:1 vanilla transcoding comparisons between it and PyDub, PyAV comes out on top with a whopping 17x performance advantage (3000 files @ 44 secs vs 7.5 mins). That got me wondering how exactly PyDub handles FFmpeg and the export process. I am by no means a back-end developer, and I don't have the expertise to dissect the source code of either library, especially where ctypes is involved (my C knowledge is very limited). However, performance differences of this kind have me wondering whether some overlooked optimizations could speed things up somewhere.

I have also compared PyDub against running FFmpeg in a subprocess shell. PyDub does perform much better than the shell method, but dispatching processes to a shell is incredibly slow in the first place. I remember giving it a couple of gigs to work with (7000 files?), and it was going to take over 30 hours to complete using FFmpeg in a shell. Some sample libraries from third-party VST vendors, like Native Instruments, can be 50GB or more, so this is obviously not practical for the long-term goal of my project.
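For reference, here is a minimal sketch of launching FFmpeg as a child process without going through a shell, by passing an argument list to `subprocess.run`. As far as pydub's source shows, pydub itself also starts the ffmpeg executable through Python's `subprocess` module rather than through ctypes bindings. The helper names `build_ffmpeg_cmd` and `transcode` are hypothetical, and nothing here claims to be faster than what was measured above; each file still costs one process launch.

```python
import subprocess


def build_ffmpeg_cmd(src, dst):
    """Return the argument list for a plain transcode of src to dst."""
    return [
        "ffmpeg",
        "-nostdin",            # do not wait for keyboard input
        "-loglevel", "error",  # print only errors
        "-y",                  # overwrite dst if it already exists
        "-i", src,             # input file
        dst,                   # ffmpeg infers the output format from the extension
    ]


def transcode(src, dst):
    """Run ffmpeg directly (no shell=True); raises CalledProcessError on failure."""
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
```

Calling `transcode("kick.wav", "kick.mp3")` assumes an `ffmpeg` executable is on the PATH; the file names are placeholders.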

The contending library, PyAV, performs great, but it's very frustrating to work with and its API documentation needs to be improved. It also has limited support for more advanced features like audio filtering, which has created an entirely new set of problems for my project. I wanted to add a few features such as strip silence, normalization, and more (which can be easily done in PyDub), but it seems that the library's community and developers are focusing more on its use for video.

I would like to keep using PyDub, but as of right now it seems too slow. Could we potentially see better performance from this library in the future, or is it somewhat limited by how the library's code was structured to interact with FFmpeg? Again, thank you for your work on this library. Its API and documentation have been easy to work with so far.

aspen1135 commented 5 years ago

And if you're wondering what code was being executed:

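import logging
import os

import pydub as pd  # assumed alias; the original snippet only shows pd.AudioSegment

logger = logging.getLogger(__name__)


def open_audio(input_path):
    # Decode the input file into an in-memory AudioSegment
    logger.info(f"Event: File open {input_path}")
    return pd.AudioSegment.from_file(input_path)


def export(audio, in_name, out_path, extension):
    out_name = os.path.join(out_path, in_name + extension)
    # With no format argument, export() encodes as mp3 whatever the extension is
    audio.export(out_name)
    print('EXPORTED: ', out_name)
    print()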

Both PyDub and PyAV were calling their functions within the same scope, as a side-by-side comparison.
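The threading experiment mentioned earlier might look roughly like the sketch below, which fans the same open-then-export step out over a thread pool and times the whole batch. The names `convert_with_pydub`, `convert_batch`, and `workers` are hypothetical, and whether threads help at all depends on how much of the time is spent waiting on ffmpeg rather than in Python code, which this thread does not establish.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def convert_with_pydub(src, dst):
    """Decode src and re-encode it to dst using pydub."""
    from pydub import AudioSegment  # lazy import: the batch helper below works without pydub
    # Assumption: the extension is also a valid ffmpeg format name (e.g. .wav, .mp3)
    fmt = os.path.splitext(dst)[1].lstrip(".")
    AudioSegment.from_file(src).export(dst, format=fmt)


def convert_batch(sources, out_dir, extension, convert=convert_with_pydub, workers=4):
    """Convert every file in sources concurrently; return (output paths, seconds elapsed)."""
    def job(src):
        stem = os.path.splitext(os.path.basename(src))[0]
        dst = os.path.join(out_dir, stem + extension)
        convert(src, dst)
        return dst

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(job, sources))  # map() preserves input order
    return outputs, time.perf_counter() - start
```

Passing a different callable as `convert` (for example a PyAV-based one) would let both libraries be timed with the same harness and the same file list.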

Mattwmaster58 commented 4 years ago

Not an expert by any means either, but doing things the way PyAV does would make things far more complicated. As with PyAV, the library would need to be compiled by the user after downloading, which sets the bar for using this package much higher.

However, if this issue goes anywhere, e.g. prebuilt binaries for most platforms, it's a lot more likely that it could be leveraged by pydub for a nice speed increase.