CPJKU / madmom

Python audio and music signal processing library
https://madmom.readthedocs.io

Fixing multiprocessing memory leak in downbeats.py #505

Open igalcohenhadria opened 2 years ago

igalcohenhadria commented 2 years ago

Moving the multiprocessing pool initialisation into the process() method: creating the multiprocessing pool in the __init__() section causes the pool never to be released, thus creating zombie processes.
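A minimal sketch of the pattern this PR proposes (not the actual madmom code; class and method names here are placeholders): the pool is created inside process() and torn down when the call returns, instead of living for the whole lifetime of the object.

```python
import multiprocessing as mp


def _work(x):
    # placeholder for the real per-item computation
    return x * x


class Processor:
    def __init__(self, num_threads=2):
        # only remember how many workers to use; no pool is created here
        self.num_threads = num_threads

    def process(self, data):
        # the pool lives only for the duration of this call; the
        # with-block closes it and reaps its worker processes on exit
        with mp.Pool(self.num_threads) as pool:
            return pool.map(_work, data)
```

With this shape, repeated calls to process() cannot accumulate worker processes, at the cost of paying the pool start-up overhead on every call.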

Changes proposed in this pull request

- Create the multiprocessing pool in process() instead of __init__(), so that the pool is released after each call and no zombie processes are left behind.

This pull request fixes #.

superbock commented 2 years ago

That's a bit weird, since I did exactly the opposite — moving the pool creation to __init__() — because it was leaking memory and file descriptors when being set up in process(); at least that is what the comment I left in processors.py suggests.

May I ask in which situations these zombie processes appear?

If there's a problem with zombies, we should probably aim for a dedicated process/pool destruction mechanism or use a dedicated pooling solution which does what we need.
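One possible shape for such a dedicated destruction mechanism (an assumption for illustration, not madmom's actual API): keep the pool created in __init__(), but expose an explicit close() so callers can release the workers deterministically.

```python
import multiprocessing as mp


def _square(x):
    # placeholder for the real per-item computation
    return x * x


class PooledProcessor:
    def __init__(self, num_threads=2):
        # long-lived pool, created once, as in the current madmom design
        self.pool = mp.Pool(num_threads)

    def process(self, data):
        return self.pool.map(_square, data)

    def close(self):
        # stop accepting new work and wait for the workers to exit,
        # so no zombie processes are left behind
        self.pool.close()
        self.pool.join()
```

This keeps the efficiency of a single long-lived pool while giving callers a deterministic way to clean it up.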

igalcohenhadria commented 2 years ago

Hello,

I was using the processor in a simple loop and ended up with a memory allocation failure after a few thousand iterations.

I tracked the problem down and fixed it for myself the way I sent in the pull request.

I'll try to send you more info tomorrow if you want.



superbock commented 2 years ago

Maybe you can paste a snippet of the code you're using. Using the processor in a loop doesn't sound too efficient... usually it's much better to instantiate the processor only once and then call it either in a loop or with all files at once and let it do the processing (in parallel) for you.
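In generic terms, the pattern superbock suggests looks like the sketch below (a stand-in function rather than madmom's actual feature extraction): build the expensive, pool-holding object once, then reuse it for every file in the loop, rather than re-creating it per iteration.

```python
import multiprocessing as mp


def _analyse(x):
    # stand-in for the per-file feature extraction
    return x * 2


def run(num_files=100):
    # create the pool once and reuse it for every file in the loop,
    # instead of building (and possibly leaking) a new pool per iteration
    with mp.Pool(2) as pool:
        return [pool.map(_analyse, [i, i + 1]) for i in range(num_files)]
```

The number of worker processes stays constant across iterations, whereas instantiating a new pool per file multiplies the start-up cost and, without cleanup, the leftover workers.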

igalcohenhadria commented 2 years ago

Here is the code and the running test in a YouTube video.

https://www.youtube.com/watch?v=4DY8wW19QrE

The code just loops freely and you will see zombies appearing in the process counter I made. Initialising once may be effective at limiting the number of zombies, but after python3 stops, zombies would still remain.

Also, good practice around multiprocessing is to use the pool inside a "with" block or to close the pool manually at some point.

An alternative to the patch I gave would be to keep the pool as an attribute in __init__() and close the pool in __del__().
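A sketch of that alternative (illustrative names, not madmom's code): the pool stays an attribute created in __init__(), and __del__() terminates it when the processor is garbage-collected. Note that relying on __del__ is fragile in Python, since it may run late or not at all at interpreter shutdown, so an explicit close() or a with-block is usually the safer choice.

```python
import multiprocessing as mp


def _identity(x):
    # placeholder for the real per-item computation
    return x


class Processor:
    def __init__(self, num_threads=2):
        # long-lived pool, created once per processor instance
        self.pool = mp.Pool(num_threads)

    def process(self, data):
        return self.pool.map(_identity, data)

    def __del__(self):
        # terminate and reap the workers when the processor is collected
        self.pool.terminate()
        self.pool.join()
```

An explicit `del proc` (or the instance going out of scope) then cleans up the workers, provided nothing else keeps a reference to the processor alive.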