Open albertz opened 6 years ago
Interesting! Can you please investigate further to narrow the leak down to specific actions in the audioread library? We might have a shot at fixing this if you can point to exactly what's being leaked.
See the script. Actually, the only thing I use is audio_open, looped over a lot of FLAC files. I call it only indirectly via librosa.load(filename, sr=None), which is a very straightforward usage of audio_open.
I understand, but that still doesn't point to exactly where the leak is coming from. It would be awesome to have your help investigating exactly what gets leaked and when.
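One way to start narrowing this down is to sample the process's peak memory while repeating a single action. The following is a minimal sketch, not part of audioread: `peak_rss` and `track_growth` are hypothetical helper names, and it relies on the Unix-only `resource` module, so it will not run on Windows. It uses peak RSS rather than a Python-level tracer because a leak inside a native backend would not necessarily show up as Python objects.

```python
import gc
import resource


def peak_rss():
    """Peak resident set size of this process (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def track_growth(action, items, report_every=100):
    """Call action(item) for each item and sample peak RSS periodically.

    Returns a list of (items_processed, peak_rss) tuples.
    """
    samples = []
    for i, item in enumerate(items, start=1):
        action(item)
        if i % report_every == 0:
            # Collect first, to rule out garbage that is merely uncollected.
            gc.collect()
            samples.append((i, peak_rss()))
    return samples
```

Here `action` could be a function that opens one file with audio_open and iterates over its blocks, and `items` the list of FLAC filenames. If the peak keeps climbing across samples for one backend but stays flat for another, that points at the backend rather than at the calling code.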
Yes, that would be nice, but I'm not sure I have the time now (I already spent multiple hours debugging this issue, and need to proceed with my actual work). I think you should be able to reproduce the issue with my script. As there are so many issues with Gstreamer anyway, I would maybe even suggest removing it completely. My solution for now is to use PySoundFile instead of audioread. Btw., that is also what librosa is recommending.
OK! Please check back in if you ever get the chance to help.
I'm experiencing this issue with Beets 1.4.6 on Fedora 28.
I tried updating to Git master of audioread, as the unreleased version 2.1.7 contains an FD leak fix (https://github.com/beetbox/audioread/commit/72ed349c12a16ab741cb02abc4de8f2e8e7fe4ee). This change causes beet import either to segfault or to log the following traceback:
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/usr/lib64/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.6/site-packages/audioread/ffdec.py", line 69, in run
    data = self.fh.read(self.blocksize)
ValueError: I/O operation on closed file
Exception in thread Thread-7:
Traceback (most recent call last):
  File "/usr/lib64/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.6/site-packages/audioread/ffdec.py", line 69, in run
    data = self.fh.read(self.blocksize)
ValueError: PyMemoryView_FromBuffer(): info->buf must not be NULL
I haven't been able to reproduce this issue using audioread/decode.py.
That's troubling. @RyanMarcus, have you encountered this?
Perhaps, to reproduce the problem, one would need to decode several files in a row?
I've managed to reproduce it now. The crash appears to be triggered if the .close() method is called before reading is complete. I'll open a separate MR with a fix (edit: https://github.com/beetbox/audioread/pull/78)
Huh, that's strange -- it looks like a race. When the process is started, it seems like the reading process is delegated to a thread (i.e. QueueReaderThread). When close is called (possibly via __del__), my change closes the FDs, but potentially leaves the reader thread running.
I haven't tested this, but it would explain why a partial read is causing the issue.
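To illustrate the suspected race, here is a minimal sketch of one way to avoid it: tell the reader thread to stop and wait for it to finish before closing the file handle it reads from. This is not audioread's actual code or the fix in the linked pull request; `ReaderThread` and `close_safely` are hypothetical names, and an in-memory stream stands in for the decoder's pipe.

```python
import io
import queue
import threading


class ReaderThread(threading.Thread):
    """Hypothetical stand-in for a thread that reads blocks into a queue."""

    def __init__(self, fh, blocksize=1024):
        super().__init__(daemon=True)
        self.fh = fh
        self.blocksize = blocksize
        self.queue = queue.Queue()
        self.stop_event = threading.Event()
        self.error = None

    def run(self):
        try:
            while not self.stop_event.is_set():
                data = self.fh.read(self.blocksize)
                self.queue.put(data)
                if not data:
                    break  # end of stream
        except ValueError as exc:
            # Raised if fh is closed while this thread is still reading.
            self.error = exc


def close_safely(reader):
    """Stop the reader thread first, then close its file handle."""
    reader.stop_event.set()
    # With a real subprocess pipe, read() may block; the process would
    # have to be terminated first so that the pending read returns.
    reader.join()
    reader.fh.close()
```

With this ordering, the thread can never call read() on a handle that has already been closed, which is the situation the "I/O operation on closed file" traceback suggests.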
Here is a small script to demonstrate the issue. The memory consumption constantly grows (up to 8GB). See here for a discussion.