beetbox / audioread

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python
MIT License

gstdec: Avoid leaking memory when reading audio data #84

Closed ssssam closed 5 years ago

ssssam commented 5 years ago

We were reading audio data with the Gst.Buffer.extract_dup() method. This allocates new memory using g_malloc() and returns it to the caller. The memory needs to be freed with g_free(), but the PyGObject bindings do not do this, so the memory is leaked.

We can avoid the problem by reading the audio data directly from the underlying Gst.Memory object. In this case the Python interpreter is responsible for copying the data, so it is able to correctly free the memory once it is no longer needed.
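For illustration, here is a minimal sketch of the approach described above; it is not the exact patch. The helper name `read_buffer_bytes` and the `read_flag` parameter are hypothetical. The flag (expected to be `Gst.MapFlags.READ`) is passed in by the caller so that the sketch itself does not need to import GStreamer. It relies on `Gst.Buffer.get_all_memory()`, `Gst.Memory.map()` and `Gst.Memory.unmap()`, and assumes that `info.data` may be either `bytes` or a `memoryview` depending on the bindings version.

```python
def read_buffer_bytes(buf, read_flag):
    """Return a Python-owned copy of the data held by a Gst.Buffer-like object.

    `read_flag` is expected to be Gst.MapFlags.READ (hypothetical parameter,
    used here so the sketch does not import GStreamer directly).
    """
    # Merge all memory blocks of the buffer into a single Gst.Memory.
    mem = buf.get_all_memory()

    # Map the memory for reading. Through PyGObject this returns a
    # (success, Gst.MapInfo) pair.
    success, info = mem.map(read_flag)
    if not success:
        raise RuntimeError("unable to map buffer memory for reading")

    try:
        # bytes() copies the data if info.data is a memoryview, and is a
        # no-op if it is already a bytes object. Either way the result is
        # owned by Python and stays valid after unmap().
        return bytes(info.data)
    finally:
        # Always release the mapping, even if the copy fails.
        mem.unmap(info)
```

In an appsink "new-sample" callback this might be called as `read_buffer_bytes(sample.get_buffer(), Gst.MapFlags.READ)`, where `sample` is the result of pulling a sample from the sink.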

I tested this by calling pyacoustid.fingerprint() on 34 .MP3 files in sequence, and I saw the following difference:

The generated acoustid fingerprints were identical with and without the patch.
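A check of this kind could be sketched roughly as follows. This is not the script used for the measurement above; the helper name `peak_rss_per_item` is hypothetical, and it uses the Unix-only `resource` module from the standard library. Note that `ru_maxrss` is reported in kilobytes on Linux and in bytes on macOS.

```python
import resource


def peak_rss_per_item(process, items):
    """Call process(item) for each item and record the peak RSS after each call.

    Returns one reading per item. Because ru_maxrss is a high-water mark,
    the readings never decrease; readings that keep climbing across many
    similar inputs suggest a leak.
    """
    readings = []
    for item in items:
        process(item)
        usage = resource.getrusage(resource.RUSAGE_SELF)
        readings.append(usage.ru_maxrss)
    return readings
```

Assuming pyacoustid is installed, `process` could be `acoustid.fingerprint_file` and `items` a list of paths to the audio files under test.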

ssssam commented 5 years ago

This should fix https://github.com/beetbox/audioread/issues/62, and possibly https://github.com/beetbox/audioread/issues/40

ssssam commented 5 years ago

In theory we should be able to use Gst.Buffer.extract() to avoid duplicating memory that we can't free. In practice this leads to a crash. I think that there is a bug in the GStreamer gobject-introspection annotations causing that. I don't have enough information to open a bug against GStreamer right now though.