beetbox / audioread

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python
MIT License

audioread race condition freezes when cleaning up #57

Open antlarr opened 7 years ago

antlarr commented 7 years ago

I have a flac file that causes acoustid to freeze when doing:

import acoustid
acoustid.fingerprint_file('01.flac')

If I press Ctrl-C, I get the following backtrace:

Traceback (most recent call last):
  File "/usr/lib64/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/site-packages/audioread/gstdec.py", line 149, in run
    self.loop.run()
  File "/usr/lib64/python3.6/site-packages/gi/overrides/GLib.py", line 588, in run
    raise KeyboardInterrupt

Btw, the file plays perfectly fine with ffplay, mpv, mplayer... Also, I tried reading the audio blocks with simple audioread code like:

import audioread

path = '01.flac'
with audioread.audio_open(path) as f:
    for x in f:
        print(x)

and that seems to work fine.

What makes me think that it's an audioread issue and not an acoustid issue is that in audioread's gstdec.py, if I put return None as the first line of get_loop_thread, it doesn't freeze (works normally).

I've tried to debug the problem a bit and noticed that acoustid has a _fingerprint_file_audioread function defined as:

def _fingerprint_file_audioread(path, maxlength):
    """Fingerprint a file by using audioread and chromaprint."""
    try:
        with audioread.audio_open(path) as f:
            duration = f.duration
            fp = fingerprint(f.samplerate, f.channels, iter(f), maxlength)
    except audioread.DecodeError:
        raise FingerprintGenerationError("audio could not be decoded")
    return duration, fp

If I add time.sleep(0.00000000000001) at the end of the with block (just after the fingerprint call), it sometimes works as expected. If I use time.sleep(0.1) instead, that seems to "fix it" and it doesn't freeze anymore, so it looks like some kind of race condition.

After some more debugging I found out that the exact line where it's freezing is in the call to self.pipeline.set_state(Gst.State.NULL) inside GstAudioFile.close but I'm not sure how to continue from there since I don't have much experience with gstreamer.
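
To see what every thread is doing at the moment of the hang, rather than only the one Ctrl-C interrupts, the standard-library faulthandler module can help. This is a minimal sketch, not part of audioread or pyacoustid: the helper name call_with_stack_dump, the timeout, and the log file name are all made up for illustration.

```python
import faulthandler


def call_with_stack_dump(func, *args, timeout=10.0, log_path="hang-stacks.log"):
    """Call func(*args); if it is still running after `timeout` seconds,
    write the Python stack of every thread to `log_path`.

    Hypothetical helper for diagnosing a hang; it does not fix anything.
    """
    with open(log_path, "w") as log:
        # Arm a watchdog that dumps all thread stacks once the timeout expires.
        faulthandler.dump_traceback_later(timeout, file=log, exit=False)
        try:
            return func(*args)
        finally:
            # Disarm the watchdog if the call returned (or raised) in time.
            faulthandler.cancel_dump_traceback_later()
```

Assuming the freeze reproduces, calling call_with_stack_dump(acoustid.fingerprint_file, '01.flac') should leave a log that shows both the main thread (expected to be inside GstAudioFile.close) and the GLib main loop thread. Only Python frames are shown; anything blocked inside GStreamer's C code still needs gdb or GStreamer's own logs.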

I'm using audioread 2.1.5, gstreamer 1.12.3 and pyacoustid 1.1.5 with chromaprint 1.4.2. I also tried with gstreamer 1.12.2 and audioread 2.1.4.

Btw, another solution I found was to do ffmpeg -i 01.flac output.flac. The resulting file is processed correctly and never freezes.

sampsyo commented 7 years ago

Wow! That’s a pretty crazy problem. Perhaps what’s most maddening is that it only occurs inside of the pyacoustid setup. Has it really been impossible to reproduce while just invoking audioread directly?

GStreamer can be extraordinarily difficult to debug. Usually, however, the right place to start is by putting it into its verbose logging mode, which can be done via environment variables. Those (very long) logs can help pin down the difference between a run that works and one that doesn’t, which is the first step to diagnosing a race condition.
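
As a rough sketch of that logging setup: GStreamer reads GST_DEBUG (verbosity, where 4 is INFO and 5 is DEBUG), GST_DEBUG_FILE (where to write the log) and GST_DEBUG_NO_COLOR from the environment when it initializes, so the simplest approach is to run the failing snippet in a child process with those variables set. The helper names gst_debug_env and run_logged, the default level, and the timeout are assumptions for illustration only.

```python
import os
import subprocess
import sys


def gst_debug_env(level="4", log_path="gst-debug.log", base_env=None):
    """Return a copy of the environment with GStreamer debug logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["GST_DEBUG"] = level            # 4 = INFO, 5 = DEBUG (much longer logs)
    env["GST_DEBUG_FILE"] = log_path    # write the log here instead of stderr
    env["GST_DEBUG_NO_COLOR"] = "1"     # keep the log free of ANSI color codes
    return env


def run_logged(script, log_path, level="4", timeout=60):
    """Run a Python snippet in a child process with GStreamer logging on.

    Returns the CompletedProcess, or None if the child had to be killed
    after `timeout` seconds (for example, because it froze).
    """
    try:
        return subprocess.run(
            [sys.executable, "-c", script],
            env=gst_debug_env(level, log_path),
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
```

For example, run_logged("import acoustid; acoustid.fingerprint_file('01.flac')", "freeze.log") would capture a frozen run, and the same call against a copy of the script with the time.sleep(0.1) workaround would give a working run to compare it with.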