spatialaudio / python-sounddevice

:sound: Play and Record Sound with Python :snake:
https://python-sounddevice.readthedocs.io/
MIT License

What to do when underflows occur #384

Open Martmists-GH opened 2 years ago

Martmists-GH commented 2 years ago

I'm trying to use sounddevice in a graphical application, so I run it in a background thread in blocking mode. However, write() often reports an underflow, and I'm not sure what to do about it. I tried writing extra np.zeros arrays, but that doesn't seem to solve it. After about 5 minutes, the audio is ~5 seconds behind real time.

In the code below, process() is called about once every 21 ms:

import kaudio
import numpy as np

from kaudio_app.nodes.abstract.base_node import BaseNode
from kaudio_app.sounddevice_handler import device_map, open_stream

class AudioOutput(BaseNode):
    NODE_NAME = "Audio Output"

    def __init__(self):
        self.device = ""

        super().__init__()
        self.stream = None
        self.add_combo_menu("device", "Output device", list(device_map().keys()))

    def __del__(self):
        if self.stream is not None:
            self.stream.stop()
            self.stream.close()
            self.stream = None

    def get_new_node(self, stereo: bool) -> kaudio.BaseNode:
        if stereo != self.stereo and self.stream is not None:
            stream = self.stream
            self.stream = None

            stream.stop()
            stream.close()
            stream = open_stream(
                device_map()[self.device],
                stereo,
                False
            )
            stream.start()
            self.stream = stream

        return kaudio.OutputNode(stereo)

    def set_property(self, name, value):
        if name == "device":
            self.device = value
            stream_idx = device_map()[value]

            if self.stream is not None:
                stream = self.stream
                self.stream = None
                stream.stop()
                stream.close()

            stream = open_stream(
                stream_idx,
                self.stereo,
                False
            )
            stream.start()
            self.stream = stream
        else:
            super().set_property(name, value)

    def process(self):
        self.node.process()

        if self.stream is None:
            return

        if self.stereo:
            arr = np.zeros((1024, 2), dtype=np.float32)
            arr[:, 0] = self.node.buffer_left
            arr[:, 1] = self.node.buffer_right
        else:
            arr = np.zeros((1024, 1), dtype=np.float32)
            arr[:, 0] = self.node.buffer
        underflowed = self.stream.write(arr)
        if underflowed:
            print("Underflowed")
            # underflowed = self.stream.write(np.zeros((1024, 2 if self.stereo else 1), dtype=np.float32))
            # if underflowed:
            #     print("Underflowed x2")

    def process_empty(self):
        if self.stream is None:
            return

        underflowed = self.stream.write(np.zeros((1024, 2 if self.stereo else 1), dtype=np.float32))
        if underflowed:
            print("Underflowed")
            # underflowed = self.stream.write(np.zeros((1024, 2 if self.stereo else 1), dtype=np.float32))
            # if underflowed:
            #     print("Underflowed x2")

where sounddevice_handler.py contains the following:

import sounddevice as sd

LATENCY = 0.1

def device_map():
    return {it['name']: j for j, it in enumerate(sd.query_devices())
            if it['max_input_channels'] >= 2 and it['max_output_channels'] >= 2}

def open_stream(index: int, stereo: bool, is_input: bool):
    if is_input:
        return sd.InputStream(device=index,
                              channels=2 if stereo else 1,
                              latency=LATENCY,
                              samplerate=48000,
                              blocksize=1024,
                              dtype='float32')
    else:
        return sd.OutputStream(device=index,
                               channels=2 if stereo else 1,
                               latency=LATENCY,
                               samplerate=48000,
                               blocksize=1024,
                               dtype='float32')

Martmists-GH commented 2 years ago

Even after just 3 underflows, the delay is starting to get very noticeable to the point where it's pretty much unusable. Other times it underflows a couple dozen times per second and is delayed 3 seconds after 10 seconds have passed.
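
For scale, here is a rough back-of-the-envelope sketch based on the stream settings from the first post (blocksize=1024, samplerate=48000). The function names are made up for illustration, and the comparison with the reported 5 seconds is only meant to give a sense of proportion, not to explain where the delay comes from.

```python
BLOCKSIZE = 1024      # frames per write(), from open_stream() in the first post
SAMPLERATE = 48000    # Hz, from open_stream() in the first post


def block_period(blocksize=BLOCKSIZE, samplerate=SAMPLERATE):
    """Seconds of audio contained in one block."""
    return blocksize / samplerate


def blocks_for_delay(delay_seconds, blocksize=BLOCKSIZE, samplerate=SAMPLERATE):
    """Number of block periods (rounded) that add up to the given delay."""
    return round(delay_seconds / block_period(blocksize, samplerate))


print(f"block period: {block_period() * 1000:.2f} ms")
print(f"5 s of delay is about {blocks_for_delay(5.0)} block periods")
```

One block lasts about 21.33 ms, so a caller that delivers blocks even slightly slower than that on average cannot keep the output fed.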

mgeier commented 2 years ago

I've never really used "blocking mode" and I honestly don't really understand how to use it except in the most trivial situations.

Regarding your problem, I guess it all boils down to how the process() method is called. As far as I understand (and I don't really understand it), the blocking functions are supposed to be called back-to-back, without an artificial break between them. They are "blocking" after all, so you don't need to wait between calls; they wait on their own!
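
A minimal sketch of what "back-to-back" could look like in a background thread. It assumes the stream object has a blocking write(data) method that returns an underflow flag, as sd.OutputStream.write() does. The queue hand-off, the writer_loop name, and the None stop marker are hypothetical choices for illustration, not part of sounddevice or of the code in the first post.

```python
import queue


def writer_loop(stream, blocks):
    """Write blocks to `stream` until a None marker arrives on `blocks`.

    stream -- object with a blocking write(data) that returns True on underflow
    blocks -- queue.Queue filled by the producer (hypothetical hand-off)
    Returns the number of underflows reported by write().
    """
    underflows = 0
    while True:
        data = blocks.get()      # waits only until the producer has a block
        if data is None:         # hypothetical stop marker
            return underflows
        # No sleep or timer here: write() blocks until the device has
        # room for the data, and that is what paces the loop.
        if stream.write(data):
            underflows += 1
```

With a real stream this could run as threading.Thread(target=writer_loop, args=(stream, blocks)) after stream.start(), while the processing side puts (1024, channels) float32 arrays on the queue as fast as it can produce them.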

I normally prefer using "callback mode", where the callback function is automatically called at the proper times.
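
For comparison, a hedged sketch of callback mode. The callback signature (outdata, frames, time, status) is the one sounddevice uses for OutputStream, and outdata is the NumPy array that sounddevice passes in. The queue hand-off and the make_callback name are assumptions for illustration only.

```python
import queue


def make_callback(blocks):
    """Return an output callback that plays blocks taken from `blocks`."""
    def callback(outdata, frames, time, status):
        if status:
            print(status)  # reports conditions such as output underflow
        try:
            block = blocks.get_nowait()  # never wait inside the callback
        except queue.Empty:
            outdata.fill(0)  # nothing ready: play silence for this block
            return
        outdata[:] = block  # block must match the shape (frames, channels)
    return callback


# Wiring it up (assumes sounddevice is imported as sd; not run here):
# blocks = queue.Queue(maxsize=8)
# stream = sd.OutputStream(samplerate=48000, blocksize=1024, channels=2,
#                          dtype='float32', callback=make_callback(blocks))
# stream.start()
```

The processing thread only has to keep the queue from running empty; PortAudio decides when the callback runs.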

Martmists-GH commented 2 years ago

Sadly, callback mode is not possible here because of the GIL and the thread synchronization it would require. Even so, that's not the problem: it's a PortAudio bug with opening the same device twice.

leimao commented 2 years ago

I ran into an output underflow once. Raising latency to a higher value (>0.1) made it go away.
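
A small sketch of how a higher latency value could be chosen. It assumes device_info is the dict returned by sd.query_devices(index), which includes a default_high_output_latency entry in seconds. pick_output_latency is a made-up helper, not a sounddevice function.

```python
def pick_output_latency(device_info, minimum=0.1):
    """Return the larger of `minimum` and the device's suggested high latency."""
    suggested = device_info.get("default_high_output_latency", 0.0)
    return max(minimum, suggested)


# Possible use (assumes sounddevice is imported as sd; not run here):
# info = sd.query_devices(index)
# stream = sd.OutputStream(device=index, channels=2, samplerate=48000,
#                          blocksize=1024, dtype='float32',
#                          latency=pick_output_latency(info, minimum=0.2))
```

sounddevice also accepts latency='high', which asks for the device's default high latency without any manual lookup.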