spatialaudio / python-sounddevice

:sound: Play and Record Sound with Python :snake:
https://python-sounddevice.readthedocs.io/
MIT License

Advice on asynchronous playback #525

Open owenhdiba opened 3 months ago

owenhdiba commented 3 months ago

I'm using your module to provide "audio feedback" for a digital sensor. The idea is I have a sensor with one-dimensional time-series data being read in at 40 Hz. I have a target value that I want the sensor to read. If the sensor is close to that target then a pure sine wave is played; if it's not, the pure tone is superposed with white noise of an amplitude proportional to the error. The audio playback and the sensor reading are done asynchronously. I used your example of playing a sine wave and the asynchronous examples. What I've got actually works, but I don't fully understand the API and I'm certain I'm doing some very ugly stuff. Would just like a nudge in the right direction if that's ok! I have a minimal working example below. I should say that the actual callback function I'm using is a fair bit more complicated, but this gives the gist.

import asyncio
import sys  # needed for sys.stderr in the callback
import sounddevice as sd
from timeit import default_timer as timer
import numpy as np

#white noise
def white(N):
    return np.random.randn(N)

start_idx = 0
async def play_audio(sensor):
    loop = asyncio.get_event_loop()
    event = asyncio.Event()
    samplerate = sd.query_devices(1, 'output')['default_samplerate']
    freq = 500.
    def callback(outdata, frames, time, status):
        if status:
            print(status, file=sys.stderr)
        global start_idx
        y = sensor.data[-1]
        t = (start_idx + np.arange(frames)) / samplerate
        t = t.reshape(-1, 1)
        sine_wave = 0.5 * np.sin(2 * np.pi * freq * t)
        scale_factor = abs(sensor.target - y) / sensor.target

        # if latest data is within threshold of target play pure tone
        if y > sensor.target - sensor.threshold and y < sensor.target + sensor.threshold:
            outdata[:] = sine_wave
        # else play noisy pure tone with amplitude 
        # proportional to error 
        else: 
            noisy_wave = scale_factor * white(len(t)).reshape(-1,1)
            noisy_wave += sine_wave
            outdata[:] = noisy_wave[:frames]
        start_idx += frames

    stream = sd.OutputStream(device=1, channels=1, callback=callback,
                         samplerate=samplerate)
    with stream:
        await event.wait()

class Sensor(): 
    def __init__(self):
        self.start_time = timer()
        self.target = 5.
        self.threshold = 0.2
        self.sample_frequency = 40.
        self.input_period = 10
        self.data = [0]

    async def read(self):
        while True:
            time = timer()-self.start_time
            self.data.append(self.input(time))
            await asyncio.sleep(1./self.sample_frequency)

    # mimics the input that the sensor would read with 
    # shifted sine wave of period `self.input_period`
    def input(self, time):
        arg = 2 * np.pi * time / self.input_period
        return self.target + np.sin(arg)

sensor = Sensor()

# read sensor data and play audio feedback
async def main():
        try:
            async with asyncio.timeout(10):
                await asyncio.gather(sensor.read(), play_audio(sensor))
        except TimeoutError:
            print("Done")

if __name__ == "__main__":
    asyncio.run(main())

mgeier commented 3 months ago

For reference, this is the same question on SO: https://stackoverflow.com/q/78146705/

There is some talk about threading vs. asyncio ... I would like to point out that when you are using the "callback" mode of PortAudio (which is the underlying library behind the sounddevice module), the callback function will automatically be called in a separate thread by PortAudio.

You could fold all the sensor-handling code into the with stream block, because the callback keeps running in its own thread concurrently anyway.

But you can also keep the additional level of indirection with await if you prefer it ... I don't think it's bad, I just think it's unnecessarily convoluted.

But the real problem is at a different place: you are appending to self.data forever, so this will eventually fill up all your memory. Worse, you are accessing self.data from a different thread, without any synchronization. It seems to me that this would be the perfect situation to use a queue, which would solve both problems.

And, as mentioned in the SO comments, you are never setting the event. Related to that, you are also never using the loop variable.

BTW, in case you are not aware, what you are doing is called "sonification", you might find some interesting things when using this as a search term.

owenhdiba commented 3 months ago

Thank you for your helpful comments. I just merged some of your examples together without understanding the purpose of some of the lines of code, which is why the event and loop aren't used!

Could you explain how the sensor-handling code could be moved to the with stream block? I thought the output buffer is supposed to be filled with audio data from within the callback?

I'll just expand quickly on the SO comments. I'm using asyncio because I connect to the sensor using Bleak (a Bluetooth client library built on asyncio) and I'm also using a Bokeh server to display the data in real time.

How would you suggest using a queue in this situation? I guess I'd need to copy the sensor data into two queues for both consumer processes? Forgive me, I'm new to concurrency concepts.

mgeier commented 3 months ago

> Could you explain how the sensor-handling code could be moved to the with stream block?

This is totally untested, but I thought about something like this:

    with stream:
        async with asyncio.timeout(10):
            await sensor.read()

> I thought the output buffer is supposed to be filled with audio data from within the callback?

Yes, definitely.

> How would you suggest using a queue in this situation?

Write to the queue after reading from the sensor, read from the queue in the audio callback.

Basically what you are already doing with sensor.data, just thread-safe and without filling up all memory.

> I guess I'd need to copy the sensor data into two queues for both consumer processes? Forgive me, I'm new to concurrency concepts.

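No problem. "Process" is probably the wrong word here: a process is a separately running program with its own memory, whereas the audio callback runs in a thread inside your one Python process.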

It depends if the Bokeh thing runs in a separate thread. If yes, you should probably use another queue, but if not, you probably don't need one.

owenhdiba commented 3 months ago

Okay, thanks again for your help. Sorry, just a couple more questions, and if I'm still without a clue I'll either stick to my original dodgy method or go and do some studying on concurrency in Python.

So you said the callback is done in another thread, so that means I can't use asyncio.Queue? Will there not be a problem with using a normal queue if the data is put into the queue from async code?

Would it be as simple as amending the callback so that it does data_queue.get_nowait() instead of sensor.data[-1] for example?

I've got one added complication. I have an external timer with four different states which it cycles through periodically (it spends a different amount of time in each state and the whole cycle takes maybe 30 seconds). How the sound is generated depends on the state of this timer object. At the moment, I am just checking the state of the timer with a series of if-statements in the callback. Would it be better to set an asyncio event for a change in the timer state, and is it feasible to be also using queues to bring in the sensor data?

mgeier commented 3 months ago

> So you said the callback is done in another thread, so that means I can't use asyncio.Queue? Will there not be a problem with using a normal queue if the data is put into the queue from async code?

You have to use the appropriate queue depending on the situation.

See examples/asyncio_generators.py for an example which uses both types of queues, hopefully correctly.

Writing from an async function into a non-async queue should be fine, as long as it's a non-blocking write. The other direction is a bit more complicated, but that's shown in the example.
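A minimal sketch of the "async function writes into a non-async queue" direction, using only the standard library. All names here (`audio_queue`, `push_latest`, `read_sensor`) and the queue size of 8 are assumptions for illustration, and the `readings` iterable merely stands in for whatever the real sensor delivers. It assumes a single producer.

```python
import asyncio
import queue

# Bounded thread-safe queue shared with the audio callback (size is arbitrary).
audio_queue = queue.Queue(maxsize=8)


def push_latest(q, value):
    """Non-blocking put; if the queue is full, discard the oldest entry first."""
    try:
        q.put_nowait(value)
    except queue.Full:
        try:
            q.get_nowait()  # drop the oldest value to make room
        except queue.Empty:
            pass  # the consumer emptied the queue in the meantime
        # With a single producer there is now guaranteed to be room.
        q.put_nowait(value)


async def read_sensor(q, readings, period=0.0):
    """Stand-in for the sensor coroutine: never blocks the event loop on the queue."""
    for value in readings:
        push_latest(q, value)
        await asyncio.sleep(period)


asyncio.run(read_sensor(audio_queue, range(20)))
```

Because the write never blocks, the event loop stays responsive even if the audio side stops consuming; the cost is that old readings are silently dropped, which seems acceptable when only the most recent value matters.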

> Would it be as simple as amending the callback so that it does data_queue.get_nowait() instead of sensor.data[-1] for example?

Something like that, but there might be a few more things that you'll have to adapt.

> I've got one added complication. I have an external timer with four different states which it cycles through periodically (it spends a different amount of time in each state and the whole cycle takes maybe 30 seconds). How the sound is generated depends on the state of this timer object. At the moment, I am just checking the state of the timer with a series of if-statements in the callback. Would it be better to set an asyncio event for a change in the timer state, and is it feasible to be also using queues to bring in the sensor data?

It depends on how exactly the timer is written and read. More specifically, whether it's thread-safe. If the timer is only an integer, it's probably fine to write and read it without additional synchronization. If the timer has a thread-safe function to get the current state, that will also be fine.

An asyncio.Event will probably not work, since it's not thread-safe (see https://docs.python.org/3/library/asyncio-sync.html#asyncio.Event). You would also have to clear() it after every state change before it could signal the next one.

I guess using a queue would be possible. You can think about it as a "command queue": the timer writes commands into the queue, and the audio callback drains the queue and handles the new command(s) (if any) appropriately.
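The "command queue" idea could be sketched roughly like this. The state names, `on_timer_change` and `drain_commands` are made up for illustration (the thread only says the timer has four states), and the real timer may report its changes differently.

```python
import queue
from enum import Enum


class TimerState(Enum):
    # Hypothetical names for the four timer states.
    PHASE_1 = 1
    PHASE_2 = 2
    PHASE_3 = 3
    PHASE_4 = 4


command_queue = queue.Queue()  # unbounded: state changes are rare
current_state = TimerState.PHASE_1  # only ever modified on the audio side


def on_timer_change(new_state):
    """Timer side: announce a state change without blocking."""
    command_queue.put_nowait(new_state)


def drain_commands(q):
    """Audio side: return all pending commands in order, without blocking."""
    commands = []
    while True:
        try:
            commands.append(q.get_nowait())
        except queue.Empty:
            return commands


def apply_commands():
    """Call at the top of the audio callback, before generating the block."""
    global current_state
    for command in drain_commands(command_queue):
        current_state = command  # per-command handling would go here
    return current_state
```

Handling each command in order (rather than only the newest) leaves room for per-transition work such as resetting an envelope, if that turns out to be needed.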

owenhdiba commented 3 months ago

Thanks for these suggestions. I managed to get both of the features working with queues. I do have a remaining question but it's quite specific.

What's the best approach to minimising the lag between the sensor and the audio?

Let's say the sensor reads in data roughly every T units of time, with some small bounded error on this. If I set the blocksize* in the output stream to be slightly larger than T and use a blocking get method on the queue in the callback, then I'll guarantee that the audio is no more than one block behind the sensor, and there'll never be a problem with there not being data to play. However, the lag is going to grow linearly with the amount of time that I have the stream open.

An alternative is to set the blocksize smaller than T, use get_nowait() method on the queue in the callback and save the current sensor value, so that if there is no data in the queue, it can just use the last sensor value to generate data. It then will take very little time to execute the callback. What's to stop the callback being repeatedly called in-between sensor updates and a whole load of audio data being put into the stream and therefore the lag increasingly growing?

Is all of this determined by the latency argument? Would it be best to maybe call get with a small timeout?

*I realise you recommend not to set the blocksize in the documentation

mgeier commented 3 months ago

You will never get the sensor and the sound card synchronized (unless you have some kind of hardware-level synchronization like word clock, which I assume you don't), so you should implement your audio callback in a way that it can handle more than one incoming value just as well as zero incoming values. Then you can experiment with different block sizes and hear what sounds best.

BTW, you might want to do some parameter interpolation, otherwise the changes of sensor values might sound choppy (sometimes called "zipper noise").
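One simple form of parameter interpolation is a linear ramp of the noise gain across each block, from the gain used at the end of the previous block to the newly computed one. This is only a sketch under that assumption; `gain_ramp` and `noisy_block` are hypothetical helpers, and `sine_wave` is assumed to have shape `(frames, 1)` as in the original callback.

```python
import numpy as np


def gain_ramp(frames, prev_gain, new_gain):
    """Per-sample gain going linearly from prev_gain towards new_gain.

    endpoint=False means the next block can start exactly at new_gain
    without repeating a sample.
    """
    return np.linspace(prev_gain, new_gain, frames, endpoint=False)


def noisy_block(sine_wave, prev_gain, new_gain, rng):
    """Sine wave plus white noise whose amplitude is ramped over the block."""
    frames = len(sine_wave)
    noise = rng.standard_normal(frames)
    ramp = gain_ramp(frames, prev_gain, new_gain)
    return sine_wave + (ramp * noise).reshape(-1, 1)


rng = np.random.default_rng()
block = noisy_block(np.zeros((512, 1)), prev_gain=0.1, new_gain=0.3, rng=rng)
```

In the callback, `prev_gain` would be whatever `new_gain` was on the previous call, so the gain is continuous across block boundaries.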

> What's to stop the callback being repeatedly called in-between sensor updates and a whole load of audio data being put into the stream

You should be prepared for that situation.

> and therefore the lag increasingly growing?

If you always drain the queue, the average lag will not grow meaningfully. There will be some jitter though. That's natural in unsynchronized block-based processing.
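To make this concrete, here is a small simulation in virtual time (no audio involved). The periods are assumptions: a 40 Hz sensor and a block length of roughly 512 frames at 44.1 kHz. It only models the queueing lag, not the output latency of the sound card.

```python
from collections import deque

SENSOR_PERIOD_US = 25_000  # 40 Hz sensor
BLOCK_PERIOD_US = 11_610   # roughly 512 frames at 44.1 kHz (assumed)


def simulate(duration_us):
    """Return, for every callback, the age of the sensor value it ends up using."""
    pending = deque()      # stands in for the thread-safe queue
    next_reading_us = 0    # timestamp of the next sensor reading
    latest_stamp_us = None
    lags = []
    for now_us in range(0, duration_us, BLOCK_PERIOD_US):
        # Sensor side: enqueue every reading that has happened by now.
        while next_reading_us <= now_us:
            pending.append(next_reading_us)
            next_reading_us += SENSOR_PERIOD_US
        # Callback side: drain the queue completely, keep only the newest value.
        while pending:
            latest_stamp_us = pending.popleft()
        lags.append(now_us - latest_stamp_us)
    return lags


lags = simulate(30_000_000)  # 30 seconds of virtual time
print("max lag in first second (us):", max(lags[:86]))
print("max lag in last second (us): ", max(lags[-86:]))
```

The age of the value in use jitters from block to block but always stays below one sensor period, however long the simulation runs.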

> Would it be best to maybe call get with a small timeout?

No. You shouldn't block the audio callback. If there is no new value available, you have to come up with something. Most probably just use the previous value. Or do some fancy extrapolation.
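A sketch of the "use the previous value" fallback. `LatestValue` is a hypothetical helper, not part of sounddevice; the audio callback would call `update()` once per block and never wait on the queue.

```python
import queue


class LatestValue:
    """Remembers the most recent value taken from a queue.Queue."""

    def __init__(self, q, initial):
        self._q = q
        self.value = initial

    def update(self):
        """Drain the queue without blocking.

        Works the same for zero, one, or many pending values: if nothing
        new has arrived, the previously stored value is returned.
        """
        while True:
            try:
                self.value = self._q.get_nowait()
            except queue.Empty:
                return self.value


sensor_queue = queue.Queue()
latest = LatestValue(sensor_queue, initial=0.0)

# Inside the audio callback one would then write something like:
#     y = latest.update()
```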

> *I realise you recommend not to set the blocksize in the documentation

I guess you mean the note in the Stream docs: https://python-sounddevice.readthedocs.io/en/0.4.6/api/streams.html#sounddevice.Stream

I have just taken that nearly verbatim from the PortAudio docs: https://www.portaudio.com/docs/v19-doxydocs/portaudio_8h.html#a443ad16338191af364e3be988014cbbe

I'm (and the PortAudio docs are) not saying to never set the blocksize, but just to set it when there is a good reason to do that. In your case, setting a blocksize might actually make sense, because that will give you a more predictable update behavior.
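If a fixed blocksize is wanted, the number of frames follows from the sample rate and the desired callback rate. A small sketch; `blocksize_for` is a hypothetical helper, and calling the callback 80 times per second (twice per 40 Hz sensor reading) is just an example choice to experiment with.

```python
def blocksize_for(samplerate, callbacks_per_second):
    """Frames per block so that the callback runs about that often."""
    return max(1, round(samplerate / callbacks_per_second))


blocksize = blocksize_for(44100, 80)
block_duration_ms = 1000 * blocksize / 44100

# The result would then be passed to the stream, e.g.:
#     sd.OutputStream(device=1, channels=1, callback=callback,
#                     samplerate=samplerate, blocksize=blocksize)
```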