spatialaudio / python-sounddevice

:sound: Play and Record Sound with Python :snake:
https://python-sounddevice.readthedocs.io/
MIT License

Asynchronous playback in a loop #213

Open tommy-fox opened 4 years ago

tommy-fox commented 4 years ago

Hello! Firstly, thanks for all your work and examples.

I'm working on a project that takes audio from a file, segments it into overlapping blocks, feeds those blocks through source-separation processing, and finally sends them to loudspeakers. I'm using the trio library for asynchronous IO and two queues: one for feeding audio into the source separation and one for taking audio from the source separation to playback.

Right now I am looping over the queue that provides audio blocks from the source separation and calling sd.play() with those blocks. This produces a bit of latency at each call to sd.play(). I've seen that you suggest creating a custom callback with a stream object as the proper way to handle this, but I'm having trouble getting it to work in my setup. I'll include some code below that hopefully gives you a better idea of what is happening.

The second-to-last code block is where I call sd.play() and is the source of my troubles.

Do you have any suggestions as to how I should implement the callback function and start the stream?

Thank you in advance!

import math

import sounddevice as sd
import torch
import trio


# Read overlapping blocks from the file
async def read_blocks_from_file(blocks):
    async with queue_from_input:
        for x in blocks:
            # Convert array to tensor and force audio to false stereo 
            tensor_list = [torch.from_numpy(x[:,0]), torch.from_numpy(x[:,0])]
            audio_buffer = torch.stack(tensor_list)

            # Enqueue audio block
            await queue_from_input.send(audio_buffer)

# Read audio from the input queue and put it in the separated-audio queue
async def unmix_separate_streamer(unmix_model):
    async with queue_from_separate:
        async with queue_to_separate:

            # Initialize LSTM hidden state and cell state
            h_t_minus1 = None
            c_t_minus1 = None

            async for audio_buffer in queue_to_separate:
                # Separate the input buffer
                estimate_buffer, h_t_minus1, c_t_minus1 = await test_stream.separate(
                    audio=audio_buffer.T,
                    softmask=True,
                    alpha=1.0,
                    targets=['vocals'],
                    residual_model=False,
                    niter=1,
                    device='cpu',
                    model_type='uni',
                    unmix_target=unmix_model,
                    h_t_minus1=h_t_minus1,
                    c_t_minus1=c_t_minus1)
                # Send separated audio to the output queue
                await queue_from_separate.send(estimate_buffer['vocals'][FRAME_LENGTH - HOP_LENGTH:, :])

# Read from the separated-audio queue and send it to the loudspeakers
async def write_audio_from_file(rate):
    async with queue_to_output:
        # Read all audio in the buffer
        async for out_block in queue_to_output:
            await sd.play(out_block, samplerate=rate)
            await trio.sleep(len(out_block) / rate)

async def main():
    # Call the read-audio, separate, and write-audio functions asynchronously
    async with trio.open_nursery() as nursery:

        # Queue to read audio from file and send to separation
        global queue_from_input, queue_to_separate 
        queue_from_input, queue_to_separate = trio.open_memory_channel(math.inf)

        # Queue to read audio from separation and send to output
        global queue_from_separate, queue_to_output
        queue_from_separate, queue_to_output = trio.open_memory_channel(math.inf)

        # Read audio from file and send to separator
        nursery.start_soon(read_blocks_from_file, blocks)

        # Separate audio and send to output queue
        nursery.start_soon(unmix_separate_streamer, unmix_model)

        # Read separated audio from queue and send to speakers 
        nursery.start_soon(write_audio_from_file, RATE)

trio.run(main)
mgeier commented 4 years ago

> I've seen that you suggest creating a custom callback with a stream object as the proper way to handle this

Exactly. Don't use sd.play() for this.

Also, sd.play() is not awaitable.

Did you have a look at the two asyncio examples? Do they help?

asyncio_generators.py shows that you need a thread-safe queue (e.g. Python's queue.Queue) to get data in the audio callback. An async queue won't work. I don't know what a "trio memory channel" is, but probably it's not thread-safe? And does math.inf specify the channel's capacity? That's probably not a good idea ... but let's worry about that later ...
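
To make that concrete, here is a minimal sketch (not taken from the linked examples; block size, sample rate, channel count and queue capacity are made-up values) of an output callback that pulls fixed-size float32 blocks from a queue.Queue:

import queue

import sounddevice as sd

BLOCKSIZE = 2048                       # frames per block (assumed)
playback_q = queue.Queue(maxsize=20)   # thread-safe, finite capacity

def callback(outdata, frames, time, status):
    if status:
        print(status)
    try:
        # Never block inside the audio callback
        data = playback_q.get_nowait()
    except queue.Empty:
        outdata.fill(0)                # underrun: play silence
        return
    outdata[:] = data                  # assumes data has shape (BLOCKSIZE, 2), dtype float32

stream = sd.OutputStream(samplerate=44100, channels=2,
                         blocksize=BLOCKSIZE, callback=callback)
# stream.start() once the queue has been pre-filled with a few blocks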

Anyway, then you need to write to this thread-safe queue from your coroutine. When the queue is full (because I guess it should have a finite capacity), you'll have to wait a bit before trying again. Normally that's easier when using threads, because then you can simply block on the queue, which won't work in an async context.
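
Continuing the sketch above (reusing playback_q and the queue module, with trio already imported in your code), the trio side could push blocks into the thread-safe queue and back off briefly whenever it is full:

async def feed_playback_queue(queue_to_output):
    async for out_block in queue_to_output:
        # Blocks are assumed to already have shape (BLOCKSIZE, 2)
        while True:
            try:
                playback_q.put_nowait(out_block.astype('float32'))
                break
            except queue.Full:
                await trio.sleep(0.01)   # give the audio callback time to drain the queue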

If you read overlapping blocks, you will also have to overlap the resulting blocks on playback, right?
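
For reference, a naive overlap-add of equally sized blocks (plain NumPy, no windowing; whether you actually need this depends on how your separation handles the overlap) could look roughly like this:

import numpy as np

def overlap_add(blocks, hop):
    """Overlap-add equally sized blocks spaced `hop` frames apart."""
    blocks = [np.asarray(b) for b in blocks]
    frame = len(blocks[0])
    out = np.zeros((hop * (len(blocks) - 1) + frame,) + blocks[0].shape[1:])
    for i, b in enumerate(blocks):
        out[i * hop:i * hop + frame] += b
    return out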

Anyway, I strongly suggest forgetting all the processing for the moment and just implementing block-wise reading from a file and playback of those blocks, as sketched below. Once that works, you can add all the processing you like.
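
Such a stripped-down version could be as simple as blocking writes to an output stream (file name and block size are placeholders, and soundfile is assumed for reading):

import soundfile as sf
import sounddevice as sd

BLOCKSIZE = 2048
with sf.SoundFile('input.wav') as f:
    with sd.OutputStream(samplerate=f.samplerate, channels=f.channels) as stream:
        for block in f.blocks(blocksize=BLOCKSIZE, always_2d=True, dtype='float32'):
            stream.write(block)   # blocking write paces itself to the sound device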

BTW, is using async an external requirement? Things might be easier if you just use threads like we did in the good old days.
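
Just to illustrate that last point: in a plain thread, the whole back-pressure logic collapses to a single blocking put() (compute_blocks() is a stand-in for whatever produces the output blocks):

import queue
import threading

q = queue.Queue(maxsize=20)

def producer():
    for block in compute_blocks():   # placeholder for the actual processing
        q.put(block)                 # simply blocks while the queue is full
    q.put(None)                      # sentinel: tell the consumer there is no more data

threading.Thread(target=producer, daemon=True).start()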

mgeier commented 4 years ago

@tommy-fox Any news? Have you been successful?

ChrisGy commented 11 months ago

Hi @tommy-fox, I currently have the exact same issue/requirement. Did you find a workaround you could share?