spatialaudio / python-sounddevice

:sound: Play and Record Sound with Python :snake:
https://python-sounddevice.readthedocs.io/
MIT License

Is it possible or recommended to use a sounddevice stream via fastapi? #569

Open asusdisciple opened 4 days ago

asusdisciple commented 4 days ago

I want to send an audio stream to my FastAPI server, which processes it and sends it back to the client. It's pretty straightforward to build a streaming endpoint in FastAPI with:

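from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

@app.websocket("/audio-stream/")
async def audio_stream_endpoint(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            # Receive raw audio data from the client
            audio_data = await websocket.receive_bytes()

            # Process audio data (streaming_service is my own object, not shown here;
            # passing audio_data to it is assumed)
            pa = streaming_service.process_stream(audio_data)

            # Send processed audio data back to the client
            await websocket.send_bytes(pa)
    except WebSocketDisconnect as e:
        # The client has already closed the connection, so there is nothing left to close
        print("Connection closed:", e)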

However, I wondered whether it makes sense to use sounddevice streams in this scenario, to take advantage of the optimizations that are presumably there compared to my naive implementation. The question is how to implement it: use sounddevice on the client side, send the stream through FastAPI as bytes, then take the audio_data from my endpoint and convert it back into a sounddevice stream for processing?

mgeier commented 4 days ago

If you want to send uncompressed PCM data, this should be fairly simple.
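To illustrate why uncompressed PCM is the simple case, here is a minimal sketch of what the server could do with one received chunk. It assumes 16-bit signed samples in native byte order, and `halve_volume` is only a hypothetical stand-in for the real processing step; nothing here is part of sounddevice or FastAPI.

```python
from array import array


def halve_volume(pcm_bytes):
    """Decode a chunk of int16 PCM, halve every sample, re-encode it."""
    samples = array("h")          # "h" = signed 16-bit integers (assumed format)
    samples.frombytes(pcm_bytes)  # raw bytes -> sample values
    quieter = array("h", (s // 2 for s in samples))
    return quieter.tobytes()      # sample values -> raw bytes
```

The bytes returned by such a function could be passed directly to `websocket.send_bytes()`; no sounddevice stream is needed on the server for this.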

I would create an audio callback function that writes into a queue (see examples) and from a different thread (e.g. the main thread) repeatedly check for data in the queue and send it to the server via WebSocket (no example yet, see #415).
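A rough sketch of that callback-plus-queue approach could look like the following. It is not an official example: the third-party `websockets` package (its synchronous client), the URL, the sample rate and the function name `stream_microphone` are all assumptions chosen for illustration.

```python
import queue
import sys

audio_queue = queue.Queue()


def callback(indata, frames, time, status):
    """Runs in the audio thread: copy each recorded block into the queue."""
    if status:
        print(status, file=sys.stderr)
    # With a raw stream, indata is a buffer object; bytes() makes a copy
    audio_queue.put(bytes(indata))


def stream_microphone(url="ws://localhost:8000/audio-stream/"):
    """Runs in the main thread: forward queued blocks to the server."""
    # Imported here so the callback above can be tried without audio hardware
    import sounddevice as sd
    from websockets.sync.client import connect  # assumed client library

    with connect(url) as ws:
        with sd.RawInputStream(samplerate=16000, channels=1,
                               dtype="int16", callback=callback):
            while True:
                # Blocks until the callback has delivered the next block
                ws.send(audio_queue.get())
```

Calling `stream_microphone()` would record from the default input device until interrupted. The callback does nothing but copy data into the queue, so all network I/O stays out of the audio thread.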

If you want to play back the manipulated signal that comes back from the server, I guess you'll need quite a long queue for buffering (depending on the network latency).
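The buffering idea could be sketched roughly as follows, loosely modelled on the raw-stream playback pattern. The block size, the pre-fill length, the function names and the `receive_block` callable (anything that returns the next block of bytes from the server, such as a WebSocket client's receive method) are assumptions, and the sketch further assumes the server returns blocks of exactly `BLOCKSIZE` frames.

```python
import queue

BLOCKSIZE = 1024     # frames per block (assumed)
PREFILL_BLOCKS = 50  # assumed; about 3.2 s at 16 kHz, tune to the network latency

playback_queue = queue.Queue()


def playback_callback(outdata, frames, time, status):
    """Runs in the audio thread: play the next queued block, or silence."""
    try:
        data = playback_queue.get_nowait()
    except queue.Empty:
        data = b""  # the network fell behind: fill the block with silence
    n = len(outdata)
    # Truncate or zero-pad so the block always has exactly the right size
    outdata[:] = data[:n].ljust(n, b"\x00")


def play(receive_block):
    """Pre-fill the queue, then keep feeding it while the stream plays."""
    import sounddevice as sd

    for _ in range(PREFILL_BLOCKS):
        playback_queue.put(receive_block())
    with sd.RawOutputStream(samplerate=16000, blocksize=BLOCKSIZE, channels=1,
                            dtype="int16", callback=playback_callback):
        while True:
            playback_queue.put(receive_block())
```

A longer pre-fill tolerates more network jitter but adds the same amount of delay before playback starts.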

If you want to encode/decode the signal before/after sending it over the network, this gets a bit more complicated (and will need additional libraries), but it should be possible, too.