KoljaB / RealtimeTTS

Converts text to speech in realtime

Is there any way to return iterator to feed to fastapi StreamingResponse? #39

Open Satyapriya707 opened 4 months ago

Satyapriya707 commented 4 months ago

Hi,

Is there any way to get an iterator over the audio chunks, so that I can do something like this:

chunks = stream.

def gen():
    for chunk in chunks:
        yield chunk

Then, in FastAPI, return StreamingResponse(gen(), media_type="audio/wav").

Thanks

TerryWong1024 commented 3 months ago

@Satyapriya707 Did you resolve it? I encountered the same problem; please share the solution if you did.

KoljaB commented 3 months ago

Some - very raw - demo code to get you started (I know I should publish a full fastapi realtime TTS server):

from RealtimeTTS import TextToAudioStream, CoquiEngine
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from queue import Queue
import threading
import uvicorn

app = FastAPI()
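# Carries synthesized audio chunks from the TTS thread to the HTTP response.
audio_queue = Queue()

def on_audio_chunk(chunk):
    # Called for each synthesized audio chunk.
    audio_queue.put(chunk)

def play_text_to_speech(stream, text):
    stream.feed(text)
    # muted=True: synthesize without playing on the server's own speakers.
    stream.play(on_audio_chunk=on_audio_chunk, muted=True)
    # Sentinel: tells the generator that synthesis has finished.
    audio_queue.put(None)

def audio_chunk_generator():
    while True:
        chunk = audio_queue.get()  # blocks until the next chunk (or the sentinel)
        if chunk is None:
            break
        yield chunk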

@app.get("/")
def read_root():
    text_to_speak = "This can be played."
    threading.Thread(target=play_text_to_speech, args=(stream, text_to_speak), daemon=True).start()
    return StreamingResponse(audio_chunk_generator(), media_type="audio/wav")

if __name__ == '__main__':
    print("Initializing TTS Engine...")
    engine = CoquiEngine()
    stream = TextToAudioStream(engine)

    print("Starting server ...")
    uvicorn.run(app, host="0.0.0.0", port=8000)
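
The demo above uses a single module-level queue, so two requests arriving at the same time would read from the same queue and could receive each other's chunks. Below is a minimal sketch of the same callback-to-generator bridge with one queue per call. It runs without RealtimeTTS: `stream_chunks` and `fake_tts` are hypothetical names introduced here for illustration, and `fake_tts` only stands in for the `stream.feed(...)` / `stream.play(on_audio_chunk=..., muted=True)` pair.

```python
import threading
from queue import Queue
from typing import Callable, Iterator, Optional


def stream_chunks(produce: Callable[[Callable[[bytes], None]], None]) -> Iterator[bytes]:
    """Run `produce` in a background thread and yield the chunks it emits."""
    # One queue per call (not a global), so concurrent callers stay separate.
    chunk_queue: "Queue[Optional[bytes]]" = Queue()

    def worker() -> None:
        try:
            # The producer calls the callback once per chunk.
            produce(chunk_queue.put)
        finally:
            # Sentinel, sent even if the producer raises, so the consumer
            # never blocks forever.
            chunk_queue.put(None)

    # Because this is a generator function, the thread starts on first iteration.
    threading.Thread(target=worker, daemon=True).start()

    while True:
        chunk = chunk_queue.get()  # blocks until the next chunk or the sentinel
        if chunk is None:
            break
        yield chunk


def fake_tts(on_audio_chunk: Callable[[bytes], None]) -> None:
    # Hypothetical stand-in for a real TTS engine: emits three 4-byte chunks.
    for i in range(3):
        on_audio_chunk(bytes([i]) * 4)


if __name__ == "__main__":
    for chunk in stream_chunks(fake_tts):
        print(len(chunk), "bytes")
```

In a FastAPI handler, the generator returned by `stream_chunks(...)` could be passed to `StreamingResponse` in place of `audio_chunk_generator()`. Whether one shared `TextToAudioStream` can safely serve several requests at once is a separate question that this sketch does not address.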

Test client:

import pyaudio
import requests

pyaudio_instance = pyaudio.PyAudio()
stream = pyaudio_instance.open(format=pyaudio.paInt16, channels=1, rate=24000, output=True)

def get_tts_audio():
    print("Requesting audio ...")
    url = "http://localhost:8000/"
    response = requests.get(url, stream=True)

    if response.status_code == 200:
        stream.start_stream()

        for chunk in response.iter_content(chunk_size=8192):
            if chunk:
                stream.write(chunk)

        stream.stop_stream()
        stream.close()
    else:
        print("Error:", response.status_code)

if __name__ == "__main__":
    get_tts_audio()
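
The test client writes the response bytes straight to a 16-bit, mono, 24000 Hz PyAudio stream, which suggests the server sends headerless PCM even though the response is labeled "audio/wav". A client that expects a real WAV stream would need a header first. The following is a rough sketch, assuming the chunks are raw 16-bit mono PCM at 24000 Hz; `wav_header`, `with_wav_header`, and the placeholder data size are hypothetical choices made here, not part of RealtimeTTS, and not every player accepts a placeholder length.

```python
import struct
from typing import Iterable, Iterator

# Placeholder length for a stream whose total size is unknown in advance
# (an assumption; some players may reject or misreport it).
UNKNOWN_DATA_SIZE = 0x7FFFF000


def wav_header(sample_rate: int = 24000, channels: int = 1,
               bits_per_sample: int = 16,
               data_size: int = UNKNOWN_DATA_SIZE) -> bytes:
    """Build a 44-byte canonical PCM WAV header."""
    block_align = channels * bits_per_sample // 8   # bytes per sample frame
    byte_rate = sample_rate * block_align           # bytes per second
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + data_size, b"WAVE",
        b"fmt ", 16,             # size of the PCM fmt chunk
        1,                       # audio format 1 = uncompressed PCM
        channels, sample_rate, byte_rate, block_align, bits_per_sample,
        b"data", data_size,
    )


def with_wav_header(chunks: Iterable[bytes], **fmt: int) -> Iterator[bytes]:
    """Yield a WAV header first, then the raw PCM chunks unchanged."""
    yield wav_header(**fmt)
    yield from chunks


if __name__ == "__main__":
    print(len(wav_header()), "header bytes")
```

Wrapping the server's chunk generator as `with_wav_header(audio_chunk_generator())` would put the header in front of the stream. The raw PyAudio client above would then need to skip the first 44 bytes.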

TaisukeNigo commented 2 months ago

How would I write the client if it is an H5 (HTML5) page?

TerryWong1024 commented 2 months ago

@KoljaB We can't wait. I firmly believe that a great many people are eagerly waiting for this enhancement, and it is a genuine, urgent requirement for most real-world scenarios.

KoljaB commented 2 months ago

I can imagine lots of people want that. I am working on this with high priority at the moment, and it will probably be ready in a few days. Please remember, though, that I am doing all this for free so you can build products on top of it. I understand how urgent the need may be, but please have a bit of patience so things can be done right.

KoljaB commented 2 months ago

I just added some demo code implementing a fastapi server to stream audio to web applications (browser etc).