coqui-ai / xtts-streaming-server

Mozilla Public License 2.0

Streaming input to streaming TTS #10

Open santhosh-sp opened 6 months ago

santhosh-sp commented 6 months ago

Hello Team,

Is it possible to run streaming TTS with streaming input text, writing to the same output file?

Example:

def llm_write(prompt: str):

    for chunk in openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True
    ):
        if (text_chunk := chunk["choices"][0]["delta"].get("content")) is not None:
            yield text_chunk

text_stream = llm_write("Hello, what is LLM?")

audio = stream_ffplay(
    tts(
        args.text,
        speaker,
        args.language,
        args.server_url,
        args.stream_chunk_size
    ), 
    args.output_file,
    save=bool(args.output_file)
)

That is, sending a minimal number of words per request to the TTS API.

Thanks, Santhosh
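One hedged way to do this (not built into the repo; `sentence_chunks` is a hypothetical helper, not part of the server's client code) is to buffer the streamed LLM tokens into sentence-sized pieces before each TTS call, so the model always gets enough context to sound natural. A minimal sketch:

```python
import re

def sentence_chunks(token_stream):
    """Buffer streamed LLM tokens and yield roughly sentence-sized pieces.

    Flushes the buffer whenever it ends with sentence-final punctuation,
    and flushes any trailing remainder when the stream ends.
    """
    buffer = ""
    for token in token_stream:
        buffer += token
        if re.search(r"[.!?]\s*$", buffer):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():  # flush whatever is left after the stream closes
        yield buffer.strip()
```

Each yielded piece could then be passed through `tts(...)` and its audio appended to a single open output file handle, rather than reopening `args.output_file` per chunk.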

mercuryyy commented 6 months ago

Is this possible?

santhosh-sp commented 6 months ago

Yes, it's possible.

mercuryyy commented 6 months ago

Is it built into the [xtts-streaming-server] repo, or does it have to be tweaked?

I was getting ready to test it out this weekend before I install it.

mercuryyy commented 6 months ago

Any chance you can post some working examples? I was able to get the Docker image working, but I don't see any logic for providing the yielded chunks as the text to the API.

nurgel commented 6 months ago

Is splitting at the end of a sentence (`.?!`) the best option here?
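Sentence-final punctuation is a reasonable default, though a naive splitter will also break after abbreviations like "Dr." or "e.g." (a real implementation might use a tokenizer such as NLTK's punkt). A hedged sketch using a lookbehind so the punctuation stays attached to its sentence:

```python
import re

def split_sentences(text: str):
    """Split text after ., ! or ? followed by whitespace.

    Naive on purpose: abbreviations such as "Dr." will also trigger
    a split, which may or may not matter for TTS prosody.
    """
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
```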

Fusion9334 commented 5 months ago

def llm_write(prompt: str):
    buffer = ""
    for chunk in openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True
    ):
        if (text_chunk := chunk["choices"][0]["delta"].get("content")) is not None:
            buffer += text_chunk
            if should_send_to_tts(buffer):  # Define this function to decide when to send
                yield buffer
                buffer = ""  # Reset buffer after sending

text_stream = llm_write("Hello, what is LLM?")

for text in text_stream:
    audio = stream_ffplay(
        tts(
            text,
            speaker,
            language,
            server_url,
            stream_chunk_size
        ),
        output_file,
        save=bool(output_file)
    )
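`should_send_to_tts` is left undefined above; a minimal sketch (the 60-character fallback threshold is an arbitrary assumption) that flushes on sentence-final punctuation, or once the buffer has grown long so run-on output still flows:

```python
def should_send_to_tts(buffer: str, min_len: int = 60) -> bool:
    """Decide whether the buffered text is worth sending to the TTS API.

    Sends when the buffer ends a sentence, or as a fallback once it has
    grown past `min_len` characters.
    """
    stripped = buffer.rstrip()
    return stripped.endswith((".", "!", "?")) or len(buffer) >= min_len
```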

AI-General commented 4 months ago

I believe input needs to be at least a sentence, as speech relies heavily on the context provided by subsequent words.

oscody commented 2 weeks ago

def llm_write(prompt: str):
    buffer = ""
    for chunk in openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True
    ):
        if (text_chunk := chunk["choices"][0]["delta"].get("content")) is not None:
            buffer += text_chunk
            if should_send_to_tts(buffer):  # Define this function to decide when to send
                yield buffer
                buffer = ""  # Reset buffer after sending

text_stream = llm_write("Hello, what is LLM?")

for text in text_stream:
    audio = stream_ffplay(
        tts(
            text,
            speaker,
            language,
            server_url,
            stream_chunk_size
        ),
        output_file,
        save=bool(output_file)
    )

Does this work?