KoljaB / RealtimeTTS

Converts text to speech in realtime

Pass Var to coqui_engine.py from stream.() #27

Closed mercuryyy closed 9 months ago

mercuryyy commented 9 months ago

I am trying to pass a variable that I can use inside the _synthesize_worker loop in coqui_engine.py, but I am not finding the right way to do it.

So we have our

    stream.feed(content)

which arrives in coqui_engine.py _synthesize_worker as text = data['text'].

But I want to be able to pass an ID like

    tts_id_file = '95986845'
    stream.tts_id(tts_id_file)

and then read it inside the loop in coqui_engine.py _synthesize_worker.

I am using this instead of the output filename because I am doing custom logic with the chunks and want to name each chunk batch with the unique ID.

I know it's a bit unusual. I've been trying for a few hours with no luck ;/

KoljaB commented 9 months ago

To get a single value from outside into _synthesize_worker you want to use the parent_synthesize_pipe. For example, you could add a method like this to the CoquiEngine class in coqui_engine.py:

    def tts_id(self, tts_id: str):
        self.send_command('update_tts_id', {'tts_id': tts_id})
        status, result = self.parent_synthesize_pipe.recv()

Then in _synthesize_worker, next to the existing command branches, add something like this:

    elif command == 'update_tts_id':
        tts_id = data['tts_id']
        conn.send(('success', 'tts_id updated successfully'))

After that you can call the tts_id method from outside and have the submitted parameter available in _synthesize_worker.
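The command round trip described above can be sketched in isolation. This is a minimal, self-contained illustration and not the RealtimeTTS code: DemoEngine, synthesize_worker, the 'synthesize' and 'shutdown' commands, and the {'command': ..., 'data': ...} message shape are assumptions made for the example, and a thread stands in for the worker process so the sketch runs anywhere.

```python
import threading
from multiprocessing import Pipe


def synthesize_worker(conn, handled):
    """Stand-in for _synthesize_worker: serves commands arriving on the pipe."""
    tts_id = None  # latest ID received from the parent side
    while True:
        message = conn.recv()
        command, data = message['command'], message['data']
        if command == 'update_tts_id':
            tts_id = data['tts_id']
            conn.send(('success', 'tts_id updated successfully'))
        elif command == 'synthesize':
            # The ID set earlier is now available next to the text.
            handled.append((tts_id, data['text']))
            conn.send(('success', None))
        elif command == 'shutdown':
            conn.send(('shutdown', None))
            break


class DemoEngine:
    """Hypothetical engine exposing only the command plumbing."""

    def __init__(self):
        self.parent_synthesize_pipe, child_pipe = Pipe()
        self.handled = []
        self.worker = threading.Thread(
            target=synthesize_worker, args=(child_pipe, self.handled), daemon=True
        )
        self.worker.start()

    def send_command(self, command, data):
        self.parent_synthesize_pipe.send({'command': command, 'data': data})

    def tts_id(self, tts_id: str):
        self.send_command('update_tts_id', {'tts_id': tts_id})
        return self.parent_synthesize_pipe.recv()  # wait for the acknowledgement

    def synthesize(self, text: str):
        self.send_command('synthesize', {'text': text})
        return self.parent_synthesize_pipe.recv()

    def shutdown(self):
        self.send_command('shutdown', {})
        self.parent_synthesize_pipe.recv()
        self.worker.join()
```

In this sketch every command sent is answered by exactly one reply, and the caller reads that reply before sending the next command, so commands and replies stay in step.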

mercuryyy commented 9 months ago

Very much appreciated!

I would also have to edit text_to_stream.py.

It would be something like

    def tts_id(self, tts_id: str):
        return self

    File "/home/izzy/text-generation-webui/extensions/openai/script.py", line 183, in generator
        stream.tts_id('9999999999')
        ^^^^^^^^^^^^^
    AttributeError: 'TextToAudioStream' object has no attribute 'tts_id'

KoljaB commented 9 months ago

I'd just call tts_id on the instance of the Coqui engine directly. The stream can contain any engine. Of course you could forward the call only when the Coqui engine is loaded, but that would require extra steps and not really add anything beneficial, I think.

mercuryyy commented 9 months ago

The issue is that I can't reload the instance / model, since I load

    engine = CoquiEngine()
    stream = TextToAudioStream(engine)

[Loop here]

Then I keep the instance running while looping through audio batches, and each batch needs its own tts_id without reloading the model/instance.

KoljaB commented 9 months ago

You could still call engine.tts_id instead of stream.tts_id without needing to reload or re-instantiate anything. I'm still not sure what you want to do: send metadata together with the fed texts? Could you please provide simple example code of how you want to call the API?

mercuryyy commented 9 months ago

    engine = CoquiEngine()
    stream = TextToAudioStream(engine)

    def start_audio():
        print('start audio fund')
        stream.play_async()
        print('played')
        # while stream.is_playing():
        #     time.sleep(0.1)

    @app.post('/v1/chat/completions_tts', response_model=ChatCompletionResponse, dependencies=check_key)
    async def openai_chat_completions_tts(request: Request, request_data: ChatCompletionRequest, background_tasks: BackgroundTasks):
        path = request.url.path
        is_legacy = "/generate" in path

        if request_data.stream:
            responszzze = []
            response_queue = asyncio.Queue()
            first_feed_done = False

            async def generator():
                nonlocal first_feed_done
                async with streaming_semaphore:
                    response = OAIcompletions.stream_chat_completions(to_dict(request_data), is_legacy=is_legacy)
                    for resp in response:
                        disconnected = await request.is_disconnected()
                        if disconnected:
                            break
                        if resp['choices'][0]['message']['role'] == "assistant":
                            content = resp['choices'][0]['message']['content']
                            print(content)
                            engine.tts_id('9999999999')
                            stream.feed(content)
                            if not first_feed_done:
                                loop = asyncio.get_running_loop()
                                await loop.run_in_executor(None, start_audio)
                                first_feed_done = True
                                print('Tried to fire play')
                        yield {"data": json.dumps(resp)}

            return EventSourceResponse(generator())  # SSE streaming

This is returning

    File "/home/izzy/text-generation-webui/extensions/openai/script.py", line 183, in generator
        engine.tts_id('9999999999')
        ^^^^^^^^^^^^^
    AttributeError: 'TextToAudioStream' object has no attribute 'tts_id'


This is after adding your referenced code - https://github.com/KoljaB/RealtimeTTS/issues/27#issuecomment-1872915076

Basically yes, I am trying to send tts_id as metadata with each chunk every time a new API request comes in, without having to reload

    engine = CoquiEngine()
    stream = TextToAudioStream(engine)

between API calls.

KoljaB commented 9 months ago

    engine.tts_id('9999999999')
    ^^^^^^^^^^^^^
    AttributeError: 'TextToAudioStream' object has no attribute 'tts_id'

engine should be a CoquiEngine object at that point, but Python thinks it is a TextToAudioStream. Are you sure it is not overwritten somewhere?
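A minimal sketch of how such an overwrite would produce this kind of error. CoquiEngineStub and TextToAudioStreamStub are hypothetical stand-ins, not the real RealtimeTTS classes, and the rebinding line only illustrates one way the name could end up pointing at the stream.

```python
class CoquiEngineStub:
    """Stand-in engine that has the tts_id method."""

    def tts_id(self, tts_id: str) -> str:
        return tts_id


class TextToAudioStreamStub:
    """Stand-in stream that wraps an engine but has no tts_id method."""

    def __init__(self, engine):
        self.engine = engine


engine = CoquiEngineStub()
stream = TextToAudioStreamStub(engine)

# While the name still points at the engine, the call works.
ok_result = engine.tts_id('9999999999')

# Accidental rebinding, e.g. a later assignment that reuses the name.
engine = stream

try:
    engine.tts_id('9999999999')
    error_message = None
except AttributeError as exc:
    # The message names the class of whatever object the name points at now.
    error_message = str(exc)

print(error_message)
```

Printing type(engine) right before the failing call is a quick way to check which object the name refers to at that point.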

mercuryyy commented 9 months ago

Yeah, I have

    engine = CoquiEngine()
    stream = TextToAudioStream(engine)

executing before the functions, and stream.feed(content) works fine. That is why I was trying to do it via

    stream.tts_id('9999999999')

I even tried

    stream.feed(content, '9999999999')

but got lost somewhere along the third function I was trying to pass it into.

mercuryyy commented 9 months ago

    rlette/sse.py", line 221, in stream_response
        async for data in self.body_iterator:
    File "/home/izzy/text-generation-webui/extensions/openai/script.py", line 184, in generator
        engine.tts_id('9999999999')
    File "/home/izzy/text-generation-webui/installer_files/env/lib/python3.11/site-packages/RealtimeTTS/engines/coqui_engine.py", line 120, in tts_id
        status, result = self.parent_synthesize_pipe.recv()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/home/izzy/text-generation-webui/installer_files/env/lib/python3.11/multiprocessing/connection.py", line 250, in recv
        return _ForkingPickler.loads(buf.getbuffer())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    EOFError: Ran out of input

After deleting some cached scripts, I am getting this error from your suggested code.

mercuryyy commented 9 months ago

I was able to get it working, many thanks @KoljaB and GPT4 ;)

mercuryyy commented 9 months ago

Sorry to reopen ;( I actually get this error randomly in the middle of looping through chunks:

    File "/home/izzy/text-generation-webui/installer_files/env/lib/python3.11/site-packages/sse_starlette/sse.py", line 221, in stream_response
        async for data in self.body_iterator:
    File "/home/izzy/text-generation-webui/extensions/openai/script.py", line 186, in generator
        engine.tts_id('9999999999')
    File "/home/izzy/text-generation-webui/installer_files/env/lib/python3.11/site-packages/RealtimeTTS/engines/coqui_engine.py", line 506, in tts_id
        status, result = self.parent_synthesize_pipe.recv()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/home/izzy/text-generation-webui/installer_files/env/lib/python3.11/multiprocessing/connection.py", line 250, in recv
        return _ForkingPickler.loads(buf.getbuffer())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    EOFError: Ran out of input

mercuryyy commented 9 months ago

Thanks again for the great support. I was only able to get it to work by passing the ID along with each feed call: stream.feed(content, tts_id)
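The thread does not show how feed was changed, so the following is only one hypothetical shape for passing an ID with every chunk. DemoStream, drain and chunk_name are names invented for this sketch and are not part of RealtimeTTS; the point is that the ID travels in the same queue entry as the text, so no separate command and reply is needed per chunk.

```python
import queue


class DemoStream:
    """Hypothetical stream whose feed() carries an ID with each chunk."""

    def __init__(self):
        self._chunks = queue.Queue()

    def feed(self, text, tts_id=None):
        # Store the ID in the same queue entry as the text it belongs to.
        self._chunks.put((text, tts_id))
        return self

    def drain(self):
        """Yield (tts_id, chunk_index, text), counting chunks per ID."""
        counters = {}
        while not self._chunks.empty():
            text, tts_id = self._chunks.get()
            index = counters.get(tts_id, 0)
            counters[tts_id] = index + 1
            yield tts_id, index, text


def chunk_name(tts_id, index):
    """Build a per-batch file name from the ID and the chunk position."""
    return f"{tts_id}_{index:03d}.wav"
```

A consumer could then call chunk_name(tts_id, index) for each item from drain() to name every chunk batch after the request it came from.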