erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, a narrator, model finetuning, custom models, and wav file maintenance. It can also be used with 3rd-party software via JSON calls.
GNU Affero General Public License v3.0
686 stars, 71 forks

Start audio generation before the text is finished #215

Closed: RandomLegend closed this 2 months ago

RandomLegend commented 2 months ago

Hello!

I really love this as the oobabooga extension, and I was wondering whether the extension could grab the text as it is being generated, before the whole answer is finished.

It could then split the text into sentences: as soon as the first sentence is written, it could start generating the TTS for it and play it while the rest of the text is still being generated. It could then queue up each following sentence as it is detected.

Is that a thing? I have no idea tbh; this was just swirling around in my head.
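
The idea described above can be sketched roughly as follows. This is only an illustration of the sentence-by-sentence approach, not how AllTalk or Text generation webUI actually implement it: `split_sentences`, `speak_while_generating`, and the `synthesize` callback are all hypothetical names, and the sentence detection is a deliberately simple regex that will mis-split abbreviations such as "e.g. this".

```python
import queue
import re
import threading
from typing import Callable, Iterable, Iterator, List

# Simplifying assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # All parts except the last are finished sentences; the last part
        # may still be growing, so it stays in the buffer.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        buffer = parts[-1]
    # The stream has ended: whatever is left is the final sentence.
    if buffer.strip():
        yield buffer.strip()


def speak_while_generating(
    chunks: Iterable[str], synthesize: Callable[[str], str]
) -> List[str]:
    """Queue sentences for TTS while the text is still arriving.

    `synthesize` is a hypothetical stand-in for a real TTS call.
    """
    pending: "queue.Queue[str | None]" = queue.Queue()
    results: List[str] = []

    def worker() -> None:
        # Process sentences in arrival order until the None sentinel shows up.
        while True:
            sentence = pending.get()
            if sentence is None:
                break
            results.append(synthesize(sentence))

    thread = threading.Thread(target=worker)
    thread.start()
    for sentence in split_sentences(chunks):
        pending.put(sentence)  # the worker can start on this immediately
    pending.put(None)  # signal that no more sentences are coming
    thread.join()
    return results


if __name__ == "__main__":
    stream = ["Hello the", "re. How are", " you? I am", " fine"]
    print(speak_while_generating(stream, lambda s: f"<audio for: {s}>"))
```

The worker thread is what lets synthesis of the first sentence overlap with the arrival of later text; a real integration would also need to play the audio clips back in order.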

erew123 commented 2 months ago

Hi @RandomLegend

I'm not yet sure it will be possible with the Narrator enabled (that's quite complicated), but for single-voice generation, please see the current PR here: https://github.com/erew123/alltalk_tts/pull/208

Thanks

RandomLegend commented 2 months ago

I don't use the narrator anyway. Thanks, I didn't see this PR.

I'll check it out!

erew123 commented 2 months ago

It's waiting on TGWUI to import a merge at their end before I can test and import the merge into AllTalk!

Thanks