KoljaB / RealtimeTTS

Converts text to speech in realtime

Small update for your README.md #17

Closed: erew123 closed this 10 months ago

erew123 commented 10 months ago

I've just been doing a lot of work with the Coqui TTS engine, and I thought it wanted 24000Hz sample files. It turns out that by default it wants 22050Hz. Both work, but the config.json file that is downloaded with the models sets 22050Hz as the preferred input sample rate (and yes, mono, 16-bit, etc.).
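For anyone who wants to check a sample file against those settings, here is a minimal sketch using only Python's standard `wave` module. The function name `check_sample` and its defaults (22050Hz, mono, 16-bit) are just an illustration of the values mentioned above, not part of Coqui TTS or RealtimeTTS:

```python
import wave


def check_sample(path, rate=22050, channels=1, sample_width_bytes=2):
    """Return a list of mismatches between a WAV file and the expected format.

    The defaults are assumptions taken from this thread: 22050 Hz, mono, 16-bit.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {rate} Hz"
            )
        if wav.getnchannels() != channels:
            problems.append(
                f"channel count is {wav.getnchannels()}, expected {channels}"
            )
        if wav.getsampwidth() != sample_width_bytes:
            problems.append(
                f"sample width is {wav.getsampwidth() * 8} bit, "
                f"expected {sample_width_bytes * 8} bit"
            )
    return problems
```

An empty list means the file matches. Note that `wave` only reads uncompressed PCM WAV files, so other formats would need converting first.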

I was taking an interest in your RealtimeTTS and thinking of pulling it into my project when I spotted https://github.com/KoljaB/RealtimeTTS#coquiengine, and figured you may want to update it.

Thanks

KoljaB commented 10 months ago

Thank you very much for pointing this out. I ran some quality tests with 44100, 22050 and 24000Hz and felt 24kHz delivered the best results, but seeing this in the config.json proves me wrong. Changed in 0.3.31.
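To look the value up directly, a rough sketch follows. The thread does not show the layout of the model's config.json, so this makes no assumption about where the setting lives and simply walks the whole file for any numeric key whose name contains `sample_rate`. The helper names `find_sample_rates` and `sample_rates_in_config` are hypothetical:

```python
import json


def find_sample_rates(node, path=""):
    """Recursively collect numeric values whose key contains 'sample_rate'.

    Returns a dict mapping a dotted path (e.g. 'a.b[0].c') to the value found.
    """
    found = {}
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if "sample_rate" in key and is_number:
                found[child] = value
            else:
                found.update(find_sample_rates(value, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.update(find_sample_rates(value, f"{path}[{index}]"))
    return found


def sample_rates_in_config(config_path):
    """Load a JSON config file and report every sample-rate setting in it."""
    with open(config_path, encoding="utf-8") as f:
        return find_sample_rates(json.load(f))
```

Calling `sample_rates_in_config` with the path to the downloaded model's config.json (the location depends on where the model was downloaded) lists every matching setting together with where it sits in the file.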

erew123 commented 10 months ago

You're going to kill me, but I only found out in the last hour or so that they have updated the TTS engine twice. Version 0.21.2 (https://github.com/coqui-ai/TTS/discussions/3340; the latest is now 0.21.3) has fixes for how it splits sentences, so it should clear up some of the strange sound generation issues... you know, when it makes funny sounds that aren't speech (supposedly it fixes that).

Had I known earlier, I would have mentioned that too!

KoljaB commented 10 months ago

Yeah, their full release notes really reminded me of 0.2.7 😂 Both libs support custom sentence preprocessing logic, so for now you could just copy their logic and insert it into RealtimeTTS to get the same effect. There will be an update to tts==0.21.3 soon, but they really implemented some of the stuff from v0.2.7, which makes me think about linking it all back to their lib.