alphacep / vosk-tts

Text To Speech Synthesis with Vosk
Apache License 2.0
118 stars, 18 forks

Try EfficientSpeech #23

Open nshmyrev opened 5 months ago

nshmyrev commented 5 months ago

https://flashspeech.github.io/

https://github.com/roatienza/efficientspeech

nshmyrev commented 5 months ago

Not as fast as advertised; most of the parameters are in the HiFiGAN vocoder.

With HiFiGAN as the vocoder, voice generation runs at an RTF of 1.7 on an RPi4. Without the vocoder overhead, mel spectrogram generation alone reaches an RTF of 104.3.
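A rough back-of-the-envelope check of how much of the time goes to the vocoder, using the two figures above. This is only a sketch under two assumptions that are not stated in the thread: that "RTF" here means a speed factor (seconds of audio produced per second of compute, so higher is faster), and that the mel stage and the vocoder run one after the other. The function name `stage_times` is made up for illustration.

```python
def stage_times(speed_total, speed_mel):
    """Compute time per 1 s of audio for each stage.

    Assumes both inputs are speed factors (audio seconds per compute
    second) and that mel generation and vocoding run sequentially.
    """
    total = 1.0 / speed_total   # full pipeline: mel + vocoder
    mel = 1.0 / speed_mel       # mel spectrogram generation only
    vocoder = total - mel       # remainder attributed to the vocoder
    return total, mel, vocoder


# Figures quoted above for RPi4: 1.7 with HiFiGAN, 104.3 for mel only.
total, mel, vocoder = stage_times(1.7, 104.3)
vocoder_share = vocoder / total

print(f"total:   {total:.4f} s per second of audio")
print(f"mel:     {mel:.4f} s per second of audio")
print(f"vocoder: {vocoder:.4f} s per second of audio")
print(f"vocoder share of compute time: {vocoder_share:.1%}")
```

Under those assumptions the vocoder accounts for roughly 98% of the synthesis time, which matches the point that the end-to-end speed is dominated by HiFiGAN rather than by the acoustic model.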