nshmyrev opened 6 months ago
Not as fast as advertised; most of the parameters are in HiFiGAN.
Using HiFiGAN as vocoder, it runs at an RTF of 1.7 for voice generation on RPi4. Without the vocoder overhead, the mel spectrogram generation is at RTF speed of 104.3
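To put the two quoted figures side by side, here is a rough sketch of the arithmetic. It assumes that RTF here means seconds of audio produced per second of compute (higher is faster), which is my reading of the numbers and not something the quote states. The variable and function names are only illustrative.

```python
# Assumption: RTF = audio seconds generated per wall-clock second (higher is faster).
FULL_PIPELINE_RTF = 1.7    # mel generation + HiFiGAN vocoder on RPi4 (quoted above)
MEL_ONLY_RTF = 104.3       # mel spectrogram generation alone (quoted above)


def compute_seconds_per_audio_second(rtf: float) -> float:
    """Wall-clock seconds needed to produce one second of audio at the given RTF."""
    return 1.0 / rtf


total_time = compute_seconds_per_audio_second(FULL_PIPELINE_RTF)
mel_time = compute_seconds_per_audio_second(MEL_ONLY_RTF)

# Whatever is not spent on the mel spectrogram is attributed to the vocoder.
vocoder_time = total_time - mel_time
vocoder_share = vocoder_time / total_time

print(f"total:   {total_time:.4f} s per audio second")
print(f"mel:     {mel_time:.4f} s per audio second")
print(f"vocoder: {vocoder_time:.4f} s per audio second ({vocoder_share:.1%} of total)")
```

Under that assumption, the vocoder accounts for roughly 98% of the synthesis time, so the end-to-end speed is set almost entirely by HiFiGAN and not by the acoustic model.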
https://flashspeech.github.io/
https://github.com/roatienza/efficientspeech