liusongxiang / efficient_tts

PyTorch implementation of "EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture"
MIT License
115 stars, 21 forks

Inference speed #4

Closed: dmazurok closed this issue 3 years ago

dmazurok commented 3 years ago

Hello everyone! Great job! I see that it still has a metallic sound, but my question is about inference speed. How does it compare to Tacotron 2? Is it much faster, as the paper says? Could you please share the approximate real-time ratio on CPU (and the CPU model)? Thanks a lot!

liusongxiang commented 3 years ago

The authors of the EfficientTTS paper report inference speed only on GPU, where it is very fast. I do not have a number for Tacotron 2, but I did roughly measure the real-time factor (RTF) of the current version (EFTS-CNN + HifiGAN-v1) on my machine (Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz). Over 50 test samples, the RTF on CPU is 1.00356, which is very close to real time: synthesis takes about as long as the audio it produces. Hope this helps.
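
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of an RTF calculation. It is not the script used for the number above. The `synthesize` callable, the `sample_rate` value, and the choice to aggregate over all samples (total synthesis time divided by total audio duration, rather than averaging per-sample ratios) are assumptions made for illustration.

```python
import time
from typing import Callable, Iterable, Sequence


def real_time_factor(
    synthesize: Callable[[str], Sequence[float]],
    texts: Iterable[str],
    sample_rate: int,
    timer: Callable[[], float] = time.perf_counter,
) -> float:
    """Return total synthesis time divided by total generated audio duration.

    `synthesize` is a hypothetical text -> waveform function (for example,
    acoustic model plus vocoder); it is not part of the repository's API.
    An RTF below 1.0 means faster than real time.
    """
    total_synthesis_seconds = 0.0
    total_audio_seconds = 0.0
    for text in texts:
        start = timer()
        waveform = synthesize(text)
        total_synthesis_seconds += timer() - start
        # Audio duration in seconds = number of samples / sampling rate.
        total_audio_seconds += len(waveform) / sample_rate
    if total_audio_seconds == 0.0:
        raise ValueError("no audio was generated, so RTF is undefined")
    return total_synthesis_seconds / total_audio_seconds


if __name__ == "__main__":
    # Stand-in synthesizer: returns one second of silence per input text.
    # 22050 Hz is an assumed sampling rate, not a value taken from this thread.
    assumed_sample_rate = 22050

    def dummy_synthesize(text: str) -> list:
        return [0.0] * assumed_sample_rate

    rtf = real_time_factor(dummy_synthesize, ["hello", "world"], assumed_sample_rate)
    print(f"RTF: {rtf:.5f}")
```

To measure a real model, replace `dummy_synthesize` with a function that runs the full text-to-waveform pipeline on CPU, and consider discarding a few warm-up runs so one-time initialization cost does not skew the result.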