shivammehta25 / Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
https://shivammehta25.github.io/Matcha-TTS/
MIT License

Cannot synthesize the pronunciation of a single word #90

Open xddun opened 3 months ago

xddun commented 3 months ago

It cannot synthesize the pronunciation of a single word, for example when I input "you.". What could be the reason for this? How should I proceed?


shivammehta25 commented 3 months ago

It works for me. The generated audio is just shorter than one second, so the Gradio interface shows 0:00! But if you press the play button, you should hear it say the word briefly. You can verify it at https://huggingface.co/spaces/shivammehta25/Matcha-TTS
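
To illustrate why a sub-second clip can show up as 0:00, here is a minimal sketch. It assumes the player formats time as whole minutes and seconds by truncating, and it assumes a 22050 Hz sample rate. The helper names `clip_duration_seconds` and `format_mm_ss` are hypothetical and not part of Gradio or Matcha-TTS.

```python
def clip_duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration of a mono waveform in seconds."""
    return num_samples / sample_rate


def format_mm_ss(seconds: float) -> str:
    """Format a duration as m:ss, dropping the fractional part (assumed player behaviour)."""
    whole = int(seconds)  # truncation: anything under 1 s becomes 0
    return f"{whole // 60}:{whole % 60:02d}"


# Assumed values: 11025 samples at 22050 Hz is half a second of audio.
duration = clip_duration_seconds(num_samples=11025, sample_rate=22050)
label = format_mm_ss(duration)
print(duration, label)
```

Under these assumptions the clip is non-empty even though the displayed label reads as zero, so checking the number of samples is a more reliable test than reading the timer.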

xddun commented 3 months ago

Thank you for your response. I tried several more times. This is an issue that occurs occasionally. Sometimes, the voice is not heard, or it is very short. Do you think this can be improved by adjusting the speaking speed?

shivammehta25 commented 3 months ago

Yes! That would be the only solution, as this is not a bug. The model synthesises what is asked of it; it's just that the generated audio is short.
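
As a rough sketch of why slowing the speaking speed lengthens a one-word clip, assume a duration-based model where each phoneme gets a predicted number of mel frames and a length scale multiplies those durations. The function `scaled_audio_seconds`, the example durations, the hop length of 256, and the 22050 Hz sample rate are all illustrative assumptions, not values taken from the Matcha-TTS code.

```python
import math


def scaled_audio_seconds(frame_durations, length_scale, hop_length=256, sample_rate=22050):
    """Estimate audio length after scaling per-phoneme frame durations.

    Hypothetical helper: each duration is multiplied by length_scale and
    rounded up to a whole number of frames, then converted to seconds.
    """
    total_frames = sum(math.ceil(d * length_scale) for d in frame_durations)
    return total_frames * hop_length / sample_rate


# Made-up per-phoneme durations (in frames) for a very short word.
durations = [6.0, 9.0, 5.0]

normal = scaled_audio_seconds(durations, length_scale=1.0)
slower = scaled_audio_seconds(durations, length_scale=2.0)
print(f"normal: {normal:.3f} s, slower: {slower:.3f} s")
```

In this sketch a larger length scale means slower speech and a longer clip, which is why adjusting the speaking speed is the natural lever for very short inputs.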