shivammehta25 / Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
https://shivammehta25.github.io/Matcha-TTS/
MIT License
602 stars, 75 forks

Cannot synthesize the pronunciation of a single word #90

Open xddun opened 1 month ago

xddun commented 1 month ago

It cannot synthesize the pronunciation of a single word, for example when I input "you.". What could be the reason for this? How should I proceed?

image

shivammehta25 commented 1 month ago

It works for me. The audio is just less than one second long, so the Gradio interface shows 0:00! But if you press the play button, you should hear it say the word briefly. You can verify it at https://huggingface.co/spaces/shivammehta25/Matcha-TTS
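
One way to tell a truly empty clip from a very short one is to compute the duration from the sample count instead of trusting the player's timer. The following is a minimal sketch, assuming the output was saved as a plain PCM WAV file. The helper names (`write_silence`, `wav_duration_seconds`, `format_mm_ss`) are hypothetical and not part of Matcha-TTS or Gradio, and the truncating timer is only an assumption about how a player might end up displaying 0:00.

```python
import os
import tempfile
import wave


def write_silence(path, seconds, sample_rate=22050):
    """Write a mono 16-bit PCM WAV of silence (stand-in for a short TTS clip)."""
    n_frames = round(seconds * sample_rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def format_mm_ss(seconds):
    """Format a duration as m:ss, truncating to whole seconds (assumed player behaviour)."""
    whole = int(seconds)
    return f"{whole // 60}:{whole % 60:02d}"


if __name__ == "__main__":
    # Hypothetical file standing in for a synthesized single-word utterance.
    demo_path = os.path.join(tempfile.mkdtemp(), "utterance.wav")
    write_silence(demo_path, 0.4)
    duration = wav_duration_seconds(demo_path)
    print(f"actual duration: {duration:.3f} s, displayed as {format_mm_ss(duration)}")
```

If the computed duration is greater than zero, the clip contains audio even when the timer reads 0:00.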

xddun commented 1 month ago

Thank you for your response. I tried several more times. This is an issue that occurs occasionally. Sometimes, the voice is not heard, or it is very short. Do you think this can be improved by adjusting the speaking speed?

shivammehta25 commented 1 month ago

Yes! That would be the only solution, as it is not an issue. The model synthesises what is asked of it; it's just that the generated audio is short.
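
As a rough illustration of why a slower speaking rate lengthens the clip: duration-based TTS models typically multiply the predicted per-phoneme frame counts by a length scale before generating the acoustic frames. The sketch below is not Matcha-TTS code. The function name, the example durations, the hop length of 256 and the sample rate of 22050 Hz are all illustrative assumptions; check the project's README for the actual option that controls speaking rate.

```python
import math


def audio_seconds(frame_durations, length_scale=1.0, hop_length=256, sample_rate=22050):
    """Estimate clip length from per-phoneme durations given in spectrogram frames.

    length_scale > 1.0 stretches every phoneme (slower speech, longer audio).
    All default values here are assumptions chosen for illustration.
    """
    # Scale each predicted duration and round up to a whole number of frames.
    total_frames = sum(math.ceil(d * length_scale) for d in frame_durations)
    # Always produce at least one frame, so the output is never empty.
    total_frames = max(total_frames, 1)
    return total_frames * hop_length / sample_rate


if __name__ == "__main__":
    # Hypothetical predicted durations for the few phonemes in "you."
    durations = [6.2, 11.5, 4.1]
    for scale in (1.0, 1.5, 2.0):
        print(f"length_scale={scale}: {audio_seconds(durations, scale):.3f} s")
```

Under these assumptions a single short word stays well under one second at the default scale, which is consistent with the behaviour described above.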