shivammehta25 / Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
https://shivammehta25.github.io/Matcha-TTS/
MIT License
729 stars, 89 forks

question about val loss #111

Open VJJJJJJ1 opened 1 week ago

VJJJJJJ1 commented 1 week ago

Thank you for your great work. I have a question about the validation loss: as shown in the figure, some of the sub-losses increase. How can I solve this problem? (image)

shivammehta25 commented 1 week ago

Everything seems fine to me; this is simply a side effect of having a small validation set. It can be mitigated either by using a larger validation set or by using a loss with better behaviour, such as the flow matching loss for the duration predictor, which is available in a different branch (https://github.com/shivammehta25/Matcha-TTS/tree/stoc_dur). I don't think you will gain much from that branch for read-speech datasets. However, if you have a spontaneous speech dataset, I recommend trying it.
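
As a rough illustration of the small-validation-set effect, here is a minimal sketch (not code from Matcha-TTS) of how the averaged validation loss gets noisier when fewer utterances are averaged. The function name `val_loss_spread` and the Gaussian per-utterance loss with mean 1.0 and standard deviation 0.5 are assumptions made purely for illustration. The sketch does not model training itself, only the sampling noise in the reported mean.

```python
import random
import statistics


def val_loss_spread(val_set_size, num_evals=200, seed=0):
    """Standard deviation of the mean validation loss over repeated evaluations.

    Per-utterance losses are drawn from a hypothetical Gaussian (mean 1.0,
    std 0.5). These values are illustrative, not measured from Matcha-TTS.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(num_evals):
        # One simulated validation pass: average the loss over the whole set.
        losses = [rng.gauss(1.0, 0.5) for _ in range(val_set_size)]
        means.append(statistics.fmean(losses))
    # How much the reported mean fluctuates from one pass to the next.
    return statistics.stdev(means)


small = val_loss_spread(val_set_size=20)
large = val_loss_spread(val_set_size=2000)
print(f"spread with 20 utterances:   {small:.4f}")
print(f"spread with 2000 utterances: {large:.4f}")
```

Under these assumptions the spread shrinks roughly with the square root of the validation set size, so a curve computed on a handful of utterances can drift upward or fluctuate without the model actually getting worse.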