Open VJJJJJJ1 opened 1 week ago
Everything seems fine to me; this is simply a side effect of having a small validation set. It can be mitigated either by using a larger validation set or by using a better-behaved loss, such as the flow matching loss for the duration predictor, which is available in a different branch (https://github.com/shivammehta25/Matcha-TTS/tree/stoc_dur). I don't think you will gain much on read-speech datasets, but if you have a spontaneous speech dataset, I recommend trying it.
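To illustrate why the size of the validation set matters, here is a minimal sketch in plain Python. It is not taken from Matcha-TTS: the function name `spread_of_mean_loss` and the per-utterance loss distribution (Gaussian, mean 1.0, standard deviation 0.5) are assumptions made purely for illustration. It only shows that the average loss over a small validation set fluctuates more than the average over a large one, so a curve computed on few utterances can drift or rise from noise alone.

```python
import random
import statistics


def spread_of_mean_loss(n_val, n_trials=500, seed=0):
    """Standard deviation of the mean validation loss across repeated draws.

    Each trial draws n_val hypothetical per-utterance losses and averages
    them, mimicking one validation pass over a set of that size.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        # Assumed loss distribution (illustrative only): mean 1.0, std 0.5.
        losses = [rng.gauss(1.0, 0.5) for _ in range(n_val)]
        means.append(statistics.fmean(losses))
    return statistics.stdev(means)


small = spread_of_mean_loss(n_val=20)    # small validation set
large = spread_of_mean_loss(n_val=500)   # larger validation set
print(f"spread with 20 utterances:  {small:.4f}")
print(f"spread with 500 utterances: {large:.4f}")
```

Under these assumptions the spread shrinks roughly with the square root of the validation set size, which is why a larger validation set gives a steadier curve.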
Thank you for your great work. I have a question about the validation loss: as shown in the figure, some of the sub-losses increase. How can I solve this problem?