jik876 / hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
MIT License

Quality for some speakers gets worse and worse after 300k steps #68

Open ghost opened 3 years ago

ghost commented 3 years ago

I noticed that the quality for some speakers starts to get worse, with breaking points appearing in the generated waveforms, while for other speakers it keeps getting better. Is it normal to observe this? Also, why does the paper say the model was trained for 2.5M steps?

jik876 commented 3 years ago

If you are training a multi-speaker model, it would be a good idea to check whether overfitting is occurring for each speaker. It can be identified from the trend of the spectrogram error on the training and evaluation sets. We chose 2.5M steps for comparison with other models. The appropriate number of training steps may differ depending on the dataset.
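
A rough sketch of the per-speaker check described above, assuming you have already logged a mel-spectrogram error per speaker on both training and held-out utterances at each checkpoint. The names `Checkpoint`, `overfit_onset`, `check_speakers`, the 5% margin, and the numbers in the example are all hypothetical and not part of the hifi-gan repository. The heuristic flags a speaker when the held-out error at the last checkpoint is clearly above its best value while the training error has kept falling.

```python
from typing import Dict, List, NamedTuple, Optional


class Checkpoint(NamedTuple):
    step: int
    train_mel_error: float  # spectrogram error on this speaker's training utterances
    val_mel_error: float    # same metric on this speaker's held-out utterances


def overfit_onset(history: List[Checkpoint], rel_margin: float = 0.05) -> Optional[int]:
    """Return the step with the lowest held-out error if the speaker looks
    overfit at the last checkpoint, otherwise None.

    rel_margin is an assumed tolerance (5% by default) so that small
    fluctuations in the held-out error are not reported as overfitting.
    """
    if not history:
        return None
    best = min(history, key=lambda c: c.val_mel_error)
    last = history[-1]
    # Held-out error has risen noticeably above its best value...
    val_got_worse = last.val_mel_error > best.val_mel_error * (1.0 + rel_margin)
    # ...while the training error is still lower than it was at that point.
    train_kept_improving = last.train_mel_error < best.train_mel_error
    return best.step if (val_got_worse and train_kept_improving) else None


def check_speakers(per_speaker: Dict[str, List[Checkpoint]]) -> Dict[str, Optional[int]]:
    """Apply overfit_onset to every speaker's checkpoint history."""
    return {speaker: overfit_onset(h) for speaker, h in per_speaker.items()}


# Made-up numbers for illustration only.
example = {
    "spk_a": [
        Checkpoint(100_000, 0.40, 0.45),
        Checkpoint(200_000, 0.32, 0.38),
        Checkpoint(300_000, 0.27, 0.37),
        Checkpoint(400_000, 0.22, 0.42),
    ],
    "spk_b": [
        Checkpoint(100_000, 0.42, 0.46),
        Checkpoint(200_000, 0.35, 0.40),
        Checkpoint(300_000, 0.30, 0.36),
        Checkpoint(400_000, 0.27, 0.34),
    ],
}

result = check_speakers(example)
print(result)
```

With these made-up numbers, `spk_a` is flagged with its best checkpoint at step 300000, and `spk_b` is not flagged because its held-out error is still decreasing. A speaker flagged this way may be better served by an earlier checkpoint, which fits the point that the right number of steps depends on the data.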