tuanh123789 / AdaSpeech

An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice"

loss rise after 6k #12

Open tuntun990606 opened 1 year ago

tuntun990606 commented 1 year ago

Hello. When I train on the AIShell dataset, the synthesis is very poor: the model cannot read a whole sentence and only utters a few syllables. (I previously trained on the Vivos Vietnamese training set and got a fairly good result. Although I don't understand Vietnamese very well, the speech was at least fluent, so I think the code should be okay.) I noticed that your total loss began to rise after 8K steps, and my model showed a similar problem at 6K steps. In addition, my phone_level_loss keeps oscillating. Do you know the probable cause? [image] [image]

tuanh123789 commented 1 year ago

In the train config, "phoneme_level_encoder_step=60000". This means that before 60000 steps, no gradient is applied to "phoneme_level_predictor" and I do not add "phone_level_loss" to "total_loss". After 60000 steps, the gradient is applied to "phoneme_level_predictor" and "phone_level_loss" is added to "total_loss" (hence the total loss rises from step 60000).
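The gating described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the repository's actual code: the function `total_loss`, its arguments, and the choice of `>=` at the boundary are assumptions; only the config name `phoneme_level_encoder_step` and the value 60000 come from the comment. It covers only the loss summation, not how gradients are blocked for the predictor.

```python
# Hypothetical sketch: only the config name and value are taken from the thread.
PHONEME_LEVEL_ENCODER_STEP = 60000  # "phoneme_level_encoder_step" from the train config


def total_loss(step, base_losses, phone_level_loss,
               threshold=PHONEME_LEVEL_ENCODER_STEP):
    """Sum the base losses; include phone_level_loss only from `threshold` on.

    Whether the real code switches at `>=` or `>` is an assumption here.
    """
    total = sum(base_losses)
    if step >= threshold:
        # From this step on, the extra term makes the reported total jump up.
        total = total + phone_level_loss
    return total


# Same component values, different steps: the total rises at the threshold
# even though no individual loss got worse.
before = total_loss(59999, [1.0, 0.5], 0.25)
after = total_loss(60000, [1.0, 0.5], 0.25)
print(before, after)
```

With identical component losses, the total is larger once the step reaches the threshold, so a jump in the total loss curve at that point does not by itself indicate that training has degraded.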

tuanh123789 commented 1 year ago

You can check the loss function here: [image]

tuntun990606 commented 1 year ago

Thank you, I will check it.