jaywalnut310 / glow-tts

A Generative Flow for Text-to-Speech via Monotonic Alignment Search
MIT License

Gradient overflow and negative loss #15

Closed Charlottecuc closed 4 years ago

Charlottecuc commented 4 years ago

Hi. I tried to train the model on a 24-hour Mandarin dataset and ran into the following gradient overflow and negative loss problems. [Screenshots]

I only changed the "data" part of the config file and modified the "text" folder (cmudict.py and symbols.py, by adding some Mandarin phonemes [Screenshot]): [Screenshot]

Could you give me any suggestions? Thank you!

Charlottecuc commented 4 years ago

Besides, I successfully trained the model on the LJ dataset, and the synthesized voice is quite good. I also noticed that during training, l_mle_normal is negative: [Screenshots]

My question is: why does the loss become negative?

Thank you!

LiNaihan commented 4 years ago

I think the negative loss is normal, since the loss is a log-scale quantity, which can be negative.

However, I ran into another issue: my grad_norm is larger than yours (it starts at ~50, decreases to ~5, then increases to ~10), while your grad norm seems to stay at ~1.

jaywalnut310 commented 4 years ago

@Charlottecuc Yes, the negative loss is normal, as it stands for the negative log-likelihood of the data. If you encounter a NaN loss, I have no perfect solution, but I can give you a suggestion.
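To illustrate why a negative log-likelihood can drop below zero: for continuous data the likelihood is a density, and a density can exceed 1, which makes its negative log negative. The snippet below is only a one-dimensional Gaussian sketch, not the model's actual loss; `gaussian_nll` is a hypothetical helper written for this example.

```python
import math

def gaussian_nll(x, mean, std):
    """Negative log-density of x under a 1-D normal distribution."""
    var = std * std
    return 0.5 * math.log(2.0 * math.pi * var) + (x - mean) ** 2 / (2.0 * var)

# Wide distribution: the peak density is below 1, so the NLL is positive.
wide = gaussian_nll(0.0, mean=0.0, std=1.0)

# Narrow distribution: the peak density is above 1, so the NLL is negative.
narrow = gaussian_nll(0.0, mean=0.0, std=0.1)

print(f"std=1.0 -> NLL = {wide:.4f}")
print(f"std=0.1 -> NLL = {narrow:.4f}")
```

As the model fits the data more tightly, the density it assigns grows, so a loss that keeps decreasing past zero is consistent with training going well.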

You can set the configs add_noise and fp16_run to false. Switching these configs on or off may help to increase numerical stability.
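A minimal sketch of applying that suggestion programmatically. The config layout below is a made-up stand-in (which section each key lives in is an assumption, so the hypothetical helper `set_flags` searches every nesting level instead of relying on a fixed path); in practice you would load your own JSON config file and write it back.

```python
import json

def set_flags(node, overrides):
    """Set every key named in `overrides`, at any nesting depth.

    Returns the number of keys that were set.
    """
    count = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key in overrides:
                node[key] = overrides[key]
                count += 1
            else:
                count += set_flags(value, overrides)
    elif isinstance(node, list):
        for item in node:
            count += set_flags(item, overrides)
    return count

# Hypothetical config for illustration only; real files have more keys.
config = {
    "train": {"fp16_run": True},
    "data": {"add_noise": True},
}

n_set = set_flags(config, {"add_noise": False, "fp16_run": False})
print(f"keys set: {n_set}")
print(json.dumps(config, indent=2))
```

Checking the returned count guards against a silent no-op when a key name is misspelled or missing from the file.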

Charlottecuc commented 4 years ago

@jaywalnut310 Hi. I changed add_noise to false and the model seems more stable than before. Also, after changing "add_noise" to false, do I need to change the value of "noise_scale" at inference time as well? Thanks!

jaywalnut310 commented 4 years ago

add_noise controls whether noise is added to the input data. It is not related to noise_scale. At inference time, you can vary noise_scale between 0 and 1 to find the best sample quality.
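For intuition about what noise_scale does at inference, here is a rough sketch that assumes the usual flow-model convention of scaling the sampled prior noise, i.e. z = mean + std * eps * noise_scale. The function `sample_latent` and its inputs are hypothetical and are not taken from the repository's code.

```python
import math
import random

def sample_latent(means, log_stds, noise_scale, rng):
    """Draw one latent value per position, with the prior noise scaled by noise_scale."""
    return [
        m + math.exp(ls) * rng.gauss(0.0, 1.0) * noise_scale
        for m, ls in zip(means, log_stds)
    ]

rng = random.Random(0)        # fixed seed so runs are repeatable
means = [0.5, -1.0, 2.0]      # made-up prior means
log_stds = [0.0, -0.5, 0.3]   # made-up prior log standard deviations

for scale in (0.0, 0.333, 1.0):
    z = sample_latent(means, log_stds, scale, rng)
    print(scale, [round(v, 3) for v in z])
```

Under this assumption, a noise_scale of 0 returns the prior means exactly, and larger values spread the samples further from them, which is why it is worth trying a few values and listening to the results.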