keithito / tacotron

A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
MIT License
2.94k stars, 965 forks

NaN during evaluation #301

Closed: Imtinan1996 closed this issue 4 years ago

Imtinan1996 commented 4 years ago

I have been training the network and evaluating each checkpoint closely: every 1000 iterations, when a checkpoint is generated, I serve the model and check the output. Up to 11000 steps of training everything was fine, but at 12000 steps the output suddenly started returning NaNs. Has anyone encountered this problem? To be specific, I am replicating this for Arabic. Additionally, here are my alignment images:

[Images: step-11000-align, step-12000-align]

Please help me, I am stuck at this point and don't know why I am getting NaNs.

The NaNs were observed by running the following:

print(phonetized_text, seq)
print(self.session.run(self.model.linear_outputs[0], feed_dict=feed_dict))

OUTPUT:

{E S L AE M} {AI L Y K M} {Y AE} {SS D Y Q Y}
[42, 48, 29, 14, 32, 11, 35, 29, 28, 41, 32, 11, 28, 14, 11, 50, 38, 28, 43, 28, 1]
[[nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 ...
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]]
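A minimal sketch of how this check could be automated across checkpoints and test inputs. It assumes the array returned by `session.run` has been converted to nested Python lists (for example with NumPy's `.tolist()`); the helper names `nan_fraction` and `is_usable` are hypothetical and not part of the repository.

```python
import math


def nan_fraction(frames):
    """Return the fraction of values in a 2-D nested list that are NaN."""
    total = 0
    bad = 0
    for frame in frames:
        for value in frame:
            total += 1
            if math.isnan(value):
                bad += 1
    # An empty output has no values at all, so report 0.0 rather than divide by zero.
    return bad / total if total else 0.0


def is_usable(frames):
    """Treat an output as usable only if it has at least one frame and no NaNs."""
    return len(frames) > 0 and nan_fraction(frames) == 0.0


# Illustrative stand-ins for model outputs (not real data from this issue).
good = [[0.1, 0.2], [0.3, 0.4]]
bad = [[float("nan"), 0.2], [0.3, float("nan")]]

print(nan_fraction(good), is_usable(good))
print(nan_fraction(bad), is_usable(bad))
print(is_usable([]))
```

If the output is kept as a NumPy array, `np.isnan(output).any()` gives the same yes/no answer in one call.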

UPDATE:

After further testing I observed the following:

I am currently at 110000 steps and still don't know why this is happening. Hopefully this changes with further training.

Imtinan1996 commented 4 years ago

UPDATE: This problem was not resolved even after further training, up to 400k steps. For some inputs the generated WAVs were fine; for others they had length 0.

So I went back to the source and cleaned up my data. I removed the multiple speakers and kept data from only one speaker. Some audio files had the person singing songs; those were eliminated as well. Silences were also trimmed. Now, after 100k steps of retraining, the NaN issue seems to be resolved. I am now getting echoes at the ends of audio files, but I guess that is a separate problem, so I'm closing this issue for now.
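The post does not say how the silences were trimmed. As one possible illustration, here is a minimal sketch that drops leading and trailing low-amplitude samples from a clip. The function name `trim_silence`, the `threshold` value of 0.01, and the sample data are all assumptions made for this example, and the audio is assumed to be a list of floats in the range [-1, 1].

```python
def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples whose absolute value is below threshold."""
    start = 0
    end = len(samples)
    # Advance past quiet samples at the beginning of the clip.
    while start < end and abs(samples[start]) < threshold:
        start += 1
    # Step back past quiet samples at the end of the clip.
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    # Quiet samples in the middle of the clip are kept on purpose.
    return samples[start:end]


# Illustrative clip: silence, speech with a short internal pause, silence.
clip = [0.0, 0.001, 0.2, -0.5, 0.0, 0.3, 0.002, 0.0]
trimmed = trim_silence(clip)
print(trimmed)
```

A clip that is silent throughout comes back empty, which makes it easy to spot and exclude from the training set. In practice an energy-based tool such as `librosa.effects.trim` is probably a more robust choice than a fixed per-sample threshold.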