duration not predicted correctly

as-ideas / TransformerTTS

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.

I'm training on a custom dataset. The problem is that the mels generated by the trained forward model do not match the ground truth mels in length. Because of this, WaveRNN cannot be trained: some samples get corrupted during the window calculation.

One corrupted sample (mel, label pair) looks like this:

mel shape = (80, 311)
sig_offset = 79200
label shape (ground truth signal) = (77626,)
label window shape = (0,)

Note that sig_offset exceeds the length of the signal, so the label window comes out empty. Have I made a mistake somewhere, or do you have any suggestions?
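To illustrate what I think is happening, here is a minimal sketch in plain Python. It is not the WaveRNN or TransformerTTS code. The function names, the hop length of 275, the window length of 1375 and the "one frame per hop plus one" rule are all assumptions made only for this example. It shows that slicing past the end of the signal silently returns an empty window, and how such pairs could be detected before training.

```python
def label_window(signal, sig_offset, window_len):
    """Slice a label window from a 1-D signal (hypothetical helper).

    Python slicing does not raise when the start index is past the end;
    it just returns an empty sequence, which is how a (0,) label appears.
    """
    return signal[sig_offset:sig_offset + window_len]


def frames_match(n_mel_frames, n_samples, hop_length, tolerance=2):
    """Check whether a mel length is plausible for a given signal length.

    Assumption: one frame per hop plus one extra frame, which is a common
    convention for a centered STFT; adjust this to your own preprocessing.
    """
    expected_frames = n_samples // hop_length + 1
    return abs(n_mel_frames - expected_frames) <= tolerance


# Values from the sample above; hop_length and window_len are assumed.
signal = [0.0] * 77626
window = label_window(signal, sig_offset=79200, window_len=1375)
print(len(window))                       # empty window
print(frames_match(311, 77626, 275))     # predicted mel is too long
```

Under these assumptions, a check like `frames_match` could be used to drop or trim mismatched pairs before they reach WaveRNN, although that only hides the mismatch rather than fixing the duration prediction.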

Branch: master Commit: e4ded5b

duration not predicted correctly #95