Why is the synthesized speech silent after 13,000 training iterations?

keithito / tacotron

A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

MIT License


Open Text2-m opened 5 years ago

keithito commented 5 years ago

It's hard to say without more information, but 13k iterations is probably not enough.

What are you using for training data?
What does your loss curve look like?
Can you post the latest alignment image? This should be dumped to the training directory.
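A minimal sketch for locating the most recent alignment image, assuming the training directory contains files named like step-15000-align.png. The .png extension and the helper name latest_alignment_image are assumptions for illustration, not part of the project. Step numbers are compared numerically, because a plain filename sort would rank step-9000 after step-15000.

```python
import re
from pathlib import Path

# Assumed filename pattern, e.g. "step-15000-align.png".
ALIGN_PATTERN = re.compile(r"^step-(\d+)-align\.png$")


def latest_alignment_image(log_dir):
    """Return the alignment image with the highest step number, or None."""
    best_step, best_path = -1, None
    for path in Path(log_dir).glob("step-*-align.png"):
        match = ALIGN_PATTERN.match(path.name)
        if match is None:
            continue  # skip files that only loosely resemble the pattern
        step = int(match.group(1))
        if step > best_step:
            best_step, best_path = step, path
    return best_path
```

Calling latest_alignment_image with the path of the training directory returns a pathlib.Path to attach to the issue, or None if no alignment image has been written yet.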

vinnitu commented 5 years ago

How many iterations are needed?

vinnitu commented 5 years ago

japita-se commented 5 years ago

Same here. I have trained for 15k steps on the Mozilla dataset. The attention plot looks good, but the synthesis produces noise.

(image: step-15000-align)