keithito / tacotron

A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
MIT License

Poor alignment after 37000 steps #123

Open duvtedudug opened 6 years ago

duvtedudug commented 6 years ago

[Image: step-37000-align]

This is my alignment after 37000 steps. Should the results be better by now?

It seems quite slow to me.

Any ideas would be greatly appreciated! Thanks, Duvte

quadraaa commented 6 years ago

Maybe someone could explain how the loss can be small (0.06315 in this case) while the alignment isn't learned? I am having very similar issues...

duvtedudug commented 6 years ago

@Quadraaa I am trying to learn a modified mel spectrogram (with reduced dynamic range), in an effort to get Tacotron to produce input for @r9y9's WaveNet vocoder: https://github.com/r9y9/wavenet_vocoder/issues/22

I presume that is why the loss is different.
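To illustrate why a range-reduced target could change the loss value: an L1 loss scales with the numeric range of the target. This is a minimal sketch, assuming the reduction is a simple linear rescaling (the thread does not say which transform was used) and using synthetic data in place of real mel spectrograms. The `l1_loss` helper and the `scale` factor are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def l1_loss(pred, target):
    """Mean absolute error between prediction and target."""
    return float(np.mean(np.abs(pred - target)))


# Synthetic stand-in for a normalized mel spectrogram: (frames, mel bins) in [0, 1].
target = rng.uniform(0.0, 1.0, size=(200, 80))
# A prediction with a fixed amount of noise added to the target.
pred = target + rng.normal(0.0, 0.1, size=target.shape)

# Hypothetical range reduction: squeeze both target and prediction by `scale`.
scale = 0.5
loss_full = l1_loss(pred, target)
loss_reduced = l1_loss(pred * scale, target * scale)

print("full range:", loss_full)
print("reduced range:", loss_reduced)
```

Under this assumption the reduced-range loss is `scale` times the full-range loss, so loss values from the two setups are not directly comparable.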

Any ideas on why it's not aligning would be greatly appreciated!

quadraaa commented 6 years ago

@duvtedudug Sorry, maybe I wasn't quite clear. My question was more general and not about your case: how can the model learn successfully (I assume it does, because the loss is small) without attending to the right places while decoding?
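One commonly cited explanation, offered here as an assumption rather than a diagnosis of either run: during training the decoder is teacher-forced, so each step sees the ground-truth previous frame. Adjacent spectrogram frames are similar, so a decoder can reach a low loss by leaning on that previous frame and ignoring the text. The sketch below uses a synthetic smooth signal (not real mel data) and a hypothetical "copy the previous frame" predictor to show how small the L1 loss can be with no text input at all.

```python
import numpy as np


def l1_loss(pred, target):
    """Mean absolute error between prediction and target."""
    return float(np.mean(np.abs(pred - target)))


# Synthetic, slowly varying "spectrogram": (frames, bins), values in [0, 1].
t = np.linspace(0.0, 4.0 * np.pi, 400)[:, None]
phase = np.linspace(0.0, np.pi, 80)[None, :]
frames = 0.5 + 0.5 * np.sin(t + phase)

# Teacher forcing: the input at step i is the true frame i-1.
# Hypothetical predictor that ignores any text and copies that input.
target = frames[1:]
copy_prev = frames[:-1]
loss_copy = l1_loss(copy_prev, target)

# Baseline for comparison: always predict the global mean value.
mean_pred = np.full_like(target, target.mean())
loss_mean = l1_loss(mean_pred, target)

print("copy previous frame:", loss_copy)
print("predict the mean:", loss_mean)
```

In this toy setup the copy predictor scores far better than the mean baseline without using any conditioning input, which is consistent with a low training loss alongside a poor alignment plot. At synthesis time there is no ground-truth frame to copy, so this shortcut would not carry over.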