Hi,
I am training a Tacotron 2 model with the default hparams.
With magnitude_power=2, the attention alignment is learned easily, but with magnitude_power=1 it is difficult to learn.
On the other hand, with magnitude_power=1, Tacotron2+GLA (Griffin-Lim) gives better voice quality, while with magnitude_power=2 the quality of speech synthesized by Tacotron2+GLA is poor.
Could anybody give me some advice? Any help will be appreciated.
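
For context, here is a minimal NumPy sketch of how an exponent like magnitude_power typically enters the spectrogram pipeline. It is an assumption about the mechanism, not the repository's actual code: the helper names (amp_to_db, db_to_amp, encode, decode), the floor value, and the random stand-in for |STFT| are all hypothetical.

```python
import numpy as np

def amp_to_db(x, floor=1e-5):
    # 20*log10 with a floor to avoid log(0); the floor value is illustrative
    return 20.0 * np.log10(np.maximum(floor, x))

def db_to_amp(x):
    # Inverse of amp_to_db (ignoring the floor)
    return np.power(10.0, x / 20.0)

def encode(magnitude, magnitude_power):
    # Assumed training target: |STFT| ** magnitude_power, then decibels
    return amp_to_db(magnitude ** magnitude_power)

def decode(db_spec, magnitude_power):
    # Undo the dB step, then undo the exponent to recover linear magnitude
    return db_to_amp(db_spec) ** (1.0 / magnitude_power)

rng = np.random.default_rng(0)
magnitude = rng.uniform(0.01, 1.0, size=(4, 8))  # stand-in for |STFT|

for p in (1.0, 2.0):
    db_spec = encode(magnitude, p)
    recovered = decode(db_spec, p)
    print(f"magnitude_power={p}: dB range "
          f"[{db_spec.min():.1f}, {db_spec.max():.1f}], "
          f"round trip ok: {np.allclose(recovered, magnitude)}")
```

Under these assumptions, magnitude_power=2 doubles the dB values of the target, and the inversion before Griffin-Lim has to apply the matching 1/magnitude_power exponent. If the exponents do not match, Griffin-Lim receives squared magnitudes instead of linear ones.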