kannadaraj opened this issue 4 years ago. Status: Open.
Are you selecting one style token or using a sound file to sample the style tokens?
@rafaelvalle: Sorry for the late reply. I am training on a single-speaker database, and I am using a sample file from the same dataset.
Do the attention maps look correct?
Yes, the attention maps look good, with a clear diagonal line.
Can you share mel-spectrograms, audio files and attention plots?
Thanks for sharing the repo. I have trained the model using this repo on LJ Speech. I am performing inference using only GST. During inference I use an out-of-dataset file as the style file. The synthesized voice changes considerably: the quality is decent, but it doesn't sound like the original LJ Speech speaker. How can I fix that? Can anyone please help? Thanks.
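For readers following the thread, the two ways of conditioning on style tokens discussed here (picking one token directly, or deriving token weights from a reference audio file) can be sketched roughly as below. This is a simplified, single-head illustration in plain Python and not the repo's code: `style_from_token`, `style_from_reference`, the token count, the embedding size, and the random values are all hypothetical, and the reference embedding stands in for the output of a reference encoder run on the style file.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_tokens(weights, tokens):
    """Weighted sum of token vectors, giving one style embedding."""
    dim = len(tokens[0])
    return [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]


def style_from_token(tokens, index, scale=1.0):
    """Mode 1: select a single style token by hand (one-hot weights)."""
    weights = [scale if i == index else 0.0 for i in range(len(tokens))]
    return combine_tokens(weights, tokens)


def style_from_reference(tokens, ref_embedding):
    """Mode 2: attend over the tokens using a reference embedding as the query.

    Assumption: ref_embedding has the same size as each token. A real model
    would usually project it and use multi-head attention.
    """
    dim = len(ref_embedding)
    scores = [
        sum(q * k for q, k in zip(ref_embedding, tok)) / math.sqrt(dim)
        for tok in tokens
    ]
    weights = softmax(scores)
    return combine_tokens(weights, tokens), weights


# Hypothetical token bank and reference embedding (random placeholders).
rng = random.Random(0)
NUM_TOKENS, DIM = 10, 4
tokens = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_TOKENS)]
ref_embedding = [rng.uniform(-1, 1) for _ in range(DIM)]

fixed_style = style_from_token(tokens, index=3)
sampled_style, weights = style_from_reference(tokens, ref_embedding)
print("attention weights over tokens:", [round(w, 3) for w in weights])
```

Comparing the weights produced by an in-dataset reference file with those from an out-of-dataset one may help show whether the style embedding drifts away from what the model saw during training.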