syang1993 / gst-tacotron

A TensorFlow implementation of "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"

GMM Attention #31

Open ErnstTmp opened 5 years ago

ErnstTmp commented 5 years ago

I am currently at SLT 2018 and talked to Daisy Stanton about some problems I had training another Tacotron implementation on my data. She mentioned that it is crucial to use GMM attention. I started to implement it, but then I found that there is also a models/gmm_attention_wrapper.py in GST-Tacotron. Does this GMM attention work, and if so, how can it be enabled?

Thanks and kind regards Ernst

syang1993 commented 5 years ago

@ErnstTmp Yes, I have also heard that GMM attention is important. In this repo I tried the included gmm_attention, but I didn't get better results. Maybe there are some problems in this implementation, but I had to work on other things, so I haven't modified it further. I would very much appreciate it if you could share your implementation.
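For readers following along: the GMM attention being discussed is the Graves-style mechanism, where each decoder step predicts mixture weights, a step size, and a width for K Gaussian components, and the component means only move forward, which encourages monotonic alignment. Below is a minimal NumPy sketch of one such step; the function name, the 3K parameter split, and the exp transforms are illustrative assumptions, not the actual code in models/gmm_attention_wrapper.py:

```python
import numpy as np

def gmm_attention_step(prev_kappa, raw_params, num_enc_steps):
    """One step of Graves-style GMM attention (illustrative sketch only).

    prev_kappa:    (K,) component means from the previous decoder step.
    raw_params:    (3K,) unconstrained outputs of an attention layer,
                   split into (omega_hat, delta_hat, beta_hat).
    num_enc_steps: number of encoder timesteps T to attend over.

    Returns (alignments over the T encoder steps, updated kappa).
    """
    omega_hat, delta_hat, beta_hat = np.split(raw_params, 3)
    omega = np.exp(omega_hat)               # mixture weights (unnormalized)
    kappa = prev_kappa + np.exp(delta_hat)  # means move strictly forward
    beta = np.exp(beta_hat)                 # precisions (inverse widths)

    j = np.arange(num_enc_steps)[None, :]   # encoder positions, shape (1, T)
    # phi[k, j] = omega_k * exp(-beta_k * (kappa_k - j)^2), shape (K, T)
    phi = omega[:, None] * np.exp(-beta[:, None] * (kappa[:, None] - j) ** 2)
    alignments = phi.sum(axis=0)            # (T,) attention weights
    return alignments, kappa
```

Because kappa is carried across decoder steps and only ever increases, the attention window slides monotonically over the encoder outputs, which is the property usually credited with helping on long utterances.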

By the way, could you share the collected papers of the SLT 2018?

ErnstTmp commented 5 years ago

@syang1993 If it works, I will definitely share it. I am also not sure whether it is needed. Daisy mentioned that it helps the model generalize to longer utterances if you train only on speech segments that are 5 seconds or shorter. But I had convergence problems with another Tacotron implementation, and she said the missing GMM attention could be the reason, so I have to try...

Regarding the papers: can you write me a PM at ernst.tmp [ at ] gmx [ dot ] at?

syang1993 commented 5 years ago

Yeah, I was also told that GMM attention is important for getting good results. I may try it later to see whether it works.

Thanks, I will send you an email about it.

renerocksai commented 5 years ago

@ErnstTmp I am interested: did you get GMM attention to work?

ErnstTmp commented 5 years ago

@renerocksai I stopped working on it once I found out that my biggest problem was the batch size, which was preventing me from getting good results. I think a minimum batch size of 32 is necessary to get good results from Tacotron.