Thank you for the great implementation.
I used Rayhane-mamah's Tacotron code and r9y9's WaveNet code and trained the two models separately. Next, I tried to synthesize the sentence "Thisss isrealy awhsome.", which contains typos. I want Tacotron2 to read it as "This is really awesome.", with the same robustness to spelling errors as DeepMind's original model.
・With the Tacotron model at 105k steps, it says "This is really "
・With the Tacotron model at over 165k steps, it says "Thisss isrealy "
audiosamples.zip
When I used r9y9's pretrained Tacotron model, it also said "This is really ***".
Why is this happening? What could be related? Any ideas would be appreciated.
*Training situation
I used the default Tacotron code and trained from scratch on LJSpeech. I used almost the default hparams, but adjusted some audio properties to work with r9y9's WaveNet.
I also set use_lws = True and symmetric_mels = False.
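For reference, the two settings named above can be written as a small override sketch. Only `use_lws` and `symmetric_mels` come from this post; the helper `to_override_string` is hypothetical, and it is an assumption that the training script accepts a comma-separated `name=value` string. The adjusted audio properties are not listed here because they were not recorded.

```python
# Minimal sketch of the non-default settings mentioned in this post.
# Only these two values are stated; the other adjusted audio properties are unknown.
hparam_overrides = {
    "use_lws": True,          # written as "use_lws = T" in the post
    "symmetric_mels": False,  # written as "symmetric_mels = F" in the post
}


def to_override_string(overrides):
    """Hypothetical helper: format overrides as a comma-separated name=value string."""
    return ",".join(f"{name}={value}" for name, value in sorted(overrides.items()))


if __name__ == "__main__":
    print(to_override_string(hparam_overrides))
```

Sharing the full list of changed values in this form would make it easier for others to reproduce the setup.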