The reading is different for the sample including spelling errors.

Thank you for the great implementation. I used Rayhane-mamah's Tacotron code and r9y9's WaveNet code, and trained the two models separately. Next, I tried to synthesize the sentence "Thisss isrealy awhsome.", which contains typos. I want Tacotron 2 to read it as "This is really awesome.", that is, to be robust to spelling errors like DeepMind's original model.

・With the 105k-step Tacotron model, it says "This is really ".
・With Tacotron models trained for more than 165k steps, it says "Thisss isrealy ".

audiosamples.zip

When I used r9y9's pretrained Tacotron model, it also said "This is really ***".

Why is this happening? What could be related to it? Any ideas would be appreciated.

*Training setup

I used the default Tacotron code and trained from scratch on LJSpeech. I used almost the default hparams, but adjusted some audio parameters so the output works with r9y9's WaveNet. I also set use_lws = True and symmetric_mels = False.
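The two overrides named above can be written down as a small sketch. This is only an illustration: `assumed_defaults` and `apply_overrides` are hypothetical names, the default values are placeholders rather than values copied from the repository's hparams.py, and the adjusted audio parameters are left out because they are not listed here.

```python
# Hypothetical stand-in for the repository defaults.
# These values are placeholders, not copied from hparams.py.
assumed_defaults = {"use_lws": False, "symmetric_mels": True}

# The two overrides stated in this post.
overrides = {"use_lws": True, "symmetric_mels": False}


def apply_overrides(defaults, changes):
    """Return a copy of defaults with changes applied; reject unknown keys."""
    unknown = set(changes) - set(defaults)
    if unknown:
        raise KeyError("unknown hparams: " + ", ".join(sorted(unknown)))
    merged = dict(defaults)
    merged.update(changes)
    return merged


hparams = apply_overrides(assumed_defaults, overrides)
print(hparams)
```

Rejecting unknown keys is a design choice for the sketch: a misspelled hparam name fails loudly instead of being silently ignored.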

Rayhane-mamah / Tacotron-2, issue #437