Rayhane-mamah / Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation
MIT License
2.27k stars, 905 forks

Noisy output after training the latest code. #332

Open vinayonchip opened 5 years ago

vinayonchip commented 5 years ago

Hi! I trained with the following options:

python train.py --model='Tacotron' --tacotron_train_steps=300000

Dataset used: LJSpeech-1.1. I modified the hparams and set tacotron_batch_size=16 to fit my GPU (Nvidia TITAN Xp).

When I run inference with

python synthesize.py --model='Tacotron' --mode='eval'

the output wavs contain a lot of noise, although the attention plots look correct. What could be causing this, and how can I fix it? I think this issue is similar to https://github.com/Rayhane-mamah/Tacotron-2/issues/331

[Attached images: alignment-batch_15_sentence_0, mel-batch_15_sentence_0]

wanshun123 commented 5 years ago

Is your output similar to https://github.com/Rayhane-mamah/Tacotron-2/files/2872605/results.zip? I am also getting noisy output (https://github.com/Rayhane-mamah/Tacotron-2/issues/329). The speech itself comes out very well apart from the noise, if only that could be separated out.

vinayonchip commented 5 years ago

Hi @wanshun123, the output is robotic, as you said. Attached: wav-batch_15_sentence_0-linear.wav.zip