Rayhane-mamah / Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation
MIT License

Effects on WaveNet predicted wavs #57

begeekmyfriend closed 6 years ago

begeekmyfriend commented 6 years ago

The loss seems to decrease unsteadily during WaveNet training. Is that all right for the results, or should I wait for more steps? The predicted wavs under logs-WaveNet/wavs sound OK, but the ones under logs-WaveNet/eval-dir/wavs sound like a mess... [image]

Yeongtae commented 6 years ago

Has anyone found a solution? Even if you do not release your code, please share your results and any ideas on how to fix it.

HyperGD1994 commented 6 years ago

@UESTCgan What is your problem? tqdm at 0%? I don't use the whole code, only part of it, so I did not encounter this problem, but I don't think this bug will be difficult to fix. You can try debugging it.

As for the true evaluation problem, has @azraelkuan fixed it? Raw input is very hard to train, and the mu-law input seems to have a lot of bugs. However, I modified ibab's code with local conditioning and a bigger net, and I finally got wonderful results with mu-law input. I suggest you try that.

azraelkuan commented 6 years ago

Indeed, the answer has already been given: https://github.com/Rayhane-mamah/Tacotron-2/files/2145382/models.tar.gz. Training a MoL (mixture of logistics) model takes a long time, about two weeks, so I haven't tested that, but in my tests mu-law works well. Below is the plot for mu-law (eval step). [image]

Yeongtae commented 6 years ago

@azraelkuan Thank you for your answer.

WendongGan commented 6 years ago

@HyperGD1994 Thanks for your help! With my current 20-hour Chinese dataset, I have trained for 240K steps. The predictions produced during training sound good, but synthesis produces only noise. I will try the latest code next.

Yeongtae commented 6 years ago

@azraelkuan Did you test on the LJSpeech dataset? If so, how many iterations did you train for to get the above result?

In my case, using 'mulaw' produces bad, noisy wav files. [image]

azraelkuan commented 6 years ago

@Yeongtae At about 40k steps I can already generate good wavs; the plot above is at about 300k. Attached is the result at 44k: 44000.zip

Yeongtae commented 6 years ago

I made a mistake: I didn't change the value of quantize_channels in hparams.py. [image]
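
For anyone hitting the same problem, here is a minimal sketch of why quantize_channels matters for mu-law input. It is not the repository's code: the function names `mulaw_quantize` and `inv_mulaw_quantize` are hypothetical, and the value 256 for mu-law (8-bit) is an assumption. The last lines decode with a mismatched channel count (65536, assumed here to be a leftover raw-input setting), which pushes every sample towards -1 and gives an unusable waveform.

```python
import numpy as np

def mulaw_quantize(x, quantize_channels=256):
    """Compress a waveform in [-1, 1] with mu-law, then map it to integer class labels."""
    mu = quantize_channels - 1
    x = np.clip(np.asarray(x, dtype=np.float64), -1.0, 1.0)
    # Mu-law companding: finer resolution for small amplitudes.
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    # Map [-1, 1] to the label range {0, ..., mu}.
    return np.floor((y + 1.0) / 2.0 * mu + 0.5).astype(np.int64)

def inv_mulaw_quantize(labels, quantize_channels=256):
    """Map integer class labels back to a waveform in [-1, 1]."""
    mu = quantize_channels - 1
    y = 2.0 * np.asarray(labels, dtype=np.float64) / mu - 1.0
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

# Round trip with a consistent setting (assumed 256 channels for mu-law).
wave = np.sin(np.linspace(0.0, 2.0 * np.pi, 1000))
labels = mulaw_quantize(wave, quantize_channels=256)
restored = inv_mulaw_quantize(labels, quantize_channels=256)

# Decoding the same labels with a mismatched setting gives a broken signal.
broken = inv_mulaw_quantize(labels, quantize_channels=65536)
```

With matching settings, `restored` stays close to `wave`; with the mismatch, `broken` sits near -1 for every sample, so it is worth checking that quantize_channels agrees with the chosen input type before training.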

I'm now testing with the parameters in the image above, and I can see that the loss is decreasing. [image]

@azraelkuan Thank you for your advice.

Rayhane-mamah commented 6 years ago

Thank you all for your valuable contributions. This issue has been fixed with the latest commit. If any further problems appear, feel free to open new issues :)