barronalex / Tacotron

Implementation of Google's Tacotron in TensorFlow

Test won't generate speech used in training while training does generate correctly. #9

Closed onyedikilo closed 7 years ago

onyedikilo commented 7 years ago

I tried to generate a wav file with test.py after training for a day, but the result was mostly noise with no speech.

I used the same words as in the training data. It correctly renders speech while training, but using the same phrase for test.py results in just noise.

Does anyone have any idea what might be causing this?

lifeiteng commented 7 years ago

What's the training dataset size?

onyedikilo commented 7 years ago

@lifeiteng, doesn't test.py just restore the network, feed it the input phrases from a text file, and generate an output? train.py does the same thing, yet the outputs don't match even though the inputs (the input text) are the same. So what does this have to do with the dataset size? Shouldn't the outputs be the same, since the network and the inputs are the same?

Btw, the data I use has 1900 phrases, around 1.5 hours of speech.

ljun4121 commented 7 years ago

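In evaluation, unlike training, the output of the decoder is fed back in as the decoder input at the next time step, so the result can be different. During training the decoder typically sees the ground-truth frame from the previous step instead (teacher forcing), so its own mistakes never accumulate; at test time they do.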
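
To make the difference between the two decoding modes concrete, here is a minimal toy sketch in plain Python. It is not code from this repository: `decoder_step`, `decode_teacher_forced`, and `decode_free_running` are hypothetical names, and a single float stands in for a spectrogram frame. The only point is that the same slightly imperfect "network" gives different outputs depending on what is fed back in.

```python
def decoder_step(prev_frame):
    """Hypothetical one-step decoder (an assumption for illustration).

    The "true" rule in this toy is next = prev + 1.0; the 1.05 factor
    stands in for a trained model that is slightly wrong.
    """
    return 1.05 * prev_frame + 1.0


def decode_teacher_forced(targets, go_frame=0.0):
    """Training-style decoding: each step sees the ground-truth previous frame."""
    outputs = []
    prev = go_frame
    for target in targets:
        outputs.append(decoder_step(prev))
        prev = target  # feed the ground truth, not our own prediction
    return outputs


def decode_free_running(num_steps, go_frame=0.0):
    """Evaluation-style decoding: each step sees the decoder's own last output."""
    outputs = []
    prev = go_frame
    for _ in range(num_steps):
        out = decoder_step(prev)
        outputs.append(out)
        prev = out  # feed the prediction back in, so errors can compound
    return outputs


def max_abs_error(outputs, targets):
    """Largest absolute deviation from the target sequence."""
    return max(abs(o - t) for o, t in zip(outputs, targets))


# Toy "utterance": target frames 1.0, 2.0, ..., 20.0
targets = [float(t) for t in range(1, 21)]

tf_out = decode_teacher_forced(targets)
fr_out = decode_free_running(len(targets))

tf_err = max_abs_error(tf_out, targets)
fr_err = max_abs_error(fr_out, targets)

print("teacher-forced max error:", tf_err)
print("free-running max error:  ", fr_err)
```

In this toy the first frame is identical in both modes, but the free-running error grows with each step while the teacher-forced error stays small. That would be consistent with audio that sounds fine in training-time samples but degrades into noise in test.py, although the actual cause in this thread may differ.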