cmusphinx / g2p-seq2seq

G2P with Tensorflow

Last part of the pronunciations skipped by decoder #141

Closed · mausamsion closed this issue 6 years ago

mausamsion commented 6 years ago

Hi, I am doing G2P conversion for Japanese words. Here are the details of the model:

  --size = 64
  --num_layers = 3
  training set = 26,000 words
  test set = 2,500 words

After 2M training steps, the decoder behaves as shown below:

  1. True:    sh o q p i N g u ch a N n e r u
     Predict: sh o q p i N g u ch a r u
  2. True:    g u q d o n e s u d i f a r e N s u
     Predict: g u q d o n e s u d i f u m e
  3. True:    k u r o m a t i q k u s u k e: r u
     Predict: k u r o m a t i q k u s u k e:

These are the true and predicted pronunciations. As you can see, the decoder keeps skipping (or is "not in the mood" to decode) the last part of the word. Around 3% of the words in the test set show this problem.
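
A quick way to count such cases (a minimal sketch: the file names are placeholders, the files are assumed to hold one space-separated phoneme sequence per line and be aligned line by line, and the length-gap test is only a heuristic):

```python
# Sketch: count test items whose prediction looks truncated relative
# to the reference. "test_true.txt" and "test_pred.txt" are
# placeholder file names.

def looks_truncated(true_phones, pred_phones, min_gap=2):
    # Heuristic: the prediction is noticeably shorter than the
    # reference, as in the three examples above.
    return len(true_phones) - len(pred_phones) >= min_gap

with open("test_true.txt") as f_true, open("test_pred.txt") as f_pred:
    pairs = [(t.split(), p.split()) for t, p in zip(f_true, f_pred)]

n_short = sum(looks_truncated(t, p) for t, p in pairs)
print("truncated-looking: %d / %d (%.1f%%)"
      % (n_short, len(pairs), 100.0 * n_short / len(pairs)))
```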

Any idea why this is happening?

nurtas-m commented 6 years ago

I can't give you a detailed explanation of why the neural network outputs this kind of result (it is a "black-box" problem). Maybe the training data has examples with phonemes skipped at the end? Please check it. You can also try increasing the size of the model.
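
For example, a quick scan like this sketch can flag suspicious entries (the file name, the "word phoneme phoneme ..." line format, and the length-ratio threshold are all assumptions; tune them for your data):

```python
# Sketch: flag training entries whose pronunciation looks too short
# for the word, since such entries would teach the model to drop
# trailing phonemes. "train.dic" is a placeholder file name; the
# assumed format is one "word phoneme phoneme ..." entry per line.

with open("train.dic") as f:
    for line_no, line in enumerate(f, 1):
        parts = line.split()
        if len(parts) < 2:
            continue
        word, phones = parts[0], parts[1:]
        # Heuristic: far fewer phonemes than graphemes suggests the
        # end of the pronunciation is missing; adjust the ratio for
        # your transcription scheme.
        if len(phones) < 0.5 * len(word):
            print("line %d looks short: %s -> %s"
                  % (line_no, word, " ".join(phones)))
```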

mausamsion commented 6 years ago

Hey, yes, I already cross-checked the input data, and it seems to be correct for every pair of word and pronunciation. Thanks for the suggestion. Regarding this, I was wondering what happens when we specify --size 64 --num_layers 3:

Q.1 Does the network then have 3 hidden layers, or 3 layers in total (including the input and output layers)?

Q.2 Also, in the case of multiple hidden layers, will all of them have the same number of hidden units (specified by --size), or is there any way to specify the number of hidden units for every hidden layer separately?

nurtas-m commented 6 years ago

Q.1 Does the network then have 3 hidden layers, or 3 layers in total (including the input and output layers)?

The network has 3 hidden layers, plus the input and output layers.

Q.2 Also, in the case of multiple hidden layers, will all of them have the same number of hidden units (specified by --size), or is there any way to specify the number of hidden units for every hidden layer separately?

Yes, all of them have the same number of hidden units; there is no option to set a different size per layer.
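
For reference, here is roughly how such a stack is built in TensorFlow 1.x seq2seq code (a sketch only; the exact cell class this project uses is an assumption, but the point is that a single size value is reused for every layer, so per-layer widths would require editing the model code by hand):

```python
import tensorflow as tf

size = 64        # --size: hidden units, shared by every stacked layer
num_layers = 3   # --num_layers: number of stacked hidden layers

# One cell per hidden layer, all built with the same `size`. Giving
# layers different widths would mean changing this list in the code;
# there is no per-layer flag.
cells = [tf.contrib.rnn.GRUCell(size) for _ in range(num_layers)]
stacked_cell = tf.contrib.rnn.MultiRNNCell(cells)
```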

mausamsion commented 6 years ago

Thank you!