nshmyrev closed this issue 8 years ago
It doesn't work with 2 layers either. Log output is in /home/ubuntu/g2p-seq2seq/models/nick-cmu/nohup.out
The code in https://github.com/cmusphinx/g2p-seq2seq/commit/cd51a325cad7c92fa62c7e0f988de605525b944d is bad and needs review.
We need to figure out why the model does not function without target phoneme ids.
Training on cmudict with 4 layers of size 64:
WER : 0.42332464346 Accuracy : 0.57667535654
With 2 layers of size 512:
WER : 0.394268272003
Independent test:
WER : 0.386580759831 Accuracy : 0.613419240169
The difference between the independent test and the test run inside the training procedure comes from the fact that the latter implicitly feeds the reference phonemes into the LSTM history in the current implementation. I was expecting the independent test to perform worse...
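To make the distinction concrete, here is a minimal sketch, not the actual g2p-seq2seq code; decoder_step, go_id and eos_id are hypothetical stand-ins for one LSTM decoder step and the start/end symbols. The in-training test puts the reference phoneme into the history at every step, while an independent test has to feed back the model's own previous prediction:

# Sketch only, assuming a hypothetical decoder_step(prev_id, state) that
# returns (logits over phoneme ids, new state) for a single LSTM step.

def decode_with_references(decoder_step, state, go_id, reference_ids):
    # Test inside training: the reference phoneme goes into the history,
    # so every step is conditioned on the correct previous phoneme.
    outputs, prev = [], go_id
    for ref in reference_ids:
        logits, state = decoder_step(prev, state)
        outputs.append(max(range(len(logits)), key=logits.__getitem__))
        prev = ref
    return outputs

def decode_free_running(decoder_step, state, go_id, eos_id, max_len=30):
    # Independent test: the model's own prediction goes into the history,
    # so an early mistake can propagate through the rest of the sequence.
    outputs, prev = [], go_id
    for _ in range(max_len):
        logits, state = decoder_step(prev, state)
        pred = max(range(len(logits)), key=logits.__getitem__)
        if pred == eos_id:
            break
        outputs.append(pred)
        prev = pred
    return outputs

With the references in the history the in-training test would normally be expected to score at least as well as the independent one, which is why the numbers above look surprising.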
Testing WER right after training is currently impossible due to a bug in TensorFlow.
If you want to achieve higher accuracy, you need to change the default value of learning_rate_decay_factor from 0.8 to a higher value (for example, 0.95). But it will take more time to train.
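For example, the decay factor can be passed on the command line like this (reusing the paths and flags from the runs above):

python /home/ubuntu/g2p-seq2seq/g2p_seq2seq/g2p.py --train cmudict.dic.train --num_layers 2 --size 512 --model model --learning_rate_decay_factor 0.95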
It should be the default then. There is no sense in having default settings that do not provide the best accuracy.
With 0.95 it is still not good:
WER : 0.375360352722 Accuracy : 0.624639647278
For comparison, Phonetisaurus: 36.04% WER
PRONALSYL test (which you incorrectly write in uppercase):
nohup python /home/ubuntu/g2p-seq2seq/g2p_seq2seq/g2p.py --train cmudict.dic.train --test cmudict.dic.test --num_layers 2 --size 512 --model model --max_steps 0 --learning_rate_decay_factor 0.99
Result:
WER : 0.305833333333
Accuracy : 0.694166666667
With valid = test the accuracy is the same, 30%. There was a regression.
This was somehow magically solved; in new versions the accuracy matches the expectation.
I run:
python /home/ubuntu/g2p-seq2seq/g2p_seq2seq/g2p.py --train cmudict.dict --num_layers 4 --size 64 --model model
I get: