yh1008 / speech-to-text

mixlingual speech recognition system; hybrid (GMM+NNet) model; Kaldi + Keras
http://llcao.net/cu-deeplearning17/project.html

WARNING: tree has pdf with no stats #11

Open yh1008 opened 7 years ago

yh1008 commented 7 years ago

During triphone training, I get a long run of warnings complaining that Tree has pdf-id x with no stats, like the following:

WARNING (gmm-init-model[5.0.61~1-37b53]:InitAmGmm():gmm-init-model.cc:55) Tree has pdf-id 3 with no stats; corresponding phone list: 16 17 18 19 20 
...
WARNING (gmm-init-model[5.0.61~1-37b53]:InitAmGmm():gmm-init-model.cc:55) Tree has pdf-id 442 with no stats; corresponding phone list: 1939 1940 1941 1942 
** The warnings above about 'no stats' generally mean you have phones **
** (or groups of phones) in your phone set that had no corresponding data. **
** You should probably figure out whether something went wrong, **
** or whether your data just doesn't happen to have examples of those **
** phones. **
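
To follow up on that advice, it can help to collect the affected pdf-ids and phone ids in one place. The sketch below is not part of Kaldi: `parse_no_stats` and `load_phone_table` are hypothetical helper names, and it assumes the log lines are unwrapped (one warning per line) and that the phone symbol table uses the usual `<symbol> <integer-id>` layout.

```python
import re

# Matches the warning text shown above, assuming each warning sits on one line.
NO_STATS = re.compile(
    r"Tree has pdf-id (\d+) with no stats; corresponding phone list: ([\d ]+)"
)


def parse_no_stats(log_text):
    """Return {pdf_id: [phone_id, ...]} for every 'no stats' warning found."""
    result = {}
    for match in NO_STATS.finditer(log_text):
        pdf_id = int(match.group(1))
        result[pdf_id] = [int(p) for p in match.group(2).split()]
    return result


def load_phone_table(phones_txt):
    """Parse symbol-table text with one '<symbol> <integer-id>' pair per line."""
    table = {}
    for line in phones_txt.splitlines():
        parts = line.split()
        if len(parts) == 2:
            table[int(parts[1])] = parts[0]
    return table


# Example use (paths are assumptions based on the directories named in this thread):
#   warnings = parse_no_stats(open("exp/tri1/log/init_model.log").read())
#   phones = load_phone_table(open("data/lang/phones.txt").read())
#   for pdf_id, ids in sorted(warnings.items()):
#       print(pdf_id, [phones.get(i, "?") for i in ids])
```

Seeing the phone symbols rather than the integer ids makes it easier to judge whether those phones are simply absent from the training transcripts.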

In total, there are:

2777 warnings in exp/tri1/log/acc.*.*.log
6578 warnings in exp/tri1/log/align.*.*.log
211 warnings in exp/tri1/log/questions.log
7328 warnings in exp/tri1/log/update.*.log
234 warnings in exp/tri1/log/init_model.log
1 warnings in exp/tri1/log/build_tree.log

yh1008 commented 7 years ago

My professor said this could be caused by misalignment. I ran steps/train_deltas.sh with the following command, taken from Kaldi for Dummies:

steps/train_deltas.sh 2000 11000 data/train data/lang exp/mono_ali exp/tri1

However, the professor said steps/train_deltas.sh should take exp/tri_ali instead of exp/mono_ali. I am not sure where I would get tri_ali from, since the triphone model has not been trained yet.
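
For reference, the positional arguments in the command above are, in order: number of tree leaves (2000), total Gaussians (11000), data directory, lang directory, alignment directory, and output directory. The alignment directory is normally the output of an earlier alignment pass (for example steps/align_si.sh) over a previously trained model. The sketch below checks that the inputs exist before training starts. `missing_train_deltas_inputs` is a hypothetical helper, and the file list is an assumption based on what the stock script appears to check, so verify it against your copy of steps/train_deltas.sh.

```python
from pathlib import Path


def missing_train_deltas_inputs(data_dir, lang_dir, ali_dir):
    """Return the expected input files that are absent, as a list of paths."""
    # Assumed list of inputs; compare with the checks in your train_deltas.sh.
    required = [
        Path(ali_dir) / "final.mdl",    # model that produced the alignments
        Path(ali_dir) / "ali.1.gz",     # first alignment archive
        Path(data_dir) / "feats.scp",   # feature index for the training data
        Path(lang_dir) / "phones.txt",  # phone symbol table
    ]
    return [str(p) for p in required if not p.is_file()]


# Example use with the directories from the command above:
#   print(missing_train_deltas_inputs("data/train", "data/lang", "exp/mono_ali"))
```

An empty list means the alignment directory at least has the expected shape. It says nothing about whether the alignments themselves are good.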