srvk / eesen

The official repository of the Eesen project
http://arxiv.org/abs/1507.08240
Apache License 2.0

Dimension mismatch when decoding #94

Closed: zhangjiulong closed this issue 7 years ago

zhangjiulong commented 7 years ago

I trained a model using my own data. Stages 1 and 2 both completed without problems, and the training process was fine too, but decoding reported errors like this:

net-output-extract --class-frame-counts=/asrDataCenter/dataCenter/modelCenter/asr/tdReArrange/v6000_8k16bit/exp/train_phn_l5_c320/label.counts --apply-log=true /asrDataCenter/dataCenter/modelCenter/asr/tdReArrange/v6000_8k16bit/exp/train_phn_l5_c320/final.nnet 'ark,s,cs:apply-cmvn --norm-vars=true --utt2spk=ark:build/trans/zhangjl10/split1/1/utt2spk scp:build/trans/zhangjl10/split1/1/cmvn.scp scp:build/trans/zhangjl10/split1/1/feats.scp ark:- | add-deltas ark:- ark:- |' ark:-
LOG (net-output-extract:SelectGpuId():cuda-device.cc:77) Manually selected to compute on CPU.
LOG (net-output-extract:DisableCaching():cuda-device.cc:731) Disabling caching of GPU memory.
latgen-faster --max-active=5000 --max-mem=50000000 --beam=17.0 --lattice-beam=8.0 --acoustic-scale=0.7 --allow-partial=true --word-symbol-table=/asrDataCenter/dataCenter/modelCenter/asr/tdReArrange/v6000_8k16bit_segError/data/lang_phn_test/words.txt /asrDataCenter/dataCenter/modelCenter/asr/tdReArrange/v6000_8k16bit_segError/data/lang_phn_test/TLG.fst ark:- 'ark:|gzip -c > build/trans/zhangjl10/eesen/decode/lat.1.gz'
LOG (net-output-extract:ClassPrior():class-prior.cc:33) Computing class-priors from : /asrDataCenter/dataCenter/modelCenter/asr/tdReArrange/v6000_8k16bit/exp/train_phn_l5_c320/label.counts
apply-cmvn --norm-vars=true --utt2spk=ark:build/trans/zhangjl10/split1/1/utt2spk scp:build/trans/zhangjl10/split1/1/cmvn.scp scp:build/trans/zhangjl10/split1/1/feats.scp ark:-
add-deltas ark:- ark:-
LOG (apply-cmvn:main():apply-cmvn.cc:129) Applied cepstral mean and variance normalization to 1 utterances, errors on 0
ERROR (net-output-extract:SubtractOnLogpost():class-prior.cc:82) Dimensionality mismatch, class_frame_counts 75 class_output_llk 77
ERROR (net-output-extract:main():net-output-extract.cc:134) ERROR (net-output-extract:SubtractOnLogpost():class-prior.cc:82) Dimensionality mismatch, class_frame_counts 75 class_output_llk 77

yajiemiao commented 7 years ago

You set the number of labels to 77 during training. However, your label prior file label.counts only contains 75 values. It is possible that the last two labels (indexed by 75 and 76) never appear in the training transcripts, and thus are not counted in label.counts. Resolving this mismatch will get you through decoding.
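The mismatch described above can be confirmed before decoding by counting the entries in label.counts and comparing that number with the network's output dimension. A minimal sketch, assuming label.counts is a Kaldi text-format vector (a single bracketed list such as `[ 10 20 30 ]`); the helper names and sample values are hypothetical and not part of Eesen:

```python
def read_counts(text):
    """Parse a Kaldi text-format vector such as ' [ 10 20 30 ]' into a list of floats."""
    # Brackets are treated as whitespace so only the numeric tokens remain.
    tokens = text.replace("[", " ").replace("]", " ").split()
    return [float(t) for t in tokens]


def missing_entries(counts, num_labels):
    """Return how many entries the counts vector lacks relative to the network output size."""
    return num_labels - len(counts)


# Made-up example: three counts, but the network is assumed to emit five labels.
sample = " [ 120 45 300 ]"
counts = read_counts(sample)
print(len(counts), "counts,", missing_entries(counts, 5), "missing")
```

For the case in this thread, the same check would report 75 counts against 77 network outputs, that is, two missing entries.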

zhangjiulong commented 7 years ago

What should I do? Should I retrain the acoustic model?

yajiemiao commented 7 years ago

You can add two 0s (or 1s to prevent underflow) at the end of label.counts. Before doing that, you may want to make sure that 75 and 76 are missing from your training label file exp/train_phn_l5_c320/labels.tr.gz.
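The two steps suggested above (checking that labels 75 and 76 never occur, then padding the counts) can be sketched as follows. This is only an illustration under stated assumptions: labels.tr.gz is assumed to be gzip-compressed text with one utterance per line in the form `utt_id label label ...`, label.counts is assumed to be a Kaldi text-format vector, and all function names and sample data are hypothetical.

```python
import gzip


def open_labels(path):
    """Open a gzip-compressed label file as text (assumed format: 'utt_id l1 l2 ...')."""
    return gzip.open(path, "rt")


def labels_present(lines, wanted):
    """Report, for each label id in `wanted`, whether it occurs in any transcript line."""
    seen = set()
    for line in lines:
        fields = line.split()
        # The first field is assumed to be the utterance id; the rest are label ids.
        seen.update(int(tok) for tok in fields[1:])
    return {label: (label in seen) for label in wanted}


def pad_counts(counts, num_labels, fill=1):
    """Append `fill` until the counts vector has `num_labels` entries."""
    if len(counts) > num_labels:
        raise ValueError("counts vector is already longer than num_labels")
    return list(counts) + [fill] * (num_labels - len(counts))


def format_kaldi_vector(counts):
    """Render counts as a Kaldi text-format vector."""
    return " [ " + " ".join(str(c) for c in counts) + " ]"


# Made-up data: label 75 appears once, label 76 never does.
demo_lines = ["utt1 3 75 4", "utt2 1 2"]
print(labels_present(demo_lines, [75, 76]))
print(format_kaldi_vector(pad_counts([5, 3, 9], 5)))
```

On real data, the file object returned by `open_labels("exp/train_phn_l5_c320/labels.tr.gz")` could be passed as `lines`. Padding with 1 rather than 0 follows the suggestion above, since the counts are later used in the log domain. Keep a backup of the original label.counts before overwriting it.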

zhangjiulong commented 7 years ago

ok thx

zhangjiulong commented 7 years ago

Done as you suggested. Thanks very much.