yajiemiao / eesen

The official repository of the Eesen project
Apache License 2.0

Lattice Decoding Error #18

Open sanakhamekhem opened 8 years ago

sanakhamekhem commented 8 years ago

I'm building a text recogniser using the Eesen framework. The training step completes successfully.

However, the decoding stage fails. I have changed the beam, lattice beam and acoustic scale many times, but the problem persists. The decoding command is:

steps/decode_ctc_lat.sh --cmd "$decode_cmd" --nj 1 --beam 18.0 --lattice_beam 8.0 --max-active 5000 --acwt 1.3 \
  data/lang_phn_test${lm_suffix} data/test_handwritten $dir/decode_test_handwritten${lm_suffix} || exit 1;

This is the content of the log file:

net-output-extract --class-frame-counts=exp/train_phn_l2_c140/label.counts --apply-log=true exp/train_phn_l2_c140/final.nnet "ark,s,cs:apply-cmvn --norm-vars=true --utt2spk=ark:data/test_handwritten/split1/1/utt2spk scp:data/test_handwritten/split1/1/cmvn.scp scp:data/test_handwritten/split1/1/feats.scp ark:- |" ark:- | latgen-faster --max-active=5000 --max-mem=50000000 --beam=18.0 --lattice-beam=8.0 --acoustic-scale=1.3 --allow-partial=true --word-symbol-table=data/lang_phn_test_tg/words.txt data/lang_phn_test_tg/TLG.fst ark:- "ark:|gzip -c > exp/train_phn_l2_c140/decode_test_handwritten_tg/lat.1.gz"

Started at Thu Sep 1 17:55:04 CEST 2016

net-output-extract --class-frame-counts=exp/train_phn_l2_c140/label.counts --apply-log=true exp/train_phn_l2_c140/final.nnet 'ark,s,cs:apply-cmvn --norm-vars=true --utt2spk=ark:data/test_handwritten/split1/1/utt2spk scp:data/test_handwritten/split1/1/cmvn.scp scp:data/test_handwritten/split1/1/feats.scp ark:- |' ark:-
LOG (net-output-extract:SelectGpuId():cuda-device.cc:77) Manually selected to compute on CPU.
LOG (net-output-extract:DisableCaching():cuda-device.cc:731) Disabling caching of GPU memory.
latgen-faster --max-active=5000 --max-mem=50000000 --beam=18.0 --lattice-beam=8.0 --acoustic-scale=1.3 --allow-partial=true --word-symbol-table=data/lang_phn_test_tg/words.txt data/lang_phn_test_tg/TLG.fst ark:- 'ark:|gzip -c > exp/train_phn_l2_c140/decode_test_handwritten_tg/lat.1.gz'
LOG (net-output-extract:ClassPrior():class-prior.cc:33) Computing class-priors from : exp/train_phn_l2_c140/label.counts
apply-cmvn --norm-vars=true --utt2spk=ark:data/test_handwritten/split1/1/utt2spk scp:data/test_handwritten/split1/1/cmvn.scp scp:data/test_handwritten/split1/1/feats.scp ark:-
AHTD3A0002_Para2_1 ahA heM yaB naA sp aeA daM ayA sp waM aeA naA sp taB deB yaA sp aeA raM aaA hhA aaA taA sp comA aeA naA sp aaA naA sp aeA naA bslA yaA sp aeA ayE sp aeA yaA sp aaA kaA sp aeA naA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para2_1 is 0.847503 over 873 frames.
AHTD3A0002_Para2_2 ghM taE sp waM aeA naA sp aeA naB waM aaA ayA sp aeA heM maE sp ahA naA sp aaA ayA sp aeA naA sp taB aaE naA sp waM aeA naA bslA kaM yaA sp aaA baM haE aaE aeA raM deA sp dotA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para2_2 is 0.79611 over 923 frames.
AHTD3A0002_Para2_3 ahA raM aaA hhA taA sp aeA heM laA sp aeA haE sp ayA sp sp waM aeA deB dotA sp waM aeA naA sp ghB yaB toB sp comA aeA heM broA aeA naA sp aaA naA sp aeA naA sp aeA naA sp dbqA aaA deA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para2_3 is 0.828762 over 936 frames.
AHTD3A0002_Para2_4 alA heM aaE zhaA sp ayA sp hhA heM yaB alM aaE sp sp zhaA sp equA aaE sp ahA naA sp aeA yaA sp dotA sp sp aeA alA heM aeE yaA sp ahA naA sp aeA alE sp aeA naA sp taB ghM toB yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para2_4 is 0.832251 over 892 frames.
AHTD3A0002_Para3_1 ahA naA sp aeA naA sp aaA ayA sp aeA naA sp dotA dotA sp sp seM aeE jaB yaA sp aeA yaA sp dotA sp dhM aaA sp aeA yaA sp aaA kaA sp ahA naA sp taB yaA sp ahA raM aeA yaA sp dotA sp baE
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para3_1 is 0.851592 over 914 frames.
AHTD3A0002_Para3_2 raM fslA hhA sp ayA sp waM aeA naA sp aaA yaA sp aeA raM aaA ayA sp sp aeA deB dotA sp thB maE sp dotA heM ahA broA aeA heM maE sp dbqA dotA taE sp aeA naA sp comA heA alM aaE aeA aaA hhA aaA taA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para3_2 is 0.857723 over 885 frames.
AHTD3A0002_Para3_3 ghA bslA aeE naB waM aaA ayA sp aaA ayA sp aeA yaA sp dotA sp waM aeA naA sp aaA ayA sp comA aeA naA sp taB haE alM yaA sp broA aeA naB shM aeE dbqA alB kaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para3_3 is 0.835602 over 858 frames.
AHTD3A0002_Para3_4 waM heM yaA sp sp shA aeA naA sp jaB waM aeA naA sp waM aeA naA sp heM yaA sp aeA waM aaA kaM ayM taE sp aeA ahA naA sp n3A aeE yaA sp dbqA sp aeA ayE sp aeA yaA sp dotA dotA dbqA comA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para3_4 is 0.857684 over 871 frames.
AHTD3A0002_Para3_5 heM yaA sp aeA daM heM aaE sp aeA naA sp aaA naA sp aeA naA sp dotA dotA sp sp aeA jaB daM sp aeA aeE jaB haE sp waM aeA naA sp aeA yaA sp aeA naA sp waM aeA naA sp heM wlE yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para3_5 is 0.841021 over 854 frames.
AHTD3A0002_Para3_6 naA sp aeA yaA sp aaA jaB haE sp aeA yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para3_6 is 0.948729 over 180 frames.
AHTD3A0002_Para4_1 waM aaA aaE aeA naA sp sp aeA shM yaB alE alM aeA deB amE ayA sp aeA ayE ayM alB ayE sp aeA naA sp ahA naA sp aeA naA sp aeA naA sp comA sp aeA yaA sp dotA n6A aeA saM laA sp aaA ayB yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para4_1 is 0.866579 over 912 frames.
AHTD3A0002_Para4_2 taB wlE daM yaA sp ahA raM maE sp aaA raM ayA sp aeA naA sp dhM broA aeA raM deB dotA sp aeA naA sp aaA laB eeE sp aeA naA sp faM yaA sp ahA naA sp deA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para4_2 is 0.891789 over 847 frames.
AHTD3A0002_Para4_3 seM naA sp sp dotA sp aeA aaA ayA sp aeA naA sp yaB zaE aaA laA sp aeA yaA sp dotA sp waM heM yaA sp aeA yaA sp ahA naA aeE yaA sp aeA daM aaA hhA sp aeA baM haE sp dotA aeE yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para4_3 is 0.867084 over 870 frames.
AHTD3A0002_Para4_4 alA haM aeA yaA sp waM heM yaA sp broA aeA ghB yaB alM yaA sp aeA deB comA aeA raM deB dotA dotA alA sp aeA dotA dotA heM laA sp aeA yaA sp ahA aeA aeE naB hhE sp aeA ayE heM daM sp faM yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para4_4 is 0.834662 over 869 frames.
AHTD3A0002_Para4_5 heM yaA sp aeA raM deB yaA sp aaA ayA aeA alA hypA sp aeA seM aeE yaA sp aeA jaB daM sp aeA naA sp aeA naA bslA daM ayA sp daM aeA naA kaM yaA sp faM yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para4_5 is 0.870726 over 830 frames.
AHTD3A0002_Para4_6 haE sp aeA naA sp aeA naA sp aaA yaA sp aeA aaA deA aeA ayE sp khM aeA seM ayM alB yaA sp yaB aeE keB laA sp aeA laB aeA naA sp maB naA sp aeA yaA sp aeA yaA sp naB aaE sp aeA naB yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para4_6 is 0.885211 over 903 frames.
AHTD3A0002_Para4_7 n1A n2A aeA yaA sp aeA seM aeE aeE raM deB comA aeA seM raM taB heE sp aeA yaA sp aeA naA sp heM yaA sp aeA deB keB aeE naA sp aeA naA bslA hhA keB naA sp aeA naA sp kaM raM naA sp taB aeE yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para4_7 is 0.83318 over 913 frames.
AHTD3A0002_Para4_8 aaA heA sp heA sp waM aeA naA sp ghA sp seM amE dotA sp sp seM aeE naA sp aeA seM ayE n9A aeA waM aaA raM yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0002_Para4_8 is 0.886301 over 659 frames.
AHTD3A0003_Para4_3 dotA heM ayB yaA sp aaA yaA sp aeA naB hhE ayB aeA heM daM aaA ayA sp aeA naA sp aeA naA sp aaA raM aaA ayA sp hypA sp aeA ghA aeE yaA sp aeA waM jaB yaA sp aeA naA sp aeA ghA bslA ayM yaB ayE sp aeA seM kaM yaA sp maB eeE sp keB naA sp aeA naA sp ghB raM aaA deA aeA taB yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0003_Para4_3 is 0.825307 over 954 frames.
AHTD3A0003_Para4_4 naB ayM alB ayE aeE saM laA sp aeA maB maE sp aeA yaB heE sp shM aaE raM heA sp maE sp n1A n2A sp aeA seM aeE aeE sp aeA alE sp aeA ghA bslA alB yaA sp aeA aaA deA aeA n3A dbqA ayB aeA shM yaB hhE sp ghA sp heM sp sp ayA sp aeA naA sp jaB raM yaB baE
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0003_Para4_4 is 0.814618 over 924 frames.
AHTD3A0003_Para4_5 aaA seM taM aeE jaB ayM n1A n5A n0A maE sp aeA seM alM yaA sp thB maE sp aeA amE dotA n7A maE aeE yaA sp n8A n2A n9A sp aeA aeE haE sp aaA yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0003_Para4_5 is 0.925499 over 745 frames.
AHTD3A0004_Para1_1 raM aaA deA sp sp aeA shM yaB alE sp amA alM amE naB yaB ayE ayM alB ayE sp aeA yaA sp aeA naA sp ahA brcA dotA keB naA sp aeA naA sp dotA n6A alB alE sp aeA naB daM yaA sp yaB wlE daM yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0004_Para1_1 is 0.892255 over 988 frames.
AHTD3A0004_Para1_2 raM aaA ayA sp aeA deB yaA sp raM aeA yaA sp brcA alB eeE aeA taA sp faM yaA sp shM alE deA sp zhaA alB ghB dotA sp aeA yaA sp scrA ghE ghM yaB raM deB yaB naA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0004_Para1_2 is 0.826241 over 1006 frames.
AHTD3A0004_Para1_3 heM yaA sp kaM raM naA sp aeA shM ayM taM aeE sp aeA daM dbqA aeE sp aeA yaA sp dbqA dotA aeE raM alA sp aeA heA dotA sp waM heM yaA sp aeA ghB ayA sp dotA sp waM aeA naA sp deB dotA sp baE sp aeA naA sp heM yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0004_Para1_3 is 0.820131 over 990 frames.
AHTD3A0004_Para1_4 waM aaA laB aeE naA sp aeA heM daM alA sp faM heM yaA sp aeA raM deB yaA sp aaA ayA aeA alA hypA deA aeA seM aeE yaA sp aeA waM jaB yaA sp aeA naA sp aeA naA sp amA ayA ayE daM aeA naA sp aaA hhA heM maB eeE keB bslA yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0004_Para1_4 is 0.828792 over 1031 frames.
AHTD3A0004_Para1_5 ahA naA sp aeA aaA deA sp heM yaA sp aeA seM ayM alB ayE sp aeA laA sp aeA yaA sp aeA maB raM aeA seA sp aeA yaA sp aeA aaE alB jaE sp maE sp aeA yaA sp aeA seM aeE aeE ayA sp comA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0004_Para1_5 is 0.88772 over 1011 frames.
AHTD3A0004_Para1_6 ghA bslA alB eeA eeE aaA deA aeA n3A dbqA bslA ayB aeA deB yaB hhE naA sp aeA naA sp heM daM alA ayA sp aeA naA bslA hhA taB heM dotA sp aeA naA sp aaA naA sp aeA naA sp jaB daM yaA sp aeA raM heA alA sp waM aeA naA sp ahA naA sp aeA yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0004_Para1_6 is 0.805979 over 1014 frames.
AHTD3A0004_Para1_7 amE dotA n7A n5A aeE naA sp aeA waM aeA naA sp waM aeA naA sp aeA yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0004_Para1_7 is 0.912582 over 368 frames.
AHTD3A0004_Para2_1 ahA naA sp jaB ayE sp shM aeE naA sp aeA daM naA sp waM heM hhA keB aeE naA sp aaA naA sp aeA naA sp n3A heM raM deA sp ayA sp waM aeA naA sp hhA ayM daM sp aeA naA hhA sp aeA naA sp dotA sp waM aeA raM deA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0004_Para2_1 is 0.872719 over 991 frames.
AHTD3A0004_Para2_2 ghA sp heA sp aeA yaA sp aaA ayA sp hhA sp broA aeA yaA sp aeA naA sp zhaA sp ayA sp aeA deB yaA sp baM aeE raM aaA deB yaA sp dotA fslA brcA crcA aeE naA sp aeA raM deA sp hypA deA sp heM daM alA sp waM waM yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0004_Para2_2 is 0.844068 over 1052 frames.
AHTD3A0004_Para2_3 zhaA sp dotA ayM raM deA sp aaA ayA sp sp dotA sp waM ahA naA sp ahA naA sp aeA alA hypA deA aeA faE sp saA sp ghA sp sp aeA deB yaA sp aeA naA sp jaB raM yaB baE sp aeA naA sp taB keB naA sp aeA deB brcA aeA naA sp aaA yaA sp aeA seM ayE sp aaA yaA
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0004_Para2_3 is 0.868646 over 1087 frames.
AHTD3A0004_Para2_4 ghA bslA alA sp deA aeA ghA sp heM sp dotA sp waM aeA deB aeE sp dhM yaA sp shA sp dotA aeE aaA aaA yaA sp deB raM alA sp aeA naB heM dotA dotA sp ghA sp aeA deB yaA sp aaA yaA sp aeA seM haE sp aaA ayA sp waM aeA naA sp keB laA sp aeA yaA sp taB aeE jaB naB baE
LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AHTD3A0004_Para2_4 is 0.839514 over 1090 frames.
WARNING (latgen-faster:ProcessNonemitting():lattice-faster-decoder.cc:772) Error, no surviving tokens: frame is 61
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
WARNING (latgen-faster:PruneTokensForFrame():lattice-faster-decoder.cc:456) No tokens alive [doing pruning]
KALDI_ASSERT: at latgen-faster:PruneForwardLinks:lattice-faster-decoder.cc:314, failed: link_extra_cost == link_extra_cost
Stack trace is:
eesen::KaldiGetStackTrace()
eesen::KaldiAssertFailure(char const, char const, int, char const)
eesen::LatticeFasterDecoder::PruneForwardLinks(int, bool, bool, float)
eesen::LatticeFasterDecoder::PruneActiveTokens(float)
eesen::LatticeFasterDecoder::Decode(eesen::DecodableInterface)
eesen::DecodeUtteranceLatticeFaster(eesen::LatticeFasterDecoder&, eesen::DecodableInterface&, fst::SymbolTable const, std::string, double, bool, bool, eesen::TableWritereesen::BasicVectorHolder, eesen::TableWritereesen::BasicVectorHolder, eesen::TableWritereesen::CompactLatticeHolder, eesen::TableWritereesen::LatticeHolder, double)
latgen-faster(main+0x83a) [0x8191127]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0xb6b82a83]
latgen-faster() [0x8190810]
bash: line 1: 3076 Broken pipe net-output-extract --class-frame-counts=exp/train_phn_l2_c140/label.counts --apply-log=true exp/train_phn_l2_c140/final.nnet "ark,s,cs:apply-cmvn --norm-vars=true --utt2spk=ark:data/test_handwritten/split1/1/utt2spk scp:data/test_handwritten/split1/1/cmvn.scp scp:data/test_handwritten/split1/1/feats.scp ark:- |" ark:-
3077 Aborted (core dumped) | latgen-faster --max-active=5000 --max-mem=50000000 --beam=18.0 --lattice-beam=8.0 --acoustic-scale=1.3 --allow-partial=true --word-symbol-table=data/lang_phn_test_tg/words.txt data/lang_phn_test_tg/TLG.fst ark:- "ark:|gzip -c > exp/train_phn_l2_c140/decode_test_handwritten_tg/lat.1.gz"

Accounting: time=274 threads=1

Ended (code 134) at Thu Sep 1 17:59:38 CEST 2016, elapsed time 274 seconds
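
The log reports a log-likelihood line for every utterance that decoded, then aborts partway through the next one, so a first diagnostic step is to find out which utterance that was. The following is only a minimal sketch of that lookup, not part of Eesen: the function name `first_undecoded_utt` is hypothetical, and it assumes the log uses the "Log-like per frame for utterance <id>" wording shown above and that `feats.scp` lists utterance IDs in decoding order in its first column.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an Eesen script): print the first utterance ID
# in feats.scp for which the decoder log has no "Log-like per frame" line.
first_undecoded_utt() {
  local log_file="$1" feats_scp="$2"
  awk '
    # First file (the log): remember every ID that follows "for utterance".
    # All fields are scanned, so several log entries on one line also work.
    FILENAME == ARGV[1] {
      for (i = 2; i < NF; i++)
        if ($(i - 1) == "for" && $i == "utterance") decoded[$(i + 1)] = 1
      next
    }
    # Second file (feats.scp): stop at the first ID the log never reported.
    !($1 in decoded) { print $1; exit }
  ' "$log_file" "$feats_scp"
}
```

It could be called as `first_undecoded_utt <decode log> data/test_handwritten/split1/1/feats.scp`, where the `feats.scp` path is the one in the command above and the log path depends on the local setup. The utterance it prints is the one being decoded when the "no surviving tokens" warning appeared, so its features and network outputs are the natural place to look next.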