kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.
http://kaldi-asr.org

gmm-latgen-simple is broken #4870

Closed: csukuangfj closed this issue 9 months ago

csukuangfj commented 11 months ago
diff --git a/egs/wsj/s5/steps/decode.sh b/egs/wsj/s5/steps/decode.sh
index 8c85724c0..ab12bdcc4 100755
--- a/egs/wsj/s5/steps/decode.sh
+++ b/egs/wsj/s5/steps/decode.sh
@@ -122,7 +122,7 @@ if [ $stage -le 0 ]; then
       { echo "$0: Error: Mismatch in number of pdfs with $model"; exit 1; }
   fi
   $cmd --num-threads $num_threads JOB=1:$nj $dir/log/decode.JOB.log \
-    gmm-latgen-faster$thread_string --max-active=$max_active --beam=$beam --lattice-beam=$lattice_beam \
+    gmm-latgen-simple --beam=$beam --lattice-beam=$lattice_beam \
     --acoustic-scale=$acwt --allow-partial=true --word-symbol-table=$graphdir/words.txt $decode_extra_opts \
     $model $graphdir/HCLG.fst "$feats" "ark:|gzip -c > $dir/lat.JOB.gz" || exit 1;
 fi

With the above change applied, the following commands fail:

cd egs/yesno/s5
./run.sh

with the following error logs:

steps/decode.sh --nj 1 --cmd utils/run.pl exp/mono0a/graph_tgpr data/test_yesno exp/mono0a/decode_test_yesno
decode.sh: feature type is delta
run.pl: job failed, log is in exp/mono0a/decode_test_yesno/log/decode.1.log
grep: exp/mono0a/decode_test_yesno/wer_*: No such file or directory
(py38) kuangfangjun:s5$ cat exp/mono0a/decode_test_yesno/log/decode.1.log
# gmm-latgen-simple --beam=13.0 --lattice-beam=6.0 --acoustic-scale=0.083333 --allow-partial=true --word-symbol-table=exp/mono0a/graph_tgpr/words.txt exp/mono0a/final.mdl exp/mono0a/graph_tgpr/HCLG.fst "ark,s,cs:apply-cmvn  --utt2spk=ark:data/test_yesno/split1/1/utt2spk scp:data/test_yesno/split1/1/cmvn.scp scp:data/test_yesno/split1/1/feats.scp ark:- | add-deltas  ark:- ark:- |" "ark:|gzip -c > exp/mono0a/decode_test_yesno/lat.1.gz"
# Started at Wed Aug 30 00:49:23 CST 2023
#
gmm-latgen-simple --beam=13.0 --lattice-beam=6.0 --acoustic-scale=0.083333 --allow-partial=true --word-symbol-table=exp/mono0a/graph_tgpr/words.txt exp/mono0a/final.mdl exp/mono0a/graph_tgpr/HCLG.fst 'ark,s,cs:apply-cmvn  --utt2spk=ark:data/test_yesno/split1/1/utt2spk scp:data/test_yesno/split1/1/cmvn.scp scp:data/test_yesno/split1/1/feats.scp ark:- | add-deltas  ark:- ark:- |' 'ark:|gzip -c > exp/mono0a/decode_test_yesno/lat.1.gz'
add-deltas ark:- ark:-
apply-cmvn --utt2spk=ark:data/test_yesno/split1/1/utt2spk scp:data/test_yesno/split1/1/cmvn.scp scp:data/test_yesno/split1/1/feats.scp ark:-
ERROR (gmm-latgen-simple[5.5.1057~2-be222]:ProcessNonemitting():lattice-simple-decoder.cc:574) Error in ProcessEmitting: no surviving tokens: frame is -1

[ Stack-Trace: ]
gmm-latgen-simple(kaldi::MessageLogger::LogMessage() const+0xb42) [0x56134a6471ee]
gmm-latgen-simple(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x21) [0x56134a4a7903]
gmm-latgen-simple(kaldi::LatticeSimpleDecoder::ProcessNonemitting()+0x1cd) [0x56134a4b6fc3]
gmm-latgen-simple(kaldi::LatticeSimpleDecoder::InitDecoding()+0xe3) [0x56134a4b7419]
gmm-latgen-simple(kaldi::LatticeSimpleDecoder::Decode(kaldi::DecodableInterface*)+0x11) [0x56134a4b83ab]
gmm-latgen-simple(kaldi::DecodeUtteranceLatticeSimple(kaldi::LatticeSimpleDecoder&, kaldi::DecodableInterface&, kaldi::TransitionInformation const&, fst::SymbolTable const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, double, bool, bool, kaldi::TableWriter<kaldi::BasicVectorHolder<int> >*, kaldi::TableWriter<kaldi::BasicVectorHolder<int> >*, kaldi::TableWriter<kaldi::CompactLatticeHolder>*, kaldi::TableWriter<kaldi::LatticeHolder>*, double*)+0x83) [0x56134a4e06cc]
gmm-latgen-simple(main+0xcf5) [0x56134a4a5c2f]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7f67dd224bf7]
gmm-latgen-simple(_start+0x2a) [0x56134a4a4e5a]

WARNING (gmm-latgen-simple[5.5.1057~2-be222]:Close():kaldi-io.cc:515) Pipe apply-cmvn  --utt2spk=ark:data/test_yesno/split1/1/utt2spk scp:data/test_yesno/split1/1/cmvn.scp scp:data/test_yesno/split1/1/feats.scp ark:- | add-deltas  ark:- ark:- | had nonzero return status 36096
kaldi::KaldiFatalError# Accounting: time=0 threads=1
# Ended (code 255) at Wed Aug 30 00:49:23 CST 2023, elapsed time 0 seconds
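
For anyone triaging a similar log, the two helpers below are a minimal sketch of how to pull out the relevant lines. Both function names are hypothetical and not part of Kaldi. The first assumes the log layout shown above: fatal messages start with "ERROR (" and the job wrapper appends a "# Ended (code N)" line. The second assumes the number in the "had nonzero return status" warning is a raw wait status in the usual Linux encoding (exit code in the high byte, terminating signal in the low seven bits).

```shell
# Hypothetical helpers for reading a decode job log; not part of Kaldi.

# Print the ERROR lines and the exit code recorded at the end of the log.
summarize_decode_log() {
  log="$1"
  if [ ! -f "$log" ]; then
    echo "no such log: $log" >&2
    return 2
  fi
  echo "errors:"
  # Assumption: fatal messages begin with "ERROR (" as in the log above.
  grep -n '^ERROR (' "$log" || echo "  (none)"
  # Assumption: the wrapper appends "# Ended (code N) at ...".
  code=$(sed -n 's/^# Ended (code \([0-9][0-9]*\)).*/\1/p' "$log" | tail -n 1)
  echo "exit code: ${code:-unknown}"
}

# Split a raw wait status into its parts. An exit code above 128 is how
# a shell conventionally reports "child died from signal (code - 128)".
decode_wait_status() {
  status="$1"
  exit_code=$(( (status >> 8) & 255 ))
  term_sig=$(( status & 127 ))
  if [ "$term_sig" -ne 0 ]; then
    echo "killed_by_signal=$term_sig"
  elif [ "$exit_code" -gt 128 ]; then
    echo "exit=$exit_code shell_signal=$(( exit_code - 128 ))"
  else
    echo "exit=$exit_code"
  fi
}
```

Running `summarize_decode_log exp/mono0a/decode_test_yesno/log/decode.1.log` on the log above would list the ProcessNonemitting error and exit code 255. Under the stated encoding, `decode_wait_status 36096` gives exit code 141, that is, signal 13 (SIGPIPE on Linux). This suggests the apply-cmvn | add-deltas pipe was simply cut off when the decoder exited, so the WARNING is probably a consequence of the ERROR rather than a separate problem.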