wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit
https://wenet-e2e.github.io/wenet/
Apache License 2.0

WFST decoding error in pruning #1190

Closed ghost closed 8 months ago

ghost commented 2 years ago

Bug Detail

Hello, I'm an engineer building a speech recognition model with the wenet library. Thanks to your work, we trained an e2e model with a reasonable WER. We tried several decoding methods, and the ones implemented in Python worked without problems. However, WFST decoding, which runs in the compiled C++ runtime, fails.

The database we use is a mixture of English, Korean, and numbers. During WFST decoding, errors appeared in several log files and decoding stopped. Do you know about this issue? Some of the error logs are shown below. Thanks.

Bug Result

./tools/decode.sh: line 64: 13127 Segmentation fault (core dumped) decoder_main --rescoring_weight $rescoring_weight --ctc_weight $ctc_weight --reverse_weight $reverse_weight --wav_scp ${dir}/split${nj}/wav.${n}.scp --model_path $model_file --dict_path $dict_file $wfst_decode_opts --result ${dir}/split${nj}/${n}.text & > ${dir}/split${nj}/${n}.log
F0526 17:05:33.687783 21063 determinize-lattice-pruned.cc:1048] Check failed: ifst_->Properties(kTopSorted, true) != 0
*** Check failure stack trace: ***
    @     0x7fcdc3301582  google::LogMessage::Fail()
    @     0x7fcdc33014ca  google::LogMessage::SendToLog()
    @     0x7fcdc3300e0b  google::LogMessage::Flush()
    @     0x7fcdc33046bc  google::LogMessageFatal::~LogMessageFatal()
    @     0x5644cfd35398  fst::LatticeDeterminizerPruned<>::InitializeDeterminization()
    @     0x5644cfd336c9  fst::LatticeDeterminizerPruned<>::Determinize()
    @     0x5644cfd329e7  fst::DeterminizeLatticePruned<>()
    @     0x5644cfcbc394  kaldi::LatticeFasterDecoderTpl<>::GetLattice()
    @     0x5644cfc5eec3  wenet::CtcWfstBeamSearch::FinalizeSearch()
    @     0x5644cfc4224c  wenet::TorchAsrDecoder::AttentionRescoring()
    @     0x5644cfc4022a  wenet::TorchAsrDecoder::Rescoring()
    @     0x5644cfb49ea8  main
    @     0x7fcdc1ea2b97  __libc_start_main
    @     0x5644cfb487fa  _start
    @              (nil)  (unknown)

Is there any solution to a problem like this?
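
For readers unfamiliar with the failed check: `ifst_->Properties(kTopSorted, true) != 0` asserts that the lattice passed to determinization is topologically sorted. As far as I understand OpenFst's property, this means every arc leads from a state to one with a strictly higher state id, so self-loops and backward arcs violate it. The sketch below illustrates that property only. The arc-list representation and the function names are hypothetical, not the OpenFst or WeNet API, and it does not diagnose why the lattice in this report was unsorted.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: each arc is (source state id, destination state id).
Arc = Tuple[int, int]


def is_top_sorted(arcs: Iterable[Arc]) -> bool:
    """Return True if every arc goes to a strictly higher state id."""
    return all(src < dst for src, dst in arcs)


def find_violations(arcs: Iterable[Arc]) -> List[Arc]:
    """Return the arcs that break the ordering (self-loops or backward arcs)."""
    return [(src, dst) for src, dst in arcs if dst <= src]


# A linear chain of states satisfies the property.
chain = [(0, 1), (1, 2), (2, 3)]

# A self-loop on state 1 and a backward arc from state 2 to state 0 both break it.
broken = [(0, 1), (1, 1), (2, 0)]

if __name__ == "__main__":
    print(is_top_sorted(chain))
    print(is_top_sorted(broken))
    print(find_violations(broken))
```

A check like this on a dumped lattice would show which arcs are out of order, which may help narrow down whether the decoding graph or the decoder output is the source.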

robin1001 commented 2 years ago

I have no idea. We have not seen this problem in our own use.

github-actions[bot] commented 8 months ago

This issue has been automatically closed due to inactivity.