wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit
https://wenet-e2e.github.io/wenet/
Apache License 2.0
4.14k stars 1.07k forks

decode: Required 2147483647 get 634 #1838

boshs closed this issue 10 months ago

boshs commented 1 year ago

I0427 16:52:38.546231 559723 params.h:205] Reading fst data/lang_test/TLG.fst
I0427 16:52:38.546422 559723 fst.h:799] FstImpl::ReadHeader: source: data/lang_test/TLG.fst, fst_type: vector, arc_type: standard, version: 2, flags: 0
I0427 16:52:38.575814 559723 params.h:211] Reading symbol table data/lang_test/words.txt
I0427 16:52:38.617990 560749 decoder_main.cc:54] num frames 634
I0427 16:52:38.618389 560749 asr_decoder.cc:104] Required 2147483647 get 634
I0427 16:52:39.960561 560749 asr_decoder.cc:200] Partial CTC result 左转
I0427 16:52:40.289940 560749 queue.h:570] AutoQueue: using top-order discipline
I0427 16:52:40.353443 560749 asr_decoder.cc:200] Partial CTC result 左转
I0427 16:52:40.751722 560749 decoder_main.cc:54] num frames 397
I0427 16:52:40.752386 560749 asr_decoder.cc:104] Required 2147483647 get 397
I0427 16:52:46.266355 560749 ctc_wfst_beam_search.cc:96] Adding blank frame at symbol 383
F0427 16:52:46.266548 560749 lattice-faster-decoder.cc:362] Check failed: link_extra_cost == link_extra_cost

How to solve it?
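
Two details in the log can be read without knowing the decoder internals. First, 2147483647 is the largest signed 32-bit integer. That suggests (an assumption, not confirmed in this thread) that the "Required 2147483647 get 634" lines mean "take every available frame" and are informational rather than the error itself. Second, the fatal line is the check `link_extra_cost == link_extra_cost`, and a floating-point value compares unequal to itself only when it is NaN. The thread does not say how a NaN arose; the `inf - inf` below is just one illustrative way to produce one, and `check_not_nan` is a hypothetical helper, not a function from the toolkit.

```python
import math

# 2147483647 is the maximum value of a signed 32-bit integer.
INT32_MAX = 2**31 - 1
print(INT32_MAX)

# One illustrative way to obtain NaN: subtracting infinity from infinity.
nan_cost = float("inf") - float("inf")
print(math.isnan(nan_cost))


def check_not_nan(link_extra_cost: float) -> bool:
    """Hypothetical stand-in for the failed check: x == x is False only for NaN."""
    return link_extra_cost == link_extra_cost


print(check_not_nan(1.5))           # finite cost passes
print(check_not_nan(float("inf")))  # infinity still equals itself
print(check_not_nan(nan_cost))      # NaN fails the self-equality test
```

If this reading is right, the thing to look for is where a NaN or infinite weight enters decoding, for example from the graph being loaded, rather than the frame count.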

pengzhendong commented 1 year ago

How did you generate the TLG.fst? What's the text data you used?

boshs commented 1 year ago

I generated the TLG.fst by following https://wenet.org.cn/wenet/lm.html. The text comes from a dataset I built myself.

boshs commented 1 year ago

Sorry, I only recently had the chance to continue the previous experiment. Using this command still produces the same bug. May I ask what causes it?

At 2023-05-22 14:10:56, "Marlowe" wrote:

I generated the TLG.fst by following https://wenet.org.cn/wenet/lm.html. The text comes from a dataset I built myself.

I met the same problem and solved it by running export LC_ALL=C
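
The thread does not explain why a locale setting would matter here. One plausible reason (an assumption, not confirmed above) is that Kaldi-style data preparation relies on `sort` producing plain byte order, which is what `LC_ALL=C` gives, while other locales may collate the same words differently, so files built in different steps can disagree about ordering. The sketch below only illustrates that two orderings of the same word list can differ. Case-folding is used as a rough stand-in for a non-C locale; real locale collation is more involved, and the word list is made up.

```python
# Hypothetical vocabulary entries, not taken from the issue.
words = ["left", "Zebra", "apple", "Turn"]

# Byte-wise order, which is how `sort` behaves under LC_ALL=C.
byte_order = sorted(words, key=lambda w: w.encode("utf-8"))

# Case-insensitive order, a rough stand-in for a language-aware locale.
folded_order = sorted(words, key=str.casefold)

print(byte_order)
print(folded_order)
print(byte_order == folded_order)
```

Since the fix did not work for the original poster, the locale is likely only one of several possible causes.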
