k2-fsa / icefall

https://k2-fsa.github.io/icefall/
Apache License 2.0

fast_beam_search_nbest gives very high WER compared to fast_beam_search and greedy. #1668

Open chirag-augnito opened 5 months ago

chirag-augnito commented 5 months ago

Hi, I am getting around 3% WER with fast_beam_search and greedy_search. However, I get about 70% WER when I use fast_beam_search_nbest. My decode configuration is shown below. I am using pruned_transducer_stateless7_streaming from the librispeech recipe.

Most of my words are getting deleted.

./local/augnito/pruned_transducer_stateless7_streaming/decode.py \
--epoch 19 \
--avg 1 \
--use-averaged-model False \
--exp-dir ${exp_dir} \
--max-duration 200 \
--decode-chunk-len 32 \
--decoding-method fast_beam_search_nbest \
--beam 20.0 \
--max-contexts 8 \
--max-states 16 \
--num-paths 200 \
--ngram-lm-scale 0.01 \
--manifest-dir $manifest_dir

csukuangfj commented 5 months ago

Could you post the error patterns from the errs-xxx file?

chiragpatel39 commented 5 months ago

%WER = 73.96
Errors: 1 insertions, 252 deletions, 31 substitutions, over 384 reference words (101 correct)
Search below for sections starting with PER-UTT DETAILS:, SUBSTITUTIONS:, DELETIONS:, INSERTIONS:, PER-WORD STATS:

PER-UTT DETAILS: corr or (ref->hyp)
utt1: (ADC and high T2 and diffusion and FLAIR signal->e two) measuring
utt2: (extension to the prostatic urethra which is displaced posteriorly and to the->*) left side

The errors are mostly deletions.
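
For anyone reading the summary line, here is a minimal sketch of how the reported figures fit together. It assumes the standard WER definition (insertions + deletions + substitutions, divided by the number of reference words). The variable and function names are illustrative only and are not taken from icefall.

```python
# Counts copied from the errs file summary quoted in this comment.
insertions = 1
deletions = 252
substitutions = 31
ref_words = 384


def wer_percent(ins, dels, subs, n_ref):
    """Standard WER: all edit errors divided by the number of reference words."""
    return 100.0 * (ins + dels + subs) / n_ref


total_errors = insertions + deletions + substitutions
wer = wer_percent(insertions, deletions, substitutions, ref_words)

# Reference words that were neither deleted nor substituted.
correct = ref_words - deletions - substitutions

# Share of all errors that are deletions.
deletion_share = 100.0 * deletions / total_errors

print(f"WER = {wer:.2f}%")
print(f"correct words = {correct}")
print(f"deletions as share of errors = {deletion_share:.1f}%")
```

Run as-is, this reproduces the 73.96 WER and the 101 correct words from the summary, and shows that deletions account for nearly nine out of ten errors.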