shaynemei opened this issue 2 years ago
Could you compare the decoded results among them? You can use vimdiff to compare the recogs-xxx.txt files. Are there many `<UNK>`s in the sherpa-based decoding for TEDLIUM?
@shaynemei Did you use `decode-right-context=2` (the default value) in sherpa? If so, please try `decode-right-context=0`. We found that not all models can benefit from right context.
Also, can you show your decoding commands for local batch decoding and local streaming decoding? The WER difference between them seems a little large. Thanks!
local batch decoding command:

```
./pruned_transducer_stateless5/decode.py \
  --epoch 4 \
  --avg 1 \
  --simulate-streaming False \
  --causal-convolution True \
  --use-averaged-model False
```
local streaming decoding command:

```
./pruned_transducer_stateless5/decode.py \
  --epoch 4 \
  --avg 1 \
  --simulate-streaming True \
  --causal-convolution True \
  --use-averaged-model False
```
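For context on why the two commands above give different WERs: as I understand it, `--simulate-streaming True` decodes the whole utterance offline but masks self-attention so each frame only sees frames up to the end of its own chunk (full left context, limited right context). The sketch below is purely conceptual; the chunk size and mask layout are illustrative, not the actual icefall implementation.

```python
# Conceptual sketch of simulated streaming: restrict attention with a
# chunk-wise mask instead of truly feeding audio incrementally.
def chunk_attention_mask(num_frames, chunk_size):
    """mask[t][s] is True if frame t may attend to frame s.

    Each frame sees everything up to the end of its own chunk.
    """
    mask = [[False] * num_frames for _ in range(num_frames)]
    for t in range(num_frames):
        visible_end = min(((t // chunk_size) + 1) * chunk_size, num_frames)
        for s in range(visible_end):
            mask[t][s] = True
    return mask
```

Batch decoding (`--simulate-streaming False`) corresponds to an all-True mask, which is why it typically scores slightly better than streaming decoding.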
Actually, there aren't any. The utterances in the two recogs.txt files aren't in the same order, so I couldn't use vimdiff directly.
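If the ordering is the only obstacle, sorting both files by utterance ID before diffing should make them comparable. A minimal sketch, assuming the utterance ID is the first whitespace-separated token on each line (file names here are hypothetical):

```python
# Align two recogs files by utterance ID so vimdiff/diff can compare
# them line by line.
def sort_recogs(lines):
    """Return recogs lines sorted by utterance ID (the first token)."""
    return sorted(lines, key=lambda ln: ln.split(maxsplit=1)[0])

# Toy example; in practice read the lines from the two recogs-*.txt
# files, write the sorted copies out, then run vimdiff on them.
local = ["utt2 hello world", "utt1 good morning"]
sherpa = ["utt1 good morning", "utt2 hello word"]
for a, b in zip(sort_recogs(local), sort_recogs(sherpa)):
    if a != b:
        print("DIFF:", a, "|", b)
```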
> @shaynemei Did you use `decode-right-context=2` (the default value) in sherpa? If so, please try `decode-right-context=0`. We found that not all models can benefit from right context.
@pkufool I reran TEDLIUM_DEV with no right context and got WER 5.00. Is this 0.28 gap with local streaming (WER 4.72) expected for sherpa?
@csukuangfj @danpovey @pkufool just following up on this issue. Is there anything else I should provide?
Sorry, I have not looked into it yet. I need to reproduce it locally first.
Do you need any help or additional information to reproduce it?
Sorry for the late reply. Will look into it during the holiday.
@csukuangfj Do we have any update on this issue? I am seeing a lot of deletion errors with sherpa decoding of streaming zipformer model.
-Sagar
Using the same model (a streaming pruned_transducer_stateless5 trained on gigaspeech), we are seeing a performance gap between local icefall streaming decoding and sherpa server streaming decoding. WERs for both setups are calculated using the same function here: https://github.com/k2-fsa/icefall/blob/5149788cb2e0730d1537b9711dcfc5c4b11a0f4b/egs/librispeech/ASR/pruned_transducer_stateless5/decode.py#L597-L638
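For readers without the repo at hand: the linked scoring is essentially word-level edit distance (substitutions, insertions, deletions) divided by the reference length. This is a simplified stand-in, not the actual icefall function, which also tracks alignments and per-error-type counts:

```python
# Simplified WER sketch: Levenshtein distance over word lists divided by
# the number of reference words.
def wer(ref_words, hyp_words):
    m, n = len(ref_words), len(hyp_words)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution / match
    return d[m][n] / max(m, 1)
```

Since both setups score with the same function, the gap below should come from the decoded hypotheses themselves, not from scoring differences.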
tedlium_dev:
- local batch decoding: 4.35
- local streaming decoding: 4.72
- sherpa server streaming decoding: 5.72