k2-fsa / icefall

https://k2-fsa.github.io/icefall/
Apache License 2.0

initial decoder input in onnx decoding results in deletion errors #1598

Open 1215thebqtic opened 1 month ago

1215thebqtic commented 1 month ago

Hi,

I'm using the Python scripts to decode ONNX models, and I see deletion errors on some test sets (CER = 8.8), especially in the first few words of a sentence. Decoding with the PyTorch model does not show this kind of error (CER = 7.43).

I found that the initial decoder input in ONNX decoding is [[0,0],[0,0]], whereas in PyTorch decoding it is [[-1,0],[-1,0]]. After I changed the initial ONNX decoder input to [[-1,0],[-1,0]], the deletion errors at the beginning of sentences disappeared and the CER dropped to 7.44.

https://github.com/k2-fsa/icefall/blob/ed6bc200e37aaea0129ae32095642c096d4ffad5/egs/librispeech/ASR/zipformer/onnx_pretrained.py#L307

https://github.com/k2-fsa/icefall/blob/ed6bc200e37aaea0129ae32095642c096d4ffad5/egs/librispeech/ASR/pruned_transducer_stateless2/beam_search.py#L589-L591

What is the reason for this difference? Thanks!

csukuangfj commented 1 month ago

I think it is an error in zipformer/onnx_pretrained.py to use [[0,0]]. A pull request to fix it is welcome.
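
For readers following the thread, here is a minimal sketch of the two initializations being compared. It assumes a batch of 2, a decoder context size of 2, and a blank ID of 0, which is what the values quoted in the report imply. The helper name `initial_decoder_input` and its `pad_with_minus_one` flag are hypothetical and are not taken from icefall.

```python
from typing import List


def initial_decoder_input(
    batch_size: int,
    context_size: int = 2,  # assumed from the 2-element rows in the report
    blank_id: int = 0,  # assumed from the trailing 0 in each row
    pad_with_minus_one: bool = True,
) -> List[List[int]]:
    """Build the initial token history fed to the decoder (hypothetical helper).

    With pad_with_minus_one=True the left context is filled with -1,
    matching the [[-1, 0], [-1, 0]] value reported for PyTorch decoding.
    With False it is filled with blank_id, matching the [[0, 0], [0, 0]]
    value reported for the ONNX script.
    """
    pad = -1 if pad_with_minus_one else blank_id
    return [[pad] * (context_size - 1) + [blank_id] for _ in range(batch_size)]


onnx_style = initial_decoder_input(2, pad_with_minus_one=False)
pytorch_style = initial_decoder_input(2, pad_with_minus_one=True)

print(onnx_style)     # [[0, 0], [0, 0]]
print(pytorch_style)  # [[-1, 0], [-1, 0]]
```

Before passing either list to an ONNX decoder session, it would typically need to be converted to an integer array of the dtype the exported model expects. Whether the exported decoder treats -1 the same way as the PyTorch decoder should be checked against the export code.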