modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

Failed to run funasr-onnx-offline demo. #425

Closed: Darlig closed this issue 1 year ago

Darlig commented 1 year ago

OS: Ubuntu 18.04
Python: 3.8.15
C++: 7.5.0
Package versions: torch 2.0.0, modelscope 1.5.2, funasr 0.4.3 (onnx 1.13.1, onnxruntime 1.14.1)
Model: speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
Command: bin/funasr-onnx-offline export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/ asr_vad_punc_example.wav false false false

Problem description: After building onnxruntime as described in the README, I ran the demo to verify it and hit an error. (The audio file is 16 kHz, 16-bit PCM, about 13 s long.)

Error log:
Model initialization takes 1.611325s.
2023-04-26 16:22:01.185645992 [E:onnxruntime:, sequential_executor.cc:494 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'/encoder/encoders0/encoders0.0/self_attn/fsmn_block/Conv' Status Message: Invalid input shape: {10}
Non-zero status code returned while running Conv node. Name:'/encoder/encoders0/encoders0.0/self_attn/fsmn_block/Conv' Status Message: Invalid input shape: {10}
Result:
Audio length 0.000813s.
Model inference takes 0.000510s.
Model inference RTF: 0.627692.

lyblsgo commented 1 year ago

Try pulling the latest code. You are also welcome to join our DingTalk group (group number: 27215013275).

Darlig commented 1 year ago

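I pulled the latest code (commit: 5dcfaeec14909eb0bd739f7c1b2dddd6f1b84f02) and still get the error. Other audio files are recognized correctly, and so is a 0-10 s clip cut from the original failing file with sox. The failing case is as follows: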

Command: bin/funasr-onnx-offline --am-config ../export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/config.yaml --am-cmvn ../export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/am.mvn --am-model ../export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.onnx --wav-path ../testdata/asr_vad_punc_example.wav

Error log:
I20230427 16:57:03.447063 9930 funasr-onnx-offline.cpp:27] am-model : ../export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.onnx
I20230427 16:57:03.447172 9930 funasr-onnx-offline.cpp:27] am-cmvn : ../export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/am.mvn
I20230427 16:57:03.447187 9930 funasr-onnx-offline.cpp:27] am-config : ../export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/config.yaml
I20230427 16:57:03.447199 9930 funasr-onnx-offline.cpp:27] wav-path : ../testdata/asr_vad_punc_example.wav
I20230427 16:57:05.054219 9930 funasr-onnx-offline.cpp:90] Model initialization takes 1.60701 s
2023-04-27 16:57:05.054703596 [E:onnxruntime:, sequential_executor.cc:494 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'/encoder/encoders0/encoders0.0/self_attn/fsmn_block/Conv' Status Message: Invalid input shape: {10}
Non-zero status code returned while running Conv node. Name:'/encoder/encoders0/encoders0.0/self_attn/fsmn_block/Conv' Status Message: Invalid input shape: {10}
Result:
I20230427 16:57:05.054764 9930 funasr-onnx-offline.cpp:138] Audio length: 0.0008125 s
I20230427 16:57:05.054774 9930 funasr-onnx-offline.cpp:139] Model inference takes: 0.000513 s
I20230427 16:57:05.054780 9930 funasr-onnx-offline.cpp:140] Model inference RTF: 0.631385
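
One detail in the log stands out: the reported audio length of 0.0008125 s corresponds to only 13 samples at 16 kHz, while the file is about 13 s long. That may suggest the demo read far less audio than the file contains, though the thread does not confirm the cause. Below is a minimal diagnostic sketch, assuming Python's standard `wave` module can open the file. It prints what the WAV header declares and rewrites the first N seconds with a fresh header, similar in spirit to the sox trim that worked. The helper names `describe_wav`, `samples_for_duration`, and `trim_wav` are hypothetical and not part of FunASR.

```python
import wave


def describe_wav(path):
    """Return basic header facts for a PCM WAV file (stdlib wave module)."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        frames = wf.getnframes()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate": rate,
            "frames": frames,
            "duration_s": frames / rate,
        }


def samples_for_duration(duration_s, sample_rate=16000):
    """Convert a duration in seconds to a sample count at the given rate."""
    return round(duration_s * sample_rate)


def trim_wav(src_path, dst_path, seconds):
    """Copy the first `seconds` of audio into a new file with a fresh header."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        data = src.readframes(int(seconds * src.getframerate()))
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)   # frame count in the header is corrected on close
        dst.writeframes(data)


if __name__ == "__main__":
    import sys

    # Usage (path is whatever file fails for you): python check_wav.py input.wav
    print(describe_wav(sys.argv[1]))
    print("samples implied by logged length:", samples_for_duration(0.0008125))
```

If `duration_s` here comes out near 13 s while the demo logs 0.0008125 s, the mismatch is on the demo's side of the file reading rather than in the audio itself.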

lanlli commented 1 year ago

Has this issue been resolved? I am running into the same problem.

lyblsgo commented 1 year ago

@Darlig @lanlli This has been fixed. Please pull the latest code.