modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

Service websocket-cpp: sending a wav file returns no ASR result, thanks! #597

Closed mdys closed 1 year ago

mdys commented 1 year ago

Hi, thanks in advance for any pointers.

OS: CentOS 7.9
python==3.7.16
torch==1.11.0+cu102
modelscope-1.6.1
funasr-0.5.7
onnxruntime
g++ (GCC) 11.2.1
cmake version 3.26.3
Model: speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch

I installed websocketmain and websocketclient following the documentation. websocketmain runs normally, but when the client connects and sends a wav recording, no ASR result is returned.

server

/home/FunASR/funasr/runtime/websocket/build/bin/websocketmain --model-dir /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --quantize true --vad-dir /home/FunASR/export/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch --vad-quant true --punc-dir /home/FunASR/export/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch --punc-quant true --port 8889

I20230606 17:08:40.692225  8907 websocketmain.cpp:20] model-dir : /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
I20230606 17:08:40.692301  8907 websocketmain.cpp:20] quantize : true
I20230606 17:08:40.692307  8907 websocketmain.cpp:20] vad-dir : /home/FunASR/export/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch
I20230606 17:08:40.692312  8907 websocketmain.cpp:20] vad-quant : true
I20230606 17:08:40.692315  8907 websocketmain.cpp:20] punc-dir : /home/FunASR/export/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch
I20230606 17:08:40.692319  8907 websocketmain.cpp:20] punc-quant : true
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20230606 17:08:40.713418  8907 fsmn-vad.cpp:58] Successfully load model from /home/FunASR/export/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/model_quant.onnx
model ready
asr model init finished. listen on port:8889

client

/home/FunASR/funasr/runtime/websocket/build/bin/websocketclient 127.0.0.1 8889 /home/FunASR/11-16-left.wav 1 0
wait..0
sended data len=293484
[2023-06-06 17:51:37] [error] handle_read_frame error: websocketpp.transport:7 (End of File)

py client

python /home/FunASR/funasr/runtime/python/websocket/wss_client_asr.py --host "127.0.0.1" --port 8889 --mode offline --chunk_interval 10 --words_max_print 100 --audio_in "/home/FunASR/11-16-left.wav" --send_without_sleep --output_dir "./results" --ssl 0
Namespace(audio_in='/home/FunASR/11-16.scp', chunk_interval=10, chunk_size=[5, 10, 5], host='127.0.0.1', mode='offline', output_dir='./results', port=8889, send_without_sleep=True, ssl=0, test_thread_num=1, words_max_print=100)
connect to ws://127.0.0.1:8889
started to sending data!

I also found another problem: if the wav file sent by websocketclient has a normal file size but contains no audio waveform, the websocketmain service crashes as well and dumps core.

asr model init finished. listen on port:8889
on_open, active connections: 1
Segmentation fault (core dumped)
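
For anyone trying to reproduce the crash with a "normal size but no waveform" file, it may help to confirm what the file actually contains before sending it. The following is a minimal sketch using only the Python standard library; the function names `inspect_wav` and `looks_empty` are hypothetical helpers, and it assumes 16-bit PCM wav input (other sample widths are reported but their peak is not computed).

```python
import array
import sys
import wave


def inspect_wav(path):
    """Return basic format info and the peak absolute sample of a PCM wav file."""
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        sample_width = wf.getsampwidth()  # bytes per sample
        sample_rate = wf.getframerate()
        n_frames = wf.getnframes()
        raw = wf.readframes(n_frames)

    peak = None  # stays None when the sample width is not 16-bit
    if sample_width == 2:
        raw = raw[: len(raw) - len(raw) % 2]  # guard against a truncated last sample
        samples = array.array("h")
        samples.frombytes(raw)
        if sys.byteorder == "big":
            samples.byteswap()  # wav sample data is little-endian
        peak = max((abs(s) for s in samples), default=0)

    return {
        "channels": channels,
        "sample_width": sample_width,
        "sample_rate": sample_rate,
        "n_frames": n_frames,
        "peak": peak,
    }


def looks_empty(info, threshold=0):
    """True if the file has no frames or its peak does not exceed the threshold."""
    if info["n_frames"] == 0:
        return True
    return info["peak"] is not None and info["peak"] <= threshold
```

A file for which `looks_empty(inspect_wav(path))` is true matches the "no waveform" case described above; the `sample_rate` field also shows whether the audio is 16 kHz, which is what the 16k model name suggests.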

zhaomingwork commented 1 year ago

Could you share the test files?

mdys commented 1 year ago

Sorry, I only just saw your reply. Please download the files and change the extension to .wav.
Thanks! 11-16-left 11-16-right

zhaomingwork commented 1 year ago

@langgz I found that when the runtime loads the punc_ct-transformer_zh-cn-common-vocab272727-pytorch model, it reports the following error:

2023-06-07 02:33:11.896812387 [E:onnxruntime:, sequential_executor.cc:494 ExecuteKernel] Non-zero status code returned while running Sub node. Name:'/encoder/Sub_1' Status Message: /onnxruntime_src/include/onnxruntime/core/framework/op_kernel_context.h:42 const T* onnxruntime::OpKernelContext::Input(int) const [with T = onnxruntime::Tensor] Missing Input: sub_masks

E20230607 02:33:11.896888 10754 ct-transformer.cpp:184] Error when run punc onnx forword: Non-zero status code returned while running Sub node. Name:'/encoder/Sub_1' Status Message: /onnxruntime_src/include/onnxruntime/core/framework/op_kernel_context.h:42 const T* onnxruntime::OpKernelContext::Input(int) const [with T = onnxruntime::Tensor] Missing Input: sub_masks

mdys commented 1 year ago

Thanks for helping with this. Could you please take another look at this problem?

zhaomingwork commented 1 year ago

Still investigating. In the meantime, try loading only the ASR model, without VAD or punc, and see whether a result is returned. The command is:

/home/FunASR/funasr/runtime/websocket/build/bin/websocketmain --model-dir /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --quantize true

mdys commented 1 year ago

Thanks for the help. I just tried it, and it crashed immediately.

/home/FunASR/funasr/runtime/websocket/build/bin/websocketmain --model-dir /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --quantize true
I20230607 15:59:13.278146 4923 websocketmain.cpp:20] model-dir : /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
I20230607 15:59:13.278230 4923 websocketmain.cpp:20] quantize : true
model ready
asr model init finished. listen on port:8889
on_open, active connections: 1
Segmentation fault (core dumped)

/home/FunASR/funasr/runtime/websocket/build/bin/websocketclient 127.0.0.1 8889 /home/FunASR/11-16-right.wav 1 0
wait..0
sended data len=293484
[2023-06-07 15:59:30] [error] handle_read_frame error: asio.system:104 (Connection reset by peer)

zhaomingwork commented 1 year ago

@mdys It has been updated. See PR #610.

mdys commented 1 year ago

Hi @zhaomingwork, I tested it and it still crashes without returning a result. Could you please take another look? I would be very grateful.

/home/FunASR/funasr/runtime/websocket/build/bin/websocketmain --model-dir /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --port 8889 --io_thread_num 2
I20230608 20:28:53.535951  7779 websocketmain.cpp:20] model-dir : /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
model ready
asr model init finished. listen on port:8889
on_open, active connections: 1
client done
Segmentation fault (core dumped)

/home/FunASR/funasr/runtime/websocket/build/bin/websocketclient 127.0.0.1 8889 /home/FunASR/asr_example_zh.wav 1 0
wait..0
sended data len=177572
[2023-06-08 20:29:05] [error] handle_read_frame error: websocketpp.transport:7 (End of File)

asr_example_zh.wav is available at https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav

zhaomingwork commented 1 year ago

@mdys You could try reinstalling on a fresh system and see whether the problem persists.

mdys commented 1 year ago

@zhaomingwork Hi, I reinstalled on another CentOS 7 machine with the same configuration, and it still does not work; it crashes as before.

/home/FunASR/funasr/runtime/websocket/build/bin/websocketclient 127.0.0.1 8889 /home/FunASR/asr_example_zh.wav 1 0
wait..0
sended data len=177572
[2023-06-09 01:33:19] [error] handle_read_frame error: websocketpp.transport:7 (End of File)

I really can't find the cause. Could you share the versions of the environment you used? Thanks!

OS: centos 7.9  3.10.0-1160.90.1.el7.x86_64
Python 3.7.16
funasr_onnx 0.1.0
onnx 1.12.0
modelscope 1.6.1
onnxruntime 1.14.1
funasr 0.5.8

zhaomingwork commented 1 year ago

@mdys Did you download it from my PR? You need to switch to the branch.

mdys commented 1 year ago

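Hi, I did a fresh git clone https://github.com/alibaba-damo-academy/FunASR.git. Was that the wrong thing to do? -_-" Also, I tested sending a somewhat larger file with websocketclient, for example 1 MB, and it immediately hit Segmentation fault (core dumped). I hope you can take another look. Does this websocket service support clients written in other languages, for example JavaScript in a web page?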

zhaomingwork commented 1 year ago

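@mdys That should work. The main issue is that I can't reproduce it, so you will have to track down the problem yourself. The html5 client in the project connects using JS.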

mdys commented 1 year ago

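@zhaomingwork I re-downloaded your fork with git clone https://github.com/zhaomingwork/FunASR.git, and it no longer hits Segmentation fault (core dumped) right away. However, the client still receives no data. Could you share working, prebuilt websocketmain and websocketclient binaries from your side? I really can't figure out what is wrong. My steps: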


git clone https://github.com/zhaomingwork/FunASR.git
pip install -e ./

python -m funasr.export.export_model --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --export-dir ./export --type onnx --quantize True
python -m funasr.export.export_model --model-name damo/speech_fsmn_vad_zh-cn-16k-common-pytorch --export-dir ./export --type onnx --quantize True
python -m funasr.export.export_model --model-name damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch --export-dir ./export --type onnx --quantize True

cd funasr/runtime/websocket
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=release .. -DONNXRUNTIME_DIR=/home/FunASR/onnxruntime-linux-x64-1.14.0
make

/home/FunASR/funasr/runtime/websocket/build/bin/websocketmain --model-dir /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --quantize true --vad-dir /home/FunASR/export/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch --vad-quant true --punc-dir /home/FunASR/export/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch --punc-quant true --port 8889
I20230609 14:50:27.330942 17134 websocketmain.cpp:20] model-dir : /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
I20230609 14:50:27.331010 17134 websocketmain.cpp:20] quantize : true
I20230609 14:50:27.331015 17134 websocketmain.cpp:20] vad-dir : /home/FunASR/export/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch
I20230609 14:50:27.331020 17134 websocketmain.cpp:20] vad-quant : true
I20230609 14:50:27.331023 17134 websocketmain.cpp:20] punc-dir : /home/FunASR/export/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch
I20230609 14:50:27.331027 17134 websocketmain.cpp:20] punc-quant : true
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20230609 14:50:27.350993 17134 fsmn-vad.cpp:58] Successfully load model from /home/FunASR/export/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/model_quant.onnx
model ready
asr model init finished. listen on port:8889
on_open, active connections: 1
client done
[2023-06-09 14:51:25] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
on_close, active connections: 0

/home/FunASR/funasr/runtime/websocket/build/bin/websocketclient 127.0.0.1 8889 /home/FunASR/asr_example_zh.wav 1 0
wait..0
sended data len=177572
## no result is ever returned

Many thanks.

![微信图片_20230609143931](https://github.com/alibaba-damo-academy/FunASR/assets/1708120/1823a296-a0de-4f1a-9ec2-eae49fbc741f)

zhaomingwork commented 1 year ago

websocketmain @mdys If you are using the PR, you also need to git checkout the corresponding branch.

mdys commented 1 year ago

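@zhaomingwork T_T I finally got a result. The problem turned out to be the io_thread_num parameter. I had originally left it unset, and according to the documentation the default is 8. After I changed it to 1, results came back. The process count for websocketclient also has to be set to 1. Does this support concurrent requests? My server has 32 cores and 64 threads, so how should I set the parameters for concurrency? And how can I make websocketclient exit once it has a result, so that I can send a new file to the server? Thanks for the guidance!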

/home/FunASR/funasr/runtime/websocket/build/bin/websocketmain --model-dir /home/FunASR/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --quantize true --vad-dir /home/FunASR/export/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch --vad-quant true --punc-dir /home/FunASR/export/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch --punc-quant true --port 8889 --decoder_thread_num 4 --io_thread_num 1
buffer.size=182372,result json={"mode":"offline","text":"欢迎大家来体验达摩院推出的语音识别模型。","wav_name":"damo"}
[2023-06-09 22:52:26] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
on_close, active connections: 0
/home/FunASR/funasr/runtime/websocket/build/bin/websocketclient 127.0.0.1 8889 ./asr_example_zh.wav 1 0
wait..0
sended data len=177572
on_message={"mode":"offline","text":"欢迎大家来体验达摩院推出的语音识别模型。","wav_name":"damo"}
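
For scripting around the client, the recognized text can be pulled out of the `on_message=` line shown above. This is a minimal sketch that assumes the client output keeps this exact `on_message=<json>` shape; the helper name `parse_client_output` is hypothetical.

```python
import json

PREFIX = "on_message="


def parse_client_output(lines):
    """Collect the JSON payloads from lines that start with 'on_message='."""
    results = []
    for line in lines:
        line = line.strip()
        if not line.startswith(PREFIX):
            continue  # skip progress lines such as 'wait..0'
        try:
            results.append(json.loads(line[len(PREFIX):]))
        except json.JSONDecodeError:
            continue  # ignore lines whose payload is not valid JSON
    return results
```

Each returned item is a dict, so `item["text"]` gives the transcript and `item["mode"]` the recognition mode.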
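
zhaomingwork commented 1 year ago

@mdys This looks like what happens when the code has not been updated. Try the websocketmain I posted above. As for the later question, modify the client yourself so that it makes the calls in a loop.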
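
The suggestion here is to change the client itself. As a stopgap that leaves the C++ client untouched, one could instead drive the existing binary from a script, one file at a time. The sketch below is not the maintainer's method: it assumes the positional arguments seen in this thread (host, port, wav path, then the two trailing values `1 0`), the names `worker_count` and `last_arg` are hypothetical labels for those two values, and because the client reportedly does not exit on its own, each run is cut off by a timeout.

```python
import subprocess


def build_client_cmd(client_bin, host, port, wav_path, worker_count=1, last_arg=0):
    """Build the argument list in the positional order used in this thread."""
    return [client_bin, host, str(port), wav_path, str(worker_count), str(last_arg)]


def run_files(client_bin, host, port, wav_paths, timeout=60):
    """Run the client once per file, sequentially, and collect its output."""
    outputs = {}
    for wav_path in wav_paths:
        cmd = build_client_cmd(client_bin, host, port, wav_path)
        try:
            proc = subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
                timeout=timeout,
            )
            out = proc.stdout
        except subprocess.TimeoutExpired as exc:
            # The child is killed on timeout; keep whatever it printed so far.
            out = exc.stdout or ""
            if isinstance(out, bytes):
                out = out.decode("utf-8", "replace")
        outputs[wav_path] = out
    return outputs
```

Running the files sequentially with a single worker matches the only configuration reported to work in this thread; it does not address the concurrency question.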

mdys commented 1 year ago

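@zhaomingwork Hi, I have run into a new problem. I need to write a websocket client in PHP that reads a wav file and sends it to the server, but I don't know how to transmit the data. Could you help clarify? I would be very grateful! I have opened a new issue: https://github.com/alibaba-damo-academy/FunASR/issues/625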
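
The answer belongs in the new issue, but the general shape of such a client can be sketched independently of the language. The Python sketch below only prepares the sequence of websocket messages and does not open a connection. It rests on assumptions that should be checked against the project's own clients: that the server accepts a JSON text frame before the audio, raw PCM bytes as binary frames, and a JSON text frame afterwards. The `mode` and `wav_name` fields mirror the result JSON in this thread, while `is_speaking` and the 3200-byte chunk size are assumptions, and all function names are hypothetical.

```python
import json
import wave


def read_pcm(path):
    """Read the raw PCM sample bytes from a wav file (header stripped)."""
    with wave.open(path, "rb") as wf:
        return wf.readframes(wf.getnframes())


def split_chunks(pcm, chunk_bytes=3200):
    """Split PCM bytes into fixed-size pieces; the last one may be shorter."""
    return [pcm[i:i + chunk_bytes] for i in range(0, len(pcm), chunk_bytes)]


def build_messages(pcm, wav_name="demo", chunk_bytes=3200):
    """Return the frames to send in order: str = text frame, bytes = binary frame."""
    # Assumed start message; the field names are not confirmed by this thread.
    start = json.dumps({"mode": "offline", "wav_name": wav_name, "is_speaking": True})
    # Assumed end-of-audio message.
    end = json.dumps({"is_speaking": False})
    return [start] + split_chunks(pcm, chunk_bytes) + [end]
```

A PHP client would follow the same order with whatever websocket library it uses: send the text frame, then each binary chunk, then the closing text frame, and wait for the JSON result.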