modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

ASR cannot load offline and online models with two different configs #1286

Closed chenyangMl closed 4 months ago

chenyangMl commented 8 months ago

🐛 Bug

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

  1. Convert Paraformer语音识别-中文-通用-16k-离线-large to an ONNX model, using the following code to quantize it to an int8 model:
    from funasr_onnx import Paraformer
    from pathlib import Path

    model_dir = "damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1"
    model = Paraformer(model_dir, batch_size=1, quantize=True)

    wav_path = ['{}/.cache/modelscope/hub/damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/example/asr_example.wav'.format(Path.home())]

    result = model(wav_path)
    print(result)
  2. Start the service with the new offline model together with the online model. Command:

    bash run_server_2pass.sh \
    --download-model-dir /workspace/models \
    --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
    --model-dir damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1 \
    --online-model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx \
    --punc-dir damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727-onnx \
    --itn-dir thuduj12/fst_itn_zh
  3. Test the 2pass pipeline; the offline output is wrong. Test command: ./funasr-wss-client-2pass --server-ip 127.0.0.1 --port 10095 --mode 2pass --wav-path path_to/audio/asr_example.wav --is-ssl 0

Error log

I20240123 16:43:00.020056 17773 funasr-wss-client-2pass.cpp:108] Thread: 140381783238400,on_message = {"is_final":false,"mode":"2pass-online","text":"欢迎","wav_name":"wav_default_id"}
I20240123 16:43:00.103735 17773 funasr-wss-client-2pass.cpp:108] Thread: 140381783238400,on_message = {"is_final":false,"mode":"2pass-online","text":"大家来","wav_name":"wav_default_id"}
I20240123 16:43:00.189242 17773 funasr-wss-client-2pass.cpp:108] Thread: 140381783238400,on_message = {"is_final":false,"mode":"2pass-online","text":"体验达","wav_name":"wav_default_id"}
I20240123 16:43:00.273144 17773 funasr-wss-client-2pass.cpp:108] Thread: 140381783238400,on_message = {"is_final":false,"mode":"2pass-online","text":"摩院推","wav_name":"wav_default_id"}
I20240123 16:43:00.356848 17773 funasr-wss-client-2pass.cpp:108] Thread: 140381783238400,on_message = {"is_final":false,"mode":"2pass-online","text":"出的","wav_name":"wav_default_id"}
I20240123 16:43:00.440845 17773 funasr-wss-client-2pass.cpp:108] Thread: 140381783238400,on_message = {"is_final":false,"mode":"2pass-online","text":"语音识","wav_name":"wav_default_id"}
I20240123 16:43:00.524246 17773 funasr-wss-client-2pass.cpp:108] Thread: 140381783238400,on_message = {"is_final":false,"mode":"2pass-online","text":"别模","wav_name":"wav_default_id"}
I20240123 16:43:00.740365 17773 funasr-wss-client-2pass.cpp:108] Thread: 140381783238400,on_message = {"is_final":true,"mode":"2pass-offline","text":"轨avenue瑮疚慰父,痤逮work,蚂sha冫,鳌困going退,et think眦。","wav_name":"wav_default_id"}

Symptom: the online model decodes correctly, but the offline model decodes incorrectly.
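For anyone triaging a similar log, here is a minimal sketch that separates the streaming (2pass-online) results from the final (2pass-offline) result. It assumes each result line contains the marker `on_message = ` followed by a JSON payload, as in the log above; the helper names `parse_client_log` and `split_by_pass` are made up for illustration and are not part of FunASR.

```python
import json

# Two lines copied from the client log above: one streaming result, one final result.
LOG_LINES = [
    'I20240123 16:43:00.020056 17773 funasr-wss-client-2pass.cpp:108] Thread: 140381783238400,on_message = {"is_final":false,"mode":"2pass-online","text":"欢迎","wav_name":"wav_default_id"}',
    'I20240123 16:43:00.740365 17773 funasr-wss-client-2pass.cpp:108] Thread: 140381783238400,on_message = {"is_final":true,"mode":"2pass-offline","text":"轨avenue瑮疚慰父,痤逮work,蚂sha冫,鳌困going退,et think眦。","wav_name":"wav_default_id"}',
]

MARKER = "on_message = "  # assumption: every result line carries this marker


def parse_client_log(lines):
    """Return (mode, text) pairs for every line that carries a JSON payload."""
    results = []
    for line in lines:
        _, sep, payload = line.partition(MARKER)
        if not sep:
            continue  # not a result line, skip it
        msg = json.loads(payload)
        results.append((msg["mode"], msg["text"]))
    return results


def split_by_pass(results):
    """Join the streaming chunks and collect the final offline results."""
    online = "".join(text for mode, text in results if mode == "2pass-online")
    offline = [text for mode, text in results if mode == "2pass-offline"]
    return online, offline


if __name__ == "__main__":
    online_text, offline_texts = split_by_pass(parse_client_log(LOG_LINES))
    print("online :", online_text)
    print("offline:", offline_texts)
```

Comparing the joined online text with the offline text makes it easy to see that only the second pass is garbled.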

Code sample

Expected behavior

It looks like the offline model's tokenizer is loaded incorrectly internally, which garbles the recognition output in the offline stage.

I hope loading models with different configs can be supported, so that models with different tokenizers can be used in combination.
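To illustrate the suspected failure mode, here is a toy sketch of what happens when token IDs produced against one vocabulary are decoded with another. The two vocabularies below are invented for the example and are not the real FunASR token lists, and this is not FunASR's actual loading code; it only shows why a tokenizer mix-up would produce output that looks like the garbled offline result.

```python
# Toy vocabularies (hypothetical, NOT the real 8358 / 8404 token lists).
VOCAB_A = ["<blank>", "欢", "迎", "大", "家"]
VOCAB_B = ["<blank>", "<s>", "</s>", "going", "work", "欢", "迎", "大", "家"]


def encode(tokens, vocab):
    """Map tokens to their integer IDs in the given vocabulary."""
    index = {tok: i for i, tok in enumerate(vocab)}
    return [index[tok] for tok in tokens]


def decode(token_ids, vocab):
    """Map integer IDs back to tokens and join them into a string."""
    return "".join(vocab[i] for i in token_ids)


if __name__ == "__main__":
    ids = encode(["欢", "迎", "大", "家"], VOCAB_A)
    # Decoding with the matching vocabulary recovers the text.
    print(decode(ids, VOCAB_A))
    # Decoding the same IDs with a different vocabulary yields unrelated tokens.
    print(decode(ids, VOCAB_B))
```

The acoustic model can be perfectly fine in this scenario; only the ID-to-token mapping is wrong, which matches the symptom of fluent online output and nonsensical offline output.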

Environment

Additional context

lyblsgo commented 8 months ago

Currently, the software package requires the offline and online models to share the same configuration; loading two models with different configurations for offline and online use is not supported. We will evaluate later whether to support this.
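Until that is supported, a rough pre-flight check can flag mismatched pairs before the server is started. The sketch below assumes the `vocabNNNN` fragment in a model ID reflects the tokenizer's vocabulary size, which is a naming convention inferred from the IDs in this issue and not a documented guarantee; `vocab_size_from_id` and `same_vocab` are hypothetical helpers, not FunASR functions.

```python
import re

# Model IDs taken from the server command in this issue.
OFFLINE_ID = "damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1"
ONLINE_ID = "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx"


def vocab_size_from_id(model_id):
    """Return the integer following 'vocab' in a model ID, or None if absent."""
    match = re.search(r"vocab(\d+)", model_id)
    return int(match.group(1)) if match else None


def same_vocab(offline_id, online_id):
    """True only when both IDs advertise a vocabulary size and the sizes match."""
    offline_size = vocab_size_from_id(offline_id)
    online_size = vocab_size_from_id(online_id)
    return offline_size is not None and offline_size == online_size


if __name__ == "__main__":
    if not same_vocab(OFFLINE_ID, ONLINE_ID):
        print("warning: offline and online models appear to use different vocabularies")
```

Equal sizes do not prove the token lists are identical, so this only catches obvious mismatches such as the 8358 versus 8404 pair reported here.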

chenyangMl commented 8 months ago

Thanks for your response. The small model is less accurate but much faster, so there is a real need to combine two models with different configurations. This would be a nice feature for many users.