modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

Error when testing the usage example for speech_fsmn_vad_zh-cn-16k-common-pytorch #1373

Closed zhaozhaodf closed 6 months ago

zhaozhaodf commented 6 months ago

🐛 Bug

I followed the "Paraformer speaker-diarized speech recognition, Chinese, general" model page, which uses iic/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn.

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

1. Run the usage example:

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

if __name__ == '__main__':
    audio_in = 'https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_speaker_demo.wav'
    output_dir = "./results"
    inference_pipeline = pipeline(
        task=Tasks.auto_speech_recognition,
        model='iic/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn',
        model_revision='v2.0.4',
        vad_model='iic/speech_fsmn_vad_zh-cn-16k-common-pytorch',
        vad_model_revision="v2.0.4",
        punc_model='iic/punc_ct-transformer_cn-en-common-vocab471067-large',
        punc_model_revision="v2.0.4",
        output_dir=output_dir,
    )
    rec_result = inference_pipeline(audio_in, batch_size_s=300, batch_size_token_threshold_s=40)
    print(rec_result)

2. See error

DEBUG:jieba:Prefix dict has been built successfully.
/home/gmt/miniconda3/envs/funasr/lib/python3.8/site-packages/sklearn/cluster/_kmeans.py:1416: FutureWarning: The default value of n_init will change from 10 to 'auto' in 1.4. Set the value of n_init explicitly to suppress the warning
  super()._check_params_vs_input(X, default_n_init=10)
Traceback (most recent call last):
  File "t2.py", line 15, in <module>
    rec_result = inference_pipeline(audio_in)
  File "/home/gmt/miniconda3/envs/funasr/lib/python3.8/site-packages/modelscope/pipelines/audio/funasr_pipeline.py", line 73, in __call__
    output = self.model(*args, **kwargs)
  File "/home/gmt/miniconda3/envs/funasr/lib/python3.8/site-packages/modelscope/models/base/base_model.py", line 35, in __call__
    return self.postprocess(self.forward(*args, **kwargs))
  File "/home/gmt/miniconda3/envs/funasr/lib/python3.8/site-packages/modelscope/models/audio/funasr/model.py", line 61, in forward
    output = self.model.generate(*args, **kwargs)
  File "/home/gmt/miniconda3/envs/funasr/lib/python3.8/site-packages/funasr/auto/auto_model.py", line 206, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
  File "/home/gmt/miniconda3/envs/funasr/lib/python3.8/site-packages/funasr/auto/auto_model.py", line 407, in inference_with_vad
    result['raw_text'])
KeyError: 'raw_text'
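
The traceback ends with a plain dictionary lookup failing on the key 'raw_text'. As a generic illustration of that failure mode, and not the fix that was applied inside funasr, the sketch below contrasts direct indexing with a lookup that falls back to another key. The helper `get_text` and both sample dictionaries are hypothetical; they are not real funasr output.

```python
from typing import Any, Dict


def get_text(result: Dict[str, Any]) -> str:
    """Prefer 'raw_text', fall back to 'text', else return an empty string."""
    return result.get("raw_text", result.get("text", ""))


# Hypothetical result dictionaries, made up for this illustration.
with_raw = {"text": "hello, world.", "raw_text": "hello world"}
without_raw = {"text": "hello, world."}

try:
    without_raw["raw_text"]  # direct indexing raises when the key is absent
except KeyError as err:
    print("KeyError:", err)

print(get_text(with_raw))
print(get_text(without_raw))
```

Code that indexes a result dictionary directly only works when every upstream stage writes that key, which is why a missing entry surfaces as a KeyError deep inside the library rather than in the calling script.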

Code sample

Expected behavior

Environment

Additional context

LauraGPT commented 6 months ago

Thank you for your feedback. The bug has been fixed; please update funasr and try again.
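
To confirm which funasr version is installed in the active environment before and after upgrading, a minimal sketch using only the standard library might look like this. The helper name `installed_version` is made up for illustration, and since the thread does not say which release contains the fix, no version number is assumed.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report both packages that appear in the traceback above.
    for name in ("funasr", "modelscope"):
        print(name, installed_version(name) or "not installed")
```

Running this inside the same conda environment that produced the traceback avoids checking a different Python installation by mistake.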

R1ckShi commented 6 months ago

For now, please use FunASR's AutoModel instead of the modelscope pipeline. Run the model as follows:

from funasr import AutoModel
# paraformer-zh is a multi-functional asr model
# use vad, punc, spk or not as you need
model = AutoModel(model="paraformer-zh", model_revision="v2.0.4",
                  vad_model="fsmn-vad", vad_model_revision="v2.0.4",
                  punc_model="ct-punc-c", punc_model_revision="v2.0.4",
                  # spk_model="cam++", spk_model_revision="v2.0.2",
                  )
res = model.generate(input=f"{model.model_path}/example/asr_example.wav", 
                     batch_size_s=300, 
                     hotword='魔搭')
print(res)