MarchLiu closed this issue 2 months ago
Please check your wav file.
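As a quick sanity check before blaming the model, a stdlib-only sketch can confirm the file is a readable PCM wav and report its header (`inspect_wav` and `looks_ok_for_16k_models` are hypothetical helpers, not part of funasr). The models in this thread expect 16 kHz mono audio:

```python
import wave

def inspect_wav(path):
    """Return basic header info for a PCM wav file."""
    with wave.open(path, "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "sample_rate": wf.getframerate(),
            "sample_width_bytes": wf.getsampwidth(),
            "n_frames": wf.getnframes(),
        }

def looks_ok_for_16k_models(info):
    """Heuristic: 16 kHz mono is the safest input for these ASR/VAD models."""
    return info["channels"] == 1 and info["sample_rate"] == 16000
```

If `wave.open` raises an error, or the file is stereo or at a different sample rate, resampling it to 16 kHz mono first (e.g. with ffmpeg or sox) is worth trying.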
This wav file can be processed normally with the following Python script:
import sys
from funasr import AutoModel

model_dir = "iic/SenseVoiceSmall"
model = AutoModel(
    model=model_dir,
    vad_model="fsmn-vad",
)
res = model.generate(
    input=sys.argv[1],
    cache={},
    language="auto",  # "zh", "en", "yue", "ja", "ko", "nospeech"
    use_itn=True,
    batch_size_s=60,
    merge_vad=True,
    merge_length_s=15,
)
if len(sys.argv) >= 3:
    with open(sys.argv[2], "w+") as f:
        f.write(res[0]["text"])
else:
    print(res[0]["text"])
Same problem, any solution?
docker pull modelscope-registry.cn-hangzhou.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda12.1.0-py310-torch2.3.0-tf2.16.1-1.18.0
Then inside the container, run
funasr ++model=paraformer-zh ++vad_model="fsmn-vad" ++punc_model="ct-punc" ++input=asr_example_zh.wav
It throws this error:
funasr version: 1.1.6.
funasr checks for updates on startup, which can take some time. You may disable it by setting `disable_update=True` in AutoModel.
You are using the latest version of funasr-1.1.6
[2024-09-11 17:56:09,280][root][INFO] - download models from model hub: ms
2024-09-11 17:56:10,420 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
[2024-09-11 17:56:13,379][root][INFO] - Loading pretrained params from /mnt/workspace/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt
[2024-09-11 17:56:13,386][root][INFO] - ckpt: /mnt/workspace/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt
[2024-09-11 17:56:13,926][root][INFO] - scope_map: ['module.', 'None']
[2024-09-11 17:56:13,926][root][INFO] - excludes: None
[2024-09-11 17:56:14,073][root][INFO] - Loading ckpt: /mnt/workspace/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt, status: <All keys matched successfully>
[2024-09-11 17:56:14,085][root][INFO] - Building VAD model.
[2024-09-11 17:56:14,086][root][INFO] - download models from model hub: ms
2024-09-11 17:56:14,371 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
[2024-09-11 17:56:14,754][root][INFO] - Loading pretrained params from /mnt/workspace/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt
[2024-09-11 17:56:14,755][root][INFO] - ckpt: /mnt/workspace/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt
[2024-09-11 17:56:14,759][root][INFO] - scope_map: ['module.', 'None']
[2024-09-11 17:56:14,759][root][INFO] - excludes: None
[2024-09-11 17:56:14,761][root][INFO] - Loading ckpt: /mnt/workspace/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt, status: <All keys matched successfully>
[2024-09-11 17:56:14,762][root][INFO] - Building punc model.
[2024-09-11 17:56:14,762][root][INFO] - download models from model hub: ms
2024-09-11 17:56:15,163 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
Building prefix dict from the default dictionary ...
[2024-09-11 17:56:18,147][jieba][DEBUG] - Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
[2024-09-11 17:56:18,147][jieba][DEBUG] - Loading model from cache /tmp/jieba.cache
Loading model cost 0.806 seconds.
[2024-09-11 17:56:18,953][jieba][DEBUG] - Loading model cost 0.806 seconds.
Prefix dict has been built successfully.
[2024-09-11 17:56:18,953][jieba][DEBUG] - Prefix dict has been built successfully.
[2024-09-11 17:56:45,639][root][INFO] - Loading pretrained params from /mnt/workspace/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt
[2024-09-11 17:56:45,640][root][INFO] - ckpt: /mnt/workspace/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt
[2024-09-11 17:56:46,113][root][INFO] - scope_map: ['module.', 'None']
[2024-09-11 17:56:46,113][root][INFO] - excludes: None
[2024-09-11 17:56:46,246][root][INFO] - Loading ckpt: /mnt/workspace/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt, status: <All keys matched successfully>
0%| | 0/1 [00:00<?, ?it/s]Error executing job with overrides: ['++model=paraformer-zh', '++vad_model=fsmn-vad', '++punc_model=ct-punc', '++input=asr_example_zh.wav']
Traceback (most recent call last):
File "/usr/local/bin/funasr", line 8, in <module>
sys.exit(main_hydra())
File "/usr/local/lib/python3.10/site-packages/hydra/main.py", line 94, in decorated_main
_run_hydra(
File "/usr/local/lib/python3.10/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
_run_app(
File "/usr/local/lib/python3.10/site-packages/hydra/_internal/utils.py", line 457, in _run_app
run_and_report(
File "/usr/local/lib/python3.10/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
raise ex
File "/usr/local/lib/python3.10/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
return func()
File "/usr/local/lib/python3.10/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
lambda: hydra.run(
File "/usr/local/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
File "/usr/local/lib/python3.10/site-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/usr/local/lib/python3.10/site-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "/usr/local/lib/python3.10/site-packages/funasr/bin/inference.py", line 25, in main_hydra
res = model.generate(input=kwargs["input"])
File "/usr/local/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 263, in generate
return self.inference_with_vad(input, input_len=input_len, **cfg)
File "/usr/local/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 336, in inference_with_vad
res = self.inference(
File "/usr/local/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 302, in inference
res = model.inference(**batch, **kwargs)
File "/usr/local/lib/python3.10/site-packages/funasr/models/fsmn_vad_streaming/model.py", line 690, in inference
audio_sample = torch.cat((cache["prev_samples"], audio_sample_list[0]))
TypeError: expected Tensor as element 1 in argument 0, but got str
0%| | 0/1 [00:00<?, ?it/s]
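The `TypeError` shows the streaming VAD tried to `torch.cat` a string: `audio_sample_list[0]` was still the input path rather than decoded audio. One hedged workaround (a sketch, not a confirmed fix; passing a tensor to `generate` together with an `fs` kwarg is an assumption about the AutoModel API) is to decode the wav yourself and hand the model samples instead of a path:

```python
import struct
import wave

def load_pcm16(path):
    """Decode a 16-bit PCM mono wav into floats in [-1.0, 1.0)."""
    with wave.open(path, "rb") as wf:
        assert wf.getsampwidth() == 2, "expected 16-bit PCM"
        frames = wf.readframes(wf.getnframes())
    n = len(frames) // 2
    return [s / 32768.0 for s in struct.unpack("<%dh" % n, frames)]

# Hypothetical usage (assumes generate() accepts a waveform plus an fs kwarg):
# samples = torch.tensor(load_pcm16("asr_example_zh.wav"), dtype=torch.float32)
# res = model.generate(input=samples, fs=16000)
```

If pre-decoded input also fails, the problem is likely in how the CLI wires `++input` through to the VAD, rather than in the wav file itself.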
Notice: In order to resolve issues more efficiently, please raise your issue following the template.
🐛 Bug
To Reproduce
Steps to reproduce the behavior (always include the command you ran):
Code sample
The command line above comes from the example in the project README, run against a local wav recording.
Expected behavior
Environment
How you installed funasr (pip, source): pip
Additional context