Closed OswaldoBornemann closed 8 months ago
ct-punc is the punctuation model, but your error comes from the VAD model.
Yeah, that is the weirdest part. I did not change anything; I just cloned the latest FunASR code. After converting the audio to 16k, the code ran successfully.
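Since the fix reported here was converting the audio to 16k, it can help to check a file's sample rate before running inference. Below is a minimal sketch that uses only Python's standard `wave` module. The helper names `wav_info` and `is_16k` are hypothetical (they are not part of FunASR), and the code only reads uncompressed PCM WAV files.

```python
import wave


def wav_info(path):
    """Return (sample_rate_hz, channels) read from a PCM WAV header."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate(), wf.getnchannels()


def is_16k(path, expected_rate=16000):
    """True if the file's sample rate equals expected_rate (16 kHz by default)."""
    rate, _channels = wav_info(path)
    return rate == expected_rate
```

If `is_16k` returns False, the file would need resampling with a tool of your choice before being passed to the model.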
ckpt: C:\Users\54574.cache\modelscope\hub\iic\punc_ct-transformer_cn-en-common-vocab471067-large\model.pt
2024-04-13 22:52:56,965 - modelscope - INFO - Use user-specified model revision: v2.0.2
ckpt: C:\Users\54574.cache\modelscope\hub\iic\speech_campplus_sv_zh-cn_16k-common\campplus_cn_common.bin
2024-04-13 22:52:57,692 - modelscope - WARNING - No preprocessor field found in cfg.
2024-04-13 22:52:57,692 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2024-04-13 22:52:57,692 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\Users\54574\.cache\modelscope\hub\iic\speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn'}. trying to build by task and model information.
2024-04-13 22:52:57,693 - modelscope - WARNING - No preprocessor key ('funasr', 'auto-speech-recognition') found in PREPROCESSOR_MAP, skip building preprocessor.
2024-04-13 22:52:57,696 - modelscope - INFO - cuda is not available, using cpu instead.
0%| | 0/1 [00:00<?, ?it/s]tensor([]) 测试
Traceback (most recent call last): File "D:\FunASR\demo.py", line 18, in
Is Python 3.11.5 supported?
Same here; running the example directly fails.
(fun) G:\temp>funasr ++model=paraformer-zh ++vad_model="fsmn-vad" ++punc_model="ct-punc" ++input=a.wav
[2024-04-22 07:04:03,193][root][INFO] - download models from model hub: ms
2024-04-22 07:04:03,493 - modelscope - INFO - PyTorch version 2.2.2 Found.
2024-04-22 07:04:03,493 - modelscope - INFO - Loading ast index from C:\Users\yang0.cache\modelscope\ast_indexer
2024-04-22 07:04:03,576 - modelscope - INFO - Loading done! Current index file version is 1.13.3, with md5 6d626e81f17aa7d971d64a8780f635f1 and a total number of 972 components indexed
2024-04-22 07:04:04,025 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-04-22 07:04:04,025 - modelscope - INFO - Use user-specified model revision: master
[2024-04-22 07:04:05,349][root][INFO] - Loading pretrained params from C:\Users\yang0.cache\modelscope\hub\iic\speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\model.pt
ckpt: C:\Users\yang0.cache\modelscope\hub\iic\speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\model.pt
[2024-04-22 07:04:05,794][root][INFO] - Building VAD model.
[2024-04-22 07:04:05,794][root][INFO] - download models from model hub: ms
2024-04-22 07:04:06,319 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-04-22 07:04:06,320 - modelscope - INFO - Use user-specified model revision: master
[2024-04-22 07:04:06,678][root][INFO] - Loading pretrained params from C:\Users\yang0.cache\modelscope\hub\iic\speech_fsmn_vad_zh-cn-16k-common-pytorch\model.pt
ckpt: C:\Users\yang0.cache\modelscope\hub\iic\speech_fsmn_vad_zh-cn-16k-common-pytorch\model.pt
[2024-04-22 07:04:06,680][root][INFO] - Building punc model.
[2024-04-22 07:04:06,680][root][INFO] - download models from model hub: ms
2024-04-22 07:04:07,133 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-04-22 07:04:07,133 - modelscope - INFO - Use user-specified model revision: master
Building prefix dict from the default dictionary ...
[2024-04-22 07:04:09,077][jieba][DEBUG] - Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\yang0\AppData\Local\Temp\jieba.cache
[2024-04-22 07:04:09,095][jieba][DEBUG] - Loading model from cache C:\Users\yang0\AppData\Local\Temp\jieba.cache
Loading model cost 0.532 seconds.
[2024-04-22 07:04:09,627][jieba][DEBUG] - Loading model cost 0.532 seconds.
Prefix dict has been built successfully.
[2024-04-22 07:04:09,627][jieba][DEBUG] - Prefix dict has been built successfully.
[2024-04-22 07:04:42,312][root][INFO] - Loading pretrained params from C:\Users\yang0.cache\modelscope\hub\iic\punc_ct-transformer_cn-en-common-vocab471067-large\model.pt
ckpt: C:\Users\yang0.cache\modelscope\hub\iic\punc_ct-transformer_cn-en-common-vocab471067-large\model.pt
0%| | 0/1 [00:00<?, ?it/s]Error executing job with overrides: ['++model=paraformer-zh', '++vad_model=fsmn-vad', '++punc_model=ct-punc', '++input=a.wav']
Traceback (most recent call last):
File "G:\devtools\anaconda3\envs\fun\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "G:\devtools\anaconda3\envs\fun\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "G:\devtools\anaconda3\envs\fun\Scripts\funasr.exe__main__.py", line 7, in
My environment:
(fun) G:\temp>conda info
active environment : fun
active env location : G:\devtools\anaconda3\envs\fun
shell level : 2
user config file : C:\Users\yang0\.condarc
populated config files : C:\Users\yang0.condarc
conda version : 23.7.4
conda-build version : 3.26.1
python version : 3.11.5.final.0
virtual packages : __archspec=1=x86_64
    cuda=12.4=0
    win=0=0
base environment : g:\devtools\anaconda3 (writable)
conda av data dir : g:\devtools\anaconda3\etc\conda
conda av metadata url : None
channel URLs : https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/win-64
    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/noarch
    https://repo.anaconda.com/pkgs/main/win-64
    https://repo.anaconda.com/pkgs/main/noarch
    https://repo.anaconda.com/pkgs/r/win-64
    https://repo.anaconda.com/pkgs/r/noarch
    https://repo.anaconda.com/pkgs/msys2/win-64
    https://repo.anaconda.com/pkgs/msys2/noarch
package cache : g:\devtools\anaconda3\pkgs
    C:\Users\yang0.conda\pkgs
    C:\Users\yang0\AppData\Local\conda\conda\pkgs
envs directories : G:\devtools\anaconda3\envs
    g:\devtools\anaconda3\envs
    C:\Users\yang0.conda\envs
    C:\Users\yang0\AppData\Local\conda\conda\envs
platform : win-64
user-agent : conda/23.7.4 requests/2.31.0 CPython/3.11.5 Windows/10 Windows/10.0.22631 aau/0.4.2 c/MhaN_RiD1alBYT6u-FFVqQ s/xZuz-h3TUT_m2zOea5dIPQ e/k03uztLW2umqpLLFo8ut2A
administrator : False
netrc file : None
offline mode : False
Same problem on Windows.
I suspect it may be related to this code:
# funasr/auto/auto_model.py
def prepare_data_iterator(data_in, input_len=None, data_type=None, key=None):
    """ """
    data_list = []
    key_list = []
    filelist = [".scp", ".txt", ".json", ".jsonl", ".text"]
    chars = string.ascii_letters + string.digits
    if isinstance(data_in, str) and data_in.startswith("http"):  # url
        data_in = download_from_url(data_in)
    if isinstance(data_in, str) and os.path.exists(
        data_in
    ):  # wav_path; filelist: wav.scp, file.jsonl;text.txt;
        _, file_extension = os.path.splitext(data_in)
        file_extension = file_extension.lower()
        if file_extension in filelist:  # filelist: wav.scp, file.jsonl;text.txt;
            with open(data_in, encoding="utf-8") as fin:
                for line in fin:
                    key = "rand_key_" + "".join(random.choice(chars) for _ in range(13))
                    if data_in.endswith(".jsonl"):  # file.jsonl: json.dumps({"source": data})
                        lines = json.loads(line.strip())
                        data = lines["source"]
                        key = data["key"] if "key" in data else key
                    else:  # filelist, wav.scp, text.txt: id \t data or data
                        lines = line.strip().split(maxsplit=1)
                        data = lines[1] if len(lines) > 1 else lines[0]
                        key = lines[0] if len(lines) > 1 else key
                    data_list.append(data)
                    key_list.append(key)
        else:
            if key is None:
                # key = "rand_key_" + "".join(random.choice(chars) for _ in range(13))
                key = misc.extract_filename_without_extension(data_in)
            data_list = [data_in]
            key_list = [key]
    elif isinstance(data_in, (list, tuple)):
        if data_type is not None and isinstance(data_type, (list, tuple)):  # mutiple inputs
            data_list_tmp = []
            for data_in_i, data_type_i in zip(data_in, data_type):
                key_list, data_list_i = prepare_data_iterator(
                    data_in=data_in_i, data_type=data_type_i
                )
                data_list_tmp.append(data_list_i)
            data_list = []
            for item in zip(*data_list_tmp):
                data_list.append(item)
        else:
            # [audio sample point, fbank, text]
            data_list = data_in
            key_list = []
            for data_i in data_in:
                if isinstance(data_i, str) and os.path.exists(data_i):
                    key = misc.extract_filename_without_extension(data_i)
                else:
                    key = "rand_key_" + "".join(random.choice(chars) for _ in range(13))
                key_list.append(key)
    else:  # raw text; audio sample point, fbank; bytes
        if isinstance(data_in, bytes):  # audio bytes
            data_in = load_bytes(data_in)
        if key is None:
            key = "rand_key_" + "".join(random.choice(chars) for _ in range(13))
        data_list = [data_in]
        key_list = [key]
    return key_list, data_list
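One reading of the function above: a string that does not pass `os.path.exists` skips the path branch and falls through to the final `else`, where it is forwarded unchanged as if it were raw data. That would be consistent with the VAD model later receiving a `str`, although the thread does not confirm it. The sketch below is a simplified illustration of that branching (it ignores the URL and file-list cases), plus an early check that fails with a clearer message. Both helper names are hypothetical and are not part of FunASR.

```python
import os


def describe_input(data_in):
    """Report which top-level branch of the quoted logic a value would take (simplified)."""
    if isinstance(data_in, str) and os.path.exists(data_in):
        return "existing path"
    if isinstance(data_in, (list, tuple)):
        return "list of inputs"
    # A mistyped or non-existent path string ends up here, unchanged.
    return "raw data"


def require_existing_file(path):
    """Fail early with a clear message instead of a TypeError later in the pipeline."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"audio file not found: {path!r}")
    return os.path.abspath(path)
```

Calling `require_existing_file` on the input before handing it to the model turns a wrong path into an immediate `FileNotFoundError`.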
Same here; running the example directly fails.
(fun) G:\temp>funasr ++model=paraformer-zh ++vad_model="fsmn-vad" ++punc_model="ct-punc" ++input=a.wav
0%| | 0/1 [00:00<?, ?it/s]Error executing job with overrides: ['++model=paraformer-zh', '++vad_model=fsmn-vad', '++punc_model=ct-punc', '++input=a.wav']
Traceback (most recent call last):
  File "G:\devtools\anaconda3\envs\fun\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "G:\devtools\anaconda3\envs\fun\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "G:\devtools\anaconda3\envs\fun\Scripts\funasr.exe__main__.py", line 7, in <module>
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra\main.py", line 94, in decorated_main
    _run_hydra(
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\utils.py", line 394, in _run_hydra
    _run_app(
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\utils.py", line 457, in _run_app
    run_and_report(
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\utils.py", line 223, in run_and_report
    raise ex
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\utils.py", line 220, in run_and_report
    return func()
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\hydra.py", line 132, in run
    _ = ret.return_value
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra\core\utils.py", line 260, in return_value
    raise self._return_value
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra\core\utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\funasr\bin\inference.py", line 26, in main_hydra
    res = model.generate(input=kwargs["input"])
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\funasr\auto\auto_model.py", line 232, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\funasr\auto\auto_model.py", line 301, in inference_with_vad
    res = self.inference(input, input_len=input_len, model=self.vad_model, kwargs=self.vad_kwargs, **cfg)
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\funasr\auto\auto_model.py", line 265, in inference
    res = model.inference(**batch, **kwargs)
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\funasr\models\fsmn_vad_streaming\model.py", line 599, in inference
    audio_sample = torch.cat((cache["prev_samples"], audio_sample_list[0]))
TypeError: expected Tensor as element 1 in argument 0, but got str
0%| | 0/1 [00:00<?, ?it/s]
It is a path problem on Windows machines.
I ran into this problem too. After debugging for a long time, I found that the wav file name contained a space...
Same for me earlier; when writing the script, I had to wrap the path in double quotes for it to work.
You mean you don't normally add double quotes?
Without the double quotes, the problem with the space shows up; once I added them, it was fine.
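To illustrate why the quotes matter, here is a small sketch using Python's standard `shlex` module, which splits a command line using POSIX shell rules. The Windows command prompt differs in its details, so treat this as an illustration of the general effect rather than an exact model of `cmd.exe`. The file name `my audio.wav` and the helper `split_args` are hypothetical.

```python
import shlex

# The same command, without and with quotes around a file name containing a space.
UNQUOTED = 'funasr ++model=paraformer-zh ++input=my audio.wav'
QUOTED = 'funasr ++model=paraformer-zh ++input="my audio.wav"'


def split_args(command_line):
    """Split a command line into arguments the way a POSIX shell would."""
    return shlex.split(command_line)


if __name__ == "__main__":
    # Without quotes the file name is split into two separate arguments.
    print(split_args(UNQUOTED))
    # With quotes the whole file name stays inside one argument.
    print(split_args(QUOTED))
```

When launching the tool from a Python script, passing the arguments to `subprocess.run` as a list (one element per argument) avoids this kind of splitting.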
It turned out to be caused by an incorrect file path; check the path to your audio file.
When I tried to run the following code using the punctuation model, I got this error:
TypeError: expected Tensor as element 1 in argument 0, but got str
The full error is: