TypeError: expected Tensor as element 1 in argument 0, but got str

OswaldoBornemann commented 8 months ago

When I tried to run the following code with the punctuation model, I got this error: TypeError: expected Tensor as element 1 in argument 0, but got str.

from funasr import AutoModel

punc_model = AutoModel(model="ct-punc", model_revision="v2.0.4")

res = model.generate(input="那今天的会就到这里吧 happy new year 明年见")
print(res)

The full traceback is:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[56], line 5
      1 from funasr import AutoModel
      3 punc_model = AutoModel(model="ct-punc", model_revision="v2.0.4")
----> 5 res = model.generate(input="那今天的会就到这里吧 happy new year 明年见")
      6 print(res)

File ~/anaconda3/envs/tf/lib/python3.9/site-packages/funasr/auto/auto_model.py:206, in AutoModel.generate(self, input, input_len, **cfg)
    203     return self.inference(input, input_len=input_len, **cfg)
    205 else:
--> 206     return self.inference_with_vad(input, input_len=input_len, **cfg)

File ~/anaconda3/envs/tf/lib/python3.9/site-packages/funasr/auto/auto_model.py:270, in AutoModel.inference_with_vad(self, input, input_len, **cfg)
    268 self.vad_kwargs.update(cfg)
    269 beg_vad = time.time()
--> 270 res = self.inference(input, input_len=input_len, model=self.vad_model, kwargs=self.vad_kwargs, **cfg)
    271 end_vad = time.time()
    272 print(f"time cost vad: {end_vad - beg_vad:0.3f}")

File ~/anaconda3/envs/tf/lib/python3.9/site-packages/funasr/auto/auto_model.py:237, in AutoModel.inference(self, input, input_len, model, kwargs, key, **cfg)
    235 time1 = time.perf_counter()
    236 with torch.no_grad():
--> 237     results, meta_data = model.inference(**batch, **kwargs)
    238 time2 = time.perf_counter()
    240 asr_result_list.extend(results)

File ~/anaconda3/envs/tf/lib/python3.9/site-packages/funasr/models/fsmn_vad_streaming/model.py:592, in FsmnVADStreaming.inference(self, data_in, data_lengths, key, tokenizer, frontend, cache, **kwargs)
    589 meta_data["load_data"] = f"{time2 - time1:0.3f}"
    590 assert len(audio_sample_list) == 1, "batch_size must be set 1"
--> 592 audio_sample = torch.cat((cache["prev_samples"], audio_sample_list[0]))
    594 n = int(len(audio_sample) // chunk_stride_samples + int(_is_final))
    595 m = int(len(audio_sample) % chunk_stride_samples * (1 - int(_is_final)))

TypeError: expected Tensor as element 1 in argument 0, but got str

LauraGPT commented 8 months ago

ct-punc is a punctuation model, but your error is raised by the VAD model (fsmn_vad_streaming in the traceback).
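
One possible reading of the snippet and traceback above, offered as an assumption rather than a confirmed diagnosis: the code builds punc_model but then calls model.generate, so model may be a different AutoModel left over from an earlier notebook cell, one that was configured with a VAD model. A minimal sketch that calls generate on the punctuation model itself follows. It reuses only the AutoModel arguments from the original snippet; the wrapper name punctuate is hypothetical.

```python
def punctuate(text: str):
    """Run the ct-punc model from the original snippet on a piece of text."""
    # Imported lazily so this module can be loaded without funasr installed.
    from funasr import AutoModel

    punc_model = AutoModel(model="ct-punc", model_revision="v2.0.4")
    # Call generate() on the object that was just built, not on another
    # variable such as `model` that may hold a different pipeline.
    return punc_model.generate(input=text)


# Usage (downloads the model on first run):
# print(punctuate("那今天的会就到这里吧 happy new year 明年见"))
```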

OswaldoBornemann commented 8 months ago

Yeah, that is the weirdest part. I did not change anything; I only cloned the latest FunASR code. After converting the audio to 16 kHz, the code ran successfully.
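
Since converting the audio to 16 kHz is what made the code run here, it can help to check a file's sample rate before passing it to the pipeline. The sketch below uses only the Python standard library and handles uncompressed WAV files only. The function names and the 16000 default are assumptions for illustration, and the thread does not establish that the sample rate is the actual cause of the TypeError.

```python
import wave


def wav_sample_rate(path: str) -> int:
    """Return the sample rate (frames per second) stored in a WAV header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_16k(path: str, expected_rate: int = 16000) -> bool:
    """Report whether the WAV file at `path` uses the expected sample rate."""
    return wav_sample_rate(path) == expected_rate
```

This only inspects the header; resampling itself would need an audio tool or library of your choice.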

Chanli520 commented 6 months ago

ckpt: C:\Users\54574.cache\modelscope\hub\iic\punc_ct-transformer_cn-en-common-vocab471067-large\model.pt
2024-04-13 22:52:56,965 - modelscope - INFO - Use user-specified model revision: v2.0.2
ckpt: C:\Users\54574.cache\modelscope\hub\iic\speech_campplus_sv_zh-cn_16k-common\campplus_cn_common.bin
2024-04-13 22:52:57,692 - modelscope - WARNING - No preprocessor field found in cfg.
2024-04-13 22:52:57,692 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2024-04-13 22:52:57,692 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\Users\54574\.cache\modelscope\hub\iic\speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn'}. trying to build by task and model information.
2024-04-13 22:52:57,693 - modelscope - WARNING - No preprocessor key ('funasr', 'auto-speech-recognition') found in PREPROCESSOR_MAP, skip building preprocessor.
2024-04-13 22:52:57,696 - modelscope - INFO - cuda is not available, using cpu instead.
0%| | 0/1 [00:00<?, ?it/s]tensor([]) 测试
Traceback (most recent call last):
  File "D:\FunASR\demo.py", line 18, in
    rec_result = inference_pipeline(audio_in,batch_size_s=300, batch_size_token_threshold_s=40)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Python311\Lib\site-packages\modelscope\pipelines\audio\funasr_pipeline.py", line 73, in call
    output = self.model(*args, kwargs)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Python311\Lib\site-packages\modelscope\models\base\base_model.py", line 35, in call
    return self.postprocess(self.forward(*args, *kwargs))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Python311\Lib\site-packages\modelscope\models\audio\funasr\model.py", line 61, in forward
    output = self.model.generate(args, kwargs)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\FunASR\funasr\auto\auto_model.py", line 230, in generate
    return self.inference_with_vad(input, input_len=input_len, cfg)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\FunASR\funasr\auto\auto_model.py", line 299, in inference_with_vad
    res = self.inference(input, input_len=input_len, model=self.vad_model, kwargs=self.vad_kwargs, cfg)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\FunASR\funasr\auto\auto_model.py", line 263, in inference
    res = model.inference(batch, kwargs)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\FunASR\funasr\models\fsmn_vad_streaming\model.py", line 599, in inference
    audio_sample = torch.cat((cache["prev_samples"], audio_sample_list[0]))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected Tensor as element 1 in argument 0, but got str
0%| | 0/1 [00:00<?, ?it/s]
PS D:\FunASR>

I ran into this too. How did you solve it?

Chanli520 commented 6 months ago

Is Python 3.11.5 supported?

yang0 commented 5 months ago

Same here. Running the example as-is fails:

(fun) G:\temp>funasr ++model=paraformer-zh ++vad_model="fsmn-vad" ++punc_model="ct-punc" ++input=a.wav
[2024-04-22 07:04:03,193][root][INFO] - download models from model hub: ms
2024-04-22 07:04:03,493 - modelscope - INFO - PyTorch version 2.2.2 Found.
2024-04-22 07:04:03,493 - modelscope - INFO - Loading ast index from C:\Users\yang0.cache\modelscope\ast_indexer
2024-04-22 07:04:03,576 - modelscope - INFO - Loading done! Current index file version is 1.13.3, with md5 6d626e81f17aa7d971d64a8780f635f1 and a total number of 972 components indexed
2024-04-22 07:04:04,025 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-04-22 07:04:04,025 - modelscope - INFO - Use user-specified model revision: master
[2024-04-22 07:04:05,349][root][INFO] - Loading pretrained params from C:\Users\yang0.cache\modelscope\hub\iic\speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\model.pt
ckpt: C:\Users\yang0.cache\modelscope\hub\iic\speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\model.pt
[2024-04-22 07:04:05,794][root][INFO] - Building VAD model.
[2024-04-22 07:04:05,794][root][INFO] - download models from model hub: ms
2024-04-22 07:04:06,319 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-04-22 07:04:06,320 - modelscope - INFO - Use user-specified model revision: master
[2024-04-22 07:04:06,678][root][INFO] - Loading pretrained params from C:\Users\yang0.cache\modelscope\hub\iic\speech_fsmn_vad_zh-cn-16k-common-pytorch\model.pt
ckpt: C:\Users\yang0.cache\modelscope\hub\iic\speech_fsmn_vad_zh-cn-16k-common-pytorch\model.pt
[2024-04-22 07:04:06,680][root][INFO] - Building punc model.
[2024-04-22 07:04:06,680][root][INFO] - download models from model hub: ms
2024-04-22 07:04:07,133 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-04-22 07:04:07,133 - modelscope - INFO - Use user-specified model revision: master
Building prefix dict from the default dictionary ...
[2024-04-22 07:04:09,077][jieba][DEBUG] - Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\yang0\AppData\Local\Temp\jieba.cache
[2024-04-22 07:04:09,095][jieba][DEBUG] - Loading model from cache C:\Users\yang0\AppData\Local\Temp\jieba.cache
Loading model cost 0.532 seconds.
[2024-04-22 07:04:09,627][jieba][DEBUG] - Loading model cost 0.532 seconds.
Prefix dict has been built successfully.
[2024-04-22 07:04:09,627][jieba][DEBUG] - Prefix dict has been built successfully.
[2024-04-22 07:04:42,312][root][INFO] - Loading pretrained params from C:\Users\yang0.cache\modelscope\hub\iic\punc_ct-transformer_cn-en-common-vocab471067-large\model.pt
ckpt: C:\Users\yang0.cache\modelscope\hub\iic\punc_ct-transformer_cn-en-common-vocab471067-large\model.pt
0%| | 0/1 [00:00<?, ?it/s]Error executing job with overrides: ['++model=paraformer-zh', '++vad_model=fsmn-vad', '++punc_model=ct-punc', '++input=a.wav']
Traceback (most recent call last):
  File "G:\devtools\anaconda3\envs\fun\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "G:\devtools\anaconda3\envs\fun\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "G:\devtools\anaconda3\envs\fun\Scripts\funasr.exe__main__.py", line 7, in
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra\main.py", line 94, in decorated_main
    _run_hydra(
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\utils.py", line 394, in _run_hydra
    _run_app(
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\utils.py", line 457, in _run_app
    run_and_report(
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\utils.py", line 223, in run_and_report
    raise ex
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\utils.py", line 220, in run_and_report
    return func()
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\utils.py", line 458, in
    lambda: hydra.run(
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra_internal\hydra.py", line 132, in run
    = ret.return_value
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra\core\utils.py", line 260, in return_value
    raise self._return_value
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\hydra\core\utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\funasr\bin\inference.py", line 26, in main_hydra
    res = model.generate(input=kwargs["input"])
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\funasr\auto\auto_model.py", line 232, in generate
    return self.inference_with_vad(input, input_len=input_len, cfg)
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\funasr\auto\auto_model.py", line 301, in inference_with_vad
    res = self.inference(input, input_len=input_len, model=self.vad_model, kwargs=self.vad_kwargs, cfg)
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\funasr\auto\auto_model.py", line 265, in inference
    res = model.inference(batch, kwargs)
  File "G:\devtools\anaconda3\envs\fun\lib\site-packages\funasr\models\fsmn_vad_streaming\model.py", line 599, in inference
    audio_sample = torch.cat((cache["prev_samples"], audio_sample_list[0]))
TypeError: expected Tensor as element 1 in argument 0, but got str
0%| | 0/1 [00:00<?, ?it/s]

yang0 commented 5 months ago

My environment:

(fun) G:\temp>conda info

    active environment : fun
   active env location : G:\devtools\anaconda3\envs\fun
           shell level : 2
      user config file : C:\Users\yang0\.condarc
populated config files : C:\Users\yang0.condarc
         conda version : 23.7.4
   conda-build version : 3.26.1
        python version : 3.11.5.final.0
      virtual packages : __archspec=1=x86_64
                         cuda=12.4=0
                         win=0=0
      base environment : g:\devtools\anaconda3 (writable)
     conda av data dir : g:\devtools\anaconda3\etc\conda
 conda av metadata url : None
          channel URLs : https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/win-64
                         https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/noarch
                         https://repo.anaconda.com/pkgs/main/win-64
                         https://repo.anaconda.com/pkgs/main/noarch
                         https://repo.anaconda.com/pkgs/r/win-64
                         https://repo.anaconda.com/pkgs/r/noarch
                         https://repo.anaconda.com/pkgs/msys2/win-64
                         https://repo.anaconda.com/pkgs/msys2/noarch
         package cache : g:\devtools\anaconda3\pkgs
                         C:\Users\yang0.conda\pkgs
                         C:\Users\yang0\AppData\Local\conda\conda\pkgs
      envs directories : G:\devtools\anaconda3\envs
                         g:\devtools\anaconda3\envs
                         C:\Users\yang0.conda\envs
                         C:\Users\yang0\AppData\Local\conda\conda\envs
              platform : win-64
            user-agent : conda/23.7.4 requests/2.31.0 CPython/3.11.5 Windows/10 Windows/10.0.22631 aau/0.4.2 c/MhaN_RiD1alBYT6u-FFVqQ s/xZuz-h3TUT_m2zOea5dIPQ e/k03uztLW2umqpLLFo8ut2A
         administrator : False
            netrc file : None
          offline mode : False

seanzhang-zhichen commented 4 months ago

Same problem on Windows.

loxs123 commented 4 months ago

I suspect it may be related to this code:

# funasr/auto/auto_model.py
def prepare_data_iterator(data_in, input_len=None, data_type=None, key=None):
    """ """
    data_list = []
    key_list = []
    filelist = [".scp", ".txt", ".json", ".jsonl", ".text"]

    chars = string.ascii_letters + string.digits
    if isinstance(data_in, str) and data_in.startswith("http"):  # url
        data_in = download_from_url(data_in)

    if isinstance(data_in, str) and os.path.exists(
        data_in
    ):  # wav_path; filelist: wav.scp, file.jsonl;text.txt;
        _, file_extension = os.path.splitext(data_in)
        file_extension = file_extension.lower()
        if file_extension in filelist:  # filelist: wav.scp, file.jsonl;text.txt;
            with open(data_in, encoding="utf-8") as fin:
                for line in fin:
                    key = "rand_key_" + "".join(random.choice(chars) for _ in range(13))
                    if data_in.endswith(".jsonl"):  # file.jsonl: json.dumps({"source": data})
                        lines = json.loads(line.strip())
                        data = lines["source"]
                        key = data["key"] if "key" in data else key
                    else:  # filelist, wav.scp, text.txt: id \t data or data
                        lines = line.strip().split(maxsplit=1)
                        data = lines[1] if len(lines) > 1 else lines[0]
                        key = lines[0] if len(lines) > 1 else key

                    data_list.append(data)
                    key_list.append(key)
        else:
            if key is None:
                # key = "rand_key_" + "".join(random.choice(chars) for _ in range(13))
                key = misc.extract_filename_without_extension(data_in)
            data_list = [data_in]
            key_list = [key]
    elif isinstance(data_in, (list, tuple)):
        if data_type is not None and isinstance(data_type, (list, tuple)):  # mutiple inputs
            data_list_tmp = []
            for data_in_i, data_type_i in zip(data_in, data_type):
                key_list, data_list_i = prepare_data_iterator(
                    data_in=data_in_i, data_type=data_type_i
                )
                data_list_tmp.append(data_list_i)
            data_list = []
            for item in zip(*data_list_tmp):
                data_list.append(item)
        else:
            # [audio sample point, fbank, text]
            data_list = data_in
            key_list = []
            for data_i in data_in:
                if isinstance(data_i, str) and os.path.exists(data_i):
                    key = misc.extract_filename_without_extension(data_i)
                else:
                    key = "rand_key_" + "".join(random.choice(chars) for _ in range(13))
                key_list.append(key)

    else:  # raw text; audio sample point, fbank; bytes
        if isinstance(data_in, bytes):  # audio bytes
            data_in = load_bytes(data_in)
        if key is None:
            key = "rand_key_" + "".join(random.choice(chars) for _ in range(13))
        data_list = [data_in]
        key_list = [key]

    return key_list, data_list
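
To make the branching in the quoted function easier to follow, here is a simplified, standalone sketch of how it routes a single input. It is not the library code: classify_input and its return labels are hypothetical, and the URL case is reduced to a label instead of downloading. The point it illustrates is that a string which is not a URL and does not exist on disk falls through to the last branch and is kept as a plain str. Whether that str is what later reaches torch.cat is consistent with the traceback but not confirmed in this thread.

```python
import os


def classify_input(data_in):
    """Name the branch a given input would take (simplified sketch)."""
    filelist = (".scp", ".txt", ".json", ".jsonl", ".text")
    if isinstance(data_in, str) and data_in.startswith("http"):
        return "url"  # the quoted code downloads the file at this point
    if isinstance(data_in, str) and os.path.exists(data_in):
        extension = os.path.splitext(data_in)[1].lower()
        return "filelist" if extension in filelist else "existing_path"
    if isinstance(data_in, (list, tuple)):
        return "sequence"
    # Raw text, audio samples, bytes, or a path string that does not exist.
    return "raw"
```

Under this reading, a mistyped or unresolved audio path is treated the same way as raw text.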

Micous commented 3 months ago

Same here. I get the same error from the same command; the log and traceback are identical to the ones yang0 posted above.

It is a path problem on Windows machines.
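
If the cause is indeed a path that does not resolve, one way to surface it early is to validate the input path before handing it to the model, so that a missing file fails with a clear FileNotFoundError instead of a TypeError deep inside the VAD model. A minimal sketch using only the standard library follows; the helper name resolve_existing_file is hypothetical and not part of FunASR.

```python
from pathlib import Path


def resolve_existing_file(path_str: str) -> str:
    """Return an absolute path for `path_str`, or raise if no such file exists."""
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Input file not found: {path}")
    return str(path)


# Usage, assuming `model` is an already constructed pipeline:
# res = model.generate(input=resolve_existing_file("a.wav"))
```

Passing the resolved absolute path also removes any dependence on the current working directory.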