Closed. zxzxde closed this issue 10 months ago.
The speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch model does not support timestamp prediction. You can use:
from funasr import AutoModel

# paraformer-zh is a multi-functional ASR model;
# enable the vad, punc, spk models as needed
model = AutoModel(model="paraformer-zh", model_revision="v2.0.2",
                  vad_model="fsmn-vad", vad_model_revision="v2.0.2",
                  punc_model="ct-punc-c", punc_model_revision="v2.0.3",
                  # spk_model="cam++", spk_model_revision="v2.0.2",
                  )
res = model.generate(input=f"{model.model_path}/example/asr_example.wav",
                     batch_size_s=300,
                     hotword='魔搭')
print(res)
The paraformer-zh model corresponds to https://modelscope.cn/models/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary on ModelScope.
Running this model raises the following error (the Chinese message in the response means "failed to fetch model info: record not found"): requests.exceptions.HTTPError: Response details: {'Code': 10010205001, 'Message': '获取模型信息失败,信息:record not found', 'RequestId': '94ad2cf0-ab1e-4b63-a26c-ac535e554a89', 'Success': False}, Request id: ef7cfb84243c47508385c58c13860869
Resolved now, thanks!
With the iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch model, the timestamps I get are character-level. Is there no way to output sentence-level timestamps?
Going from character-level timestamps to sentence-level timestamps requires sentence segmentation, and there is currently no model or tool here that does it. Punctuation-based sentence splitting is only applied when spk_model is enabled, in order to provide a unit for speaker labeling. In other words, passing spk_model="cam++", spk_model_revision="v2.0.2" will give you sentence-level timestamps.
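The punctuation-based grouping described above can be sketched roughly like this. This is a minimal illustration, not FunASR's actual implementation; the token/timestamp pairs below are made-up data in milliseconds:

```python
# Group character-level (token, [start_ms, end_ms]) pairs into sentence-level
# spans by splitting on sentence-ending punctuation.
SENT_END = set("。!?!?")

def sentence_spans(words):
    """words: list of (token, [start_ms, end_ms]) -> list of (text, start, end)."""
    sentences, buf = [], []
    for token, (start, end) in words:
        buf.append((token, start, end))
        if token in SENT_END:
            text = "".join(t for t, _, _ in buf)
            sentences.append((text, buf[0][1], buf[-1][2]))
            buf = []
    if buf:  # trailing tokens without final punctuation
        text = "".join(t for t, _, _ in buf)
        sentences.append((text, buf[0][1], buf[-1][2]))
    return sentences

words = [("我", [0, 200]), ("是", [200, 380]), ("兵", [380, 600]), ("。", [600, 600]),
         ("你", [700, 900]), ("呢", [900, 1100]), ("?", [1100, 1100])]
print(sentence_spans(words))
# → [('我是兵。', 0, 600), ('你呢?', 700, 1100)]
```

Each sentence's span is simply the start of its first token and the end of its last, which is all the speaker-labeling path needs.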
OK, thank you!
@jpyjpr This feature has now been added. Pull the latest code and try the snippet below:
#!/usr/bin/env python3
# -*- encoding: utf-8 -*-
# Copyright FunASR (https://github.com/alibaba-damo-academy/FunASR). All Rights Reserved.
# MIT License (https://opensource.org/licenses/MIT)

from funasr import AutoModel

model = AutoModel(model="iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
                  model_revision="v2.0.4",
                  vad_model="damo/speech_fsmn_vad_zh-cn-16k-common-pytorch",
                  vad_model_revision="v2.0.4",
                  punc_model="damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch",
                  punc_model_revision="v2.0.4",
                  # spk_model="damo/speech_campplus_sv_zh-cn_16k-common",
                  # spk_model_revision="v2.0.2",
                  )
res = model.generate(input="/Users/shixian/Downloads/output_16000.wav",
                     hotword='达摩院 魔搭',
                     sentence_timestamp=True,
                     )
print(res)
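With sentence_timestamp=True the result should carry per-sentence spans. A hedged sketch of walking such a result follows; the field names ('sentence_info', 'text', 'start', 'end') are assumptions drawn from this thread rather than a verified FunASR schema, and the data below is fabricated for illustration:

```python
# Fabricated result in the assumed shape; a real one comes from model.generate().
res = [{
    "text": "我是个小兵。在战场上拼命。",
    "sentence_info": [
        {"text": "我是个小兵。", "start": 0, "end": 1520},
        {"text": "在战场上拼命。", "start": 1520, "end": 3260},
    ],
}]

# Print each sentence with its time span in milliseconds.
for seg in res[0].get("sentence_info", []):
    print(f"{seg['start']:>6d}ms - {seg['end']:>6d}ms  {seg['text']}")
```

If 'sentence_info' is absent (e.g. sentence_timestamp was not requested), the loop simply prints nothing, so the same snippet is safe on older outputs.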
Great, thanks!!!
Running it fails with: Download: iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch failed!: cannot import name 'OfflineModeIsEnabled' from 'huggingface_hub.utils' (c:\Users\loong\.conda\envs\nlp\lib\site-packages\huggingface_hub\utils\__init__.py)
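That ImportError usually points to a version mismatch between the installed huggingface_hub and what modelscope/funasr expect (the installed hub build lacks the OfflineModeIsEnabled symbol). A hedged diagnostic sketch, making no assumption about your exact versions; the usual remedy is upgrading huggingface_hub, or pinning it to a version compatible with your modelscope release:

```python
import importlib

def hub_has_offline_mode_symbol():
    """Report whether huggingface_hub.utils exposes OfflineModeIsEnabled,
    the symbol the traceback above complains about."""
    try:
        utils = importlib.import_module("huggingface_hub.utils")
    except ImportError:  # huggingface_hub is not installed at all
        return False
    return hasattr(utils, "OfflineModeIsEnabled")

if not hub_has_offline_mode_symbol():
    print("huggingface_hub is missing OfflineModeIsEnabled; "
          "try: pip install -U huggingface_hub")
```

If the check fails even after upgrading, aligning modelscope and huggingface_hub versions in a fresh environment is the safer route.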
from funasr import AutoModel

model_path = r'E:\GC\speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch'
vad_path = r'E:\GC\speech_fsmn_vad_zh-cn-16k-common-pytorch'
punc_path = r'E:\GC\punc_ct-transformer_zh-cn-common-vocab272727-pytorch'
spk_path = r'E:\GC\speech_campplus_sv_zh-cn_16k-common'
timestamp_path = r'E:\GC\speech_timestamp_prediction-v1-16k-offline'
audio_path = r'D:\BaiduNetdiskDownload\audioConvert_484028_1705948729.wav'

model = AutoModel(model=model_path,
                  vad_model=vad_path,
                  punc_model=punc_path,
                  timestamp_model=timestamp_path,
                  )
res = model.generate(input=audio_path)
print(res)

Output: [{'key': 'rand_key_2yW4Acq9GFz6Y', 'text': '我是个小兵,我绷紧了神经,在战场上拼命的停,谁在发丝里,将军在微醺,他方向分不清,西方人眼睛他全都听。'}]
What do I need to do to get it to generate a timestamp? I've been at this all night.