Whisper-large-v3 model: no timestamps in the output

mp075496706 commented 2 weeks ago

What is your question?

I am running recognition with the Whisper-large-v3 model. I get a transcription result, but it contains no timestamps for the text.

Code

from funasr import AutoModel

model = AutoModel(model="iic/Whisper-large-v3",
                  vad_model="iic/speech_fsmn_vad_zh-cn-16k-common-pytorch"
                  )

DecodingOptions = {
    "task": "transcribe",
    "language": None,
    "beam_size": None,
    "fp16": True,
    "without_timestamps": False,
    "prompt": None,
}
res = model.generate(
    DecodingOptions=DecodingOptions,
    batch_size_s=0,
    input=r"https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav",
)

print(res)

The code above is what I wrote based on demo.py in the whisper folder.

What's your environment?

OS : Windows 10
FunASR Version : latest
ModelScope Version : 1.14.0
PyTorch Version : 2.0.0+cu118
How you installed funasr : source
Python version : 3.8.9(64bit)
GPU : RTX3060Ti 8G
CUDA/cuDNN version : cuda11.8

mp075496706 commented 2 weeks ago

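The recognition result looks like this: [{'key': 'asr_example_zh', 'text': '欢迎大家来体验达摩院推出的语音识别模型。'}] However, the description in the model list says that Whisper-large-v3 supports timestamp output. So is there something I have not configured correctly?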
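
A minimal sketch for checking a result like the one above programmatically. It assumes that timestamps, when present, would appear under a `timestamp` key in each result entry; that key name is an assumption, since the output shown contains only `key` and `text`. The helper name `entries_missing_key` is hypothetical and not part of FunASR.

```python
from typing import Any, Dict, List


def entries_missing_key(results: List[Dict[str, Any]], key: str = "timestamp") -> List[str]:
    """Return the 'key' field of every result entry that lacks a non-empty `key` value.

    The default field name "timestamp" is an assumption, not a documented guarantee.
    """
    return [entry.get("key", "<unknown>") for entry in results if not entry.get(key)]


# The result reported in this issue, copied verbatim.
res = [{'key': 'asr_example_zh', 'text': '欢迎大家来体验达摩院推出的语音识别模型。'}]

missing = entries_missing_key(res)
print(missing)
```

For the result reported here the list is non-empty, which matches the observation that no timestamp field came back.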

LauraGPT commented 1 week ago

Wav file is too short.
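
Since the reply points at the length of the audio, a small sketch for measuring the duration of a local WAV file with the standard-library `wave` module may help when trying longer inputs. The reply does not say what length is sufficient, so no threshold is assumed here. The helper names `wav_duration_seconds` and `write_silence` are hypothetical, and the demo measures a generated silent file rather than the audio from the issue.

```python
import os
import tempfile
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / float(wav_file.getframerate())


def write_silence(path: str, seconds: float, sample_rate: int = 16000) -> None:
    """Write a mono, 16-bit silent WAV file (used only for this demonstration)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * int(seconds * sample_rate))


with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "silence.wav")
    write_silence(demo_path, 2.0)
    print(wav_duration_seconds(demo_path))
```

To check the file from the issue, download it first and pass the local path to `wav_duration_seconds`.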

modelscope / FunASR

Whisper-large-v3 model: no timestamps in the output #1814
