modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

Running AutoModel as a backend service: errors on repeated calls to model.generate #1326

Closed spritelw closed 8 months ago

spritelw commented 8 months ago

🐛 Bug

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

from funasr import AutoModel

model = AutoModel(
    model="iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
    model_revision="v2.0.4",
    vad_model="damo/speech_fsmn_vad_zh-cn-16k-common-pytorch",
    vad_model_revision="v2.0.4",
    punc_model="damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch",
    punc_model_revision="v2.0.4",
    spk_model="damo/speech_campplus_sv_zh-cn_16k-common",
    # spk_model_revision="v2.0.2",
)

res = model.generate(
    input="https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav",
    hotword='达摩院 魔搭',
    sentence_timestamp=True,
)

# Second call on the same AutoModel instance
res = model.generate(
    input="https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav",
    hotword='达摩院 魔搭',
    sentence_timestamp=True,
)

Call path:

generate -> inference_with_vad -> inference -> model.inference(batch, kwargs)

Here is the relevant part of the interface code:

def inference(
    self,
    data_in,
    data_lengths=None,
    key: list = None,
    tokenizer=None,
    frontend=None,
    cache: dict = {},  # <-- expanded from kwargs and kept around; it ends up stored in self.vad_kwargs
    **kwargs,
):
    if len(cache) == 0:  # <-- on the second call len(cache) > 0, so init_cache is skipped
        self.init_cache(cache, **kwargs)
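The pattern above is what makes repeated calls fragile: whether the dict comes from the mutable default argument or from kwargs saved on the instance (self.vad_kwargs), the same cache object survives across calls, so init_cache only runs the first time. Below is a minimal, self-contained illustration of the pitfall and the usual fix; bad_inference and good_inference are made-up names, not FunASR code.

# Illustration only: state that outlives a single call versus an idempotent handler.

def bad_inference(data, cache: dict = {}):        # one dict object shared by every call
    if len(cache) == 0:                           # initialised only on the first call
        cache["stats"] = {"frames_seen": 0}
    cache["stats"]["frames_seen"] += len(data)
    return cache["stats"]["frames_seen"]

def good_inference(data, cache: dict = None):     # fresh state per call
    if cache is None:
        cache = {}
    if len(cache) == 0:
        cache["stats"] = {"frames_seen": 0}
    cache["stats"]["frames_seen"] += len(data)
    return cache["stats"]["frames_seen"]

print(bad_inference([0] * 10))   # 10
print(bad_inference([0] * 10))   # 20, stale state leaks into the second call
print(good_inference([0] * 10))  # 10
print(good_inference([0] * 10))  # 10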

Two kinds of errors occur:

1. AssertionError from torchaudio's fbank:

   mat = kaldi.fbank(waveform,
   File "/home/sd/miniconda3/envs/cuda121_funasr1.0/lib/python3.10/site-packages/torchaudio/compliance/kaldi.py", line 591, in fbank
     waveform, window_shift, window_size, padded_window_size = _get_waveform_and_window_properties(
   File "/home/sd/miniconda3/envs/cuda121_funasr1.0/lib/python3.10/site-packages/torchaudio/compliance/kaldi.py", line 142, in _get_waveform_and_window_properties
     assert 2 <= window_size <= len(waveform), "choose a window size {} that is [2, {}]".format(
   AssertionError: choose a window size 400 that is [2, 0]

2. IndexError from the FSMN VAD streaming model:

   File "/home/sd/transformer/FunASR-git/funasr/models/fsmn_vad_streaming/model.py", line 443, in GetFrameState
     cur_decibel = cache["stats"].decibel[t]
   IndexError: list index out of range

Expected behavior

The model should not keep intermediate results internally; as a backend service, each call should be idempotent.
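Until the cache handling changes upstream, a service can at least defend itself at the call site. The sketch below is only one possible workaround under the assumption that the failures come from stale internal state: it serializes access to the shared AutoModel and rebuilds it when a call raises one of the two errors reported above. build_model and transcribe are placeholder names, and rebuilding the model is expensive, so this is a stop-gap rather than a fix.

import threading

from funasr import AutoModel

_lock = threading.Lock()

def build_model():
    # Same constructor arguments as in the reproduction above (abridged).
    return AutoModel(
        model="iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
        model_revision="v2.0.4",
        vad_model="damo/speech_fsmn_vad_zh-cn-16k-common-pytorch",
        vad_model_revision="v2.0.4",
        punc_model="damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch",
        punc_model_revision="v2.0.4",
    )

_model = build_model()

def transcribe(audio):
    global _model
    with _lock:  # one request at a time touches the shared model
        try:
            return _model.generate(input=audio, sentence_timestamp=True)
        except (AssertionError, IndexError):
            _model = build_model()  # blunt recovery: drop any stale internal state
            return _model.generate(input=audio, sentence_timestamp=True)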

Environment

Additional context

LauraGPT commented 8 months ago

I have tested it without any errors. Please list your environment. The cache is reset at https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/models/fsmn_vad_streaming/model.py#L623

spritelw commented 8 months ago

The wav file works fine; what I am passing in is PCM data, so there is no final chunk.

LauraGPT commented 8 months ago

You should set is_final=True, as in the demo.
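For reference, the streaming usage that resets the cache looks roughly like the sketch below, adapted from the streaming VAD example in the FunASR documentation: PCM samples are fed chunk by chunk, and the last chunk is marked with is_final=True so the internal VAD cache is cleared. The model alias "fsmn-vad", the revision, and the exact kwargs may differ between FunASR versions, and the silent PCM buffer is a placeholder.

import numpy as np

from funasr import AutoModel

vad_model = AutoModel(model="fsmn-vad", model_revision="v2.0.4")

sample_rate = 16000
chunk_size_ms = 200
chunk_stride = chunk_size_ms * sample_rate // 1000

# Placeholder PCM: one second of silence as float32 samples.
pcm = np.zeros(sample_rate, dtype=np.float32)

cache = {}
total_chunks = int(np.ceil(len(pcm) / chunk_stride))
for i in range(total_chunks):
    chunk = pcm[i * chunk_stride:(i + 1) * chunk_stride]
    res = vad_model.generate(
        input=chunk,
        cache=cache,
        is_final=(i == total_chunks - 1),  # last chunk resets the streaming cache
        chunk_size=chunk_size_ms,
    )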

ZhanRao commented 1 month ago

Setting is_final=True as in the demo does not seem to solve the problem; after running for a while, the error comes back.