modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

For the same audio, running generate a second time takes much longer and uses far more CPU than the first run #1666

Open animebing opened 4 months ago

animebing commented 4 months ago

Notice: To help resolve issues more efficiently, please raise your issue following the template and include the relevant details.

What is your question?

For the same audio file (a long one, e.g. 60 minutes), if I run generate a second time, the processing time is much longer and the CPU consumption is much higher than on the first run. Short audio files (e.g. one minute) do not show this problem.

Code

import os
import time

from funasr import AutoModel

model_dir = 'path/to/model'
model = AutoModel(
    model=os.path.join(model_dir, 'speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch'),
    vad_model=os.path.join(model_dir, 'speech_fsmn_vad_zh-cn-16k-common-pytorch'),
    punc_model=os.path.join(model_dir, 'punc_ct-transformer_cn-en-common-vocab471067-large'),
    spk_model=os.path.join(model_dir, 'speech_campplus_sv_zh-cn_16k-common'),
    disable_pbar=True,
)

audio_path = 'path/to/audio'
# Run the same long audio file several times and print the wall-clock time of each run.
for i in range(5):
    time_0 = time.time()
    res = model.generate(input=audio_path, batch_size_s=300)
    time_1 = time.time()
    print(f'process: {time_1 - time_0}')

What have you tried?

What's your environment?

LauraGPT commented 4 months ago

Try removing the spk_model=os.path.join(model_dir, 'speech_campplus_sv_zh-cn_16k-common') argument and running it again.

animebing commented 4 months ago

@LauraGPT It works when I disable spk_model, but I really need to use that model. What should I do to make it work with spk_model enabled? Thanks.