modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

VAD model inference uses too much memory and gets killed; can this be optimized? #910

Open manbuheiniu opened 1 year ago

manbuheiniu commented 1 year ago

OS: Linux

Python/C++ Version: Python 3.7.0

Package Version: pytorch 1.12.1, torchaudio 0.12.1, modelscope 1.8.3, funasr 0.7.4

Model: damo/speech_fsmn_vad_zh-cn-16k-common-pytorch

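Command: `from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
# Path to the long recording that triggers the problem
audio_in = '/home/kingyee/www_fjj/audio2txt/6.wav'
# Build the VAD pipeline with the FSMN VAD model
inference_pipeline_vad = pipeline(
    task=Tasks.voice_activity_detection,
    model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch',
    model_revision=None,
    max_end_silence_time='500ms',
)
# Run VAD over the whole file in one call
segments_result = inference_pipeline_vad(audio_in=audio_in)
print(segments_result)`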

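Details: Ubuntu 22.04, 8 GB of RAM, T4 GPU (16 GB of GPU memory). Running inference on a 400 MB WAV file uses so much host memory that the process is killed. GPU memory usage stays low.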

Error log: `2023-08-31 03:45:47,169 - modelscope - INFO - PyTorch version 1.12.1 Found.
2023-08-31 03:45:47,170 - modelscope - INFO - Loading ast index from /home/user/.cache/modelscope/ast_indexer
2023-08-31 03:45:47,350 - modelscope - INFO - Loading done! Current index file version is 1.8.3, with md5 f9416d554030cea7be8392223beb4ea9 and a total number of 895 components indexed
2023-08-31 03:45:52,504 - modelscope - INFO - Model revision not specified, use the latest revision: v1.2.0
################################################################################
WARNING, path does not exist: KALDI_ROOT=/mnt/matylda5/iveselyk/Tools/kaldi-trunk
(please add 'export KALDI_ROOT=' in your $HOME/.profile)
(or run as: KALDI_ROOT= python .py)
################################################################################
2023-08-31 03:45:53,045 - modelscope - INFO - initiate model from /home/user/.cache/modelscope/hub/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch
2023-08-31 03:45:53,046 - modelscope - INFO - initiate model from location /home/user/.cache/modelscope/hub/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch.
2023-08-31 03:45:53,047 - modelscope - INFO - initialize model from /home/user/.cache/modelscope/hub/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch
2023-08-31 03:45:53,055 - modelscope - WARNING - No preprocessor field found in cfg.
2023-08-31 03:45:53,055 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-08-31 03:45:53,055 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/home/user/.cache/modelscope/hub/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch'}. trying to build by task and model information.
2023-08-31 03:45:53,055 - modelscope - WARNING - No preprocessor key ('generic-asr', 'voice-activity-detection') found in PREPROCESSOR_MAP, skip building preprocessor.
2023-08-31 03:46:06,822 - modelscope - INFO - VAD Processing ...
Killed`
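Since the process is killed while host memory (not GPU memory) runs out, one possible workaround is to feed the recording to the VAD pipeline in fixed-length pieces and shift each piece's timestamps back onto the full timeline. The sketch below is not a FunASR feature and has not been confirmed by the maintainers. `iter_wav_chunks`, `vad_in_chunks`, and `vad_fn` are hypothetical names, and `vad_fn` is assumed to be a small wrapper that takes a WAV path and returns a list of `[start_ms, end_ms]` pairs (extracting that list from the pipeline's result is left to the wrapper).

```python
import os
import tempfile
import wave
from typing import Callable, Iterator, List, Tuple

Segment = List[int]  # [start_ms, end_ms]


def iter_wav_chunks(path: str, chunk_seconds: int = 600) -> Iterator[Tuple[int, str]]:
    """Yield (offset_ms, chunk_path) pairs, holding only one chunk in memory at a time."""
    with wave.open(path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        frames_per_chunk = chunk_seconds * rate
        frames_done = 0
        while True:
            data = src.readframes(frames_per_chunk)
            if not data:
                break
            fd, chunk_path = tempfile.mkstemp(suffix=".wav")
            os.close(fd)
            try:
                with wave.open(chunk_path, "wb") as dst:
                    dst.setparams(params)  # frame count in the header is corrected on close
                    dst.writeframes(data)
                yield frames_done * 1000 // rate, chunk_path
            finally:
                os.remove(chunk_path)  # delete the temporary chunk once it has been consumed
            frames_done += len(data) // (params.sampwidth * params.nchannels)


def vad_in_chunks(
    path: str,
    vad_fn: Callable[[str], List[Segment]],  # hypothetical wrapper around the VAD pipeline
    chunk_seconds: int = 600,
) -> List[Segment]:
    """Run vad_fn on each chunk and return segments on the full recording's timeline."""
    merged: List[Segment] = []
    for offset_ms, chunk_path in iter_wav_chunks(path, chunk_seconds):
        for start_ms, end_ms in vad_fn(chunk_path):
            start_ms += offset_ms
            end_ms += offset_ms
            if merged and start_ms <= merged[-1][1]:
                # Segment touches or overlaps the previous one (e.g. across a chunk boundary).
                merged[-1][1] = max(merged[-1][1], end_ms)
            else:
                merged.append([start_ms, end_ms])
    return merged
```

The trade-off is that speech crossing a chunk boundary is only rejoined when the two pieces touch exactly, so segments near boundaries may differ slightly from a single-pass result. Cutting at a long silence instead of a fixed length would likely reduce that effect.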

LauraGPT commented 1 year ago

How long is the WAV file (its duration)?

manbuheiniu commented 1 year ago

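I just checked: the recording is 3 hours 53 minutes long. Is that too long? If this is hard to optimize, please feel free to close this issue. Thanks.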
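A rough back-of-the-envelope calculation suggests why a recording of this length could strain an 8 GB machine if the whole file is processed in one pass. The figures below are assumptions rather than measurements: 16 kHz mono 16-bit audio (consistent with the model name and with the roughly 400 MB file size), one feature frame every 10 ms, and 400 float32 values per frame. The function name `estimate_bytes` is hypothetical, and the real frontend configuration should be checked in the model's config file.

```python
def estimate_bytes(duration_s, sample_rate=16_000, feat_dim=400, frame_shift_ms=10):
    """Estimate the size of a few buffers that may be held in memory at once.

    feat_dim and frame_shift_ms are assumed values, not read from the model.
    """
    samples = duration_s * sample_rate
    frames = duration_s * 1000 // frame_shift_ms
    return {
        "pcm16_waveform": samples * 2,              # raw 16-bit samples, as stored in the WAV
        "float32_waveform": samples * 4,            # waveform converted to float32
        "float32_features": frames * feat_dim * 4,  # frame-level features for the whole file
    }


duration_s = 3 * 3600 + 53 * 60  # 3 h 53 min
for name, size in estimate_bytes(duration_s).items():
    print(f"{name}: {size / 2**30:.2f} GiB")
```

Under these assumptions the three buffers alone add up to a little over 3 GiB, before counting any temporary copies or the model's intermediate activations, so running out of 8 GB of host memory is plausible. This is only an estimate; actual usage depends on how the pipeline loads and buffers the audio.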