A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
OS: Windows 10
Python/C++ Version: python 3.8.17
Package Version: pytorch==1.11.0, torchaudio==0.11.0, modelscope==1.8.1, funasr==0.7.3 (pip list)
Model: damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch
Command: pipeline(audio_in='audio longer than one hour')
Details: GPU inference on this file runs out of GPU memory. The GPU has 6 GB of memory free. Is there a way to limit how much GPU memory inference uses? GPU inference on short audio works fine.
Error log:
E:\Anaconda3\envs\srv2\lib\site-packages\pydub\utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
2023-08-15 18:14:31,983 - modelscope - INFO - PyTorch version 1.11.0+cu113 Found.
2023-08-15 18:14:31,986 - modelscope - INFO - TensorFlow version 2.13.0 Found.
2023-08-15 18:14:31,986 - modelscope - INFO - Loading ast index from C:\Users\演示机\.cache\modelscope\ast_indexer
2023-08-15 18:14:32,200 - modelscope - INFO - Loading done! Current index file version is 1.8.1, with md5 65607a88c407898863dccee442cbfc94 and a total number of 893 components indexed
Namespace(audio_file='D:\SRV2\庭审材料2.wav', frame_rate=16000, gpu=1)
2023-08-15 18:14:34,712 - modelscope - INFO - Model revision not specified, use the latest revision: v1.2.4
2023-08-15 18:14:34,996 - modelscope - INFO - initiate model from C:\Users\演示机\.cache\modelscope\hub\damo\speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch
2023-08-15 18:14:34,996 - modelscope - INFO - initiate model from location C:\Users\演示机\.cache\modelscope\hub\damo\speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch.
2023-08-15 18:14:35,000 - modelscope - INFO - initialize model from C:\Users\演示机\.cache\modelscope\hub\damo\speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch
2023-08-15 18:14:35,012 - modelscope - WARNING - No preprocessor field found in cfg.
2023-08-15 18:14:35,012 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-08-15 18:14:35,012 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\Users\演示机\.cache\modelscope\hub\damo\speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch'}. trying to build by task and model information.
2023-08-15 18:14:35,013 - modelscope - WARNING - No preprocessor key ('generic-asr', 'auto-speech-recognition') found in PREPROCESSOR_MAP, skip building preprocessor.
2023-08-15 18:14:35,733 - modelscope - INFO - Model revision not specified, use the latest revision: v1.2.0
2023-08-15 18:14:35,940 - modelscope - INFO - loading vad model from C:\Users\演示机\.cache\modelscope\hub\damo\speech_fsmn_vad_zh-cn-16k-common-pytorch ...
2023-08-15 18:14:36,110 - modelscope - INFO - Model revision not specified, use the latest revision: v1.1.7
2023-08-15 18:14:36,339 - modelscope - INFO - loading punctuation model from C:\Users\演示机\.cache\modelscope\hub\damo\punc_ct-transformer_zh-cn-common-vocab272727-pytorch ...
################################################################################
WARNING, path does not exist: KALDI_ROOT=/mnt/matylda5/iveselyk/Tools/kaldi-trunk
(please add 'export KALDI_ROOT=<your_path>' in your $HOME/.profile)
(or run as: KALDI_ROOT=<your_path> python <your_script>.py)
################################################################################
2023-08-15 18:15:02,142 - modelscope - INFO - Decoding with wav files ...
batch_size_token: 6000
time cost vad: 56.4520480632782
batch: 88
time cost asr: 2.9561240673065186
batch: 39
time cost asr: 1.0192680358886719
batch: 29
time cost asr: 0.9634253978729248
batch: 24
time cost asr: 1.0352020263671875
batch: 21
time cost asr: 1.0262844562530518
batch: 18
time cost asr: 1.07608962059021
batch: 16
time cost asr: 1.137986660003662
batch: 14
time cost asr: 1.2157480716705322
batch: 12
Traceback (most recent call last):
  File "srv2_offline.py", line 70, in <module>
  File "srv2_offline.py", line 56, in convert_voice_to_text
  File "E:\Anaconda3\envs\srv2\lib\site-packages\modelscope\pipelines\audio\asr_inference_pipeline.py", line 256, in __call__
  File "E:\Anaconda3\envs\srv2\lib\site-packages\modelscope\pipelines\audio\asr_inference_pipeline.py", line 505, in forward
  File "E:\Anaconda3\envs\srv2\lib\site-packages\modelscope\pipelines\audio\asr_inference_pipeline.py", line 580, in run_inference
  File "d:\srv2\funasr\funasr\bin\asr_inference_launch.py", line 660, in _forward
  File "E:\Anaconda3\envs\srv2\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
  File "d:\srv2\funasr\funasr\bin\asr_infer.py", line 438, in __call__
  File "d:\srv2\funasr\funasr\models\e2e_asr_paraformer.py", line 1637, in calc_predictor
  File "E:\Anaconda3\envs\srv2\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
  File "d:\srv2\funasr\funasr\models\predictor\cif.py", line 593, in forward
  File "E:\Anaconda3\envs\srv2\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
  File "E:\Anaconda3\envs\srv2\lib\site-packages\torch\nn\modules\rnn.py", line 761, in forward
RuntimeError: CUDA out of memory. Tried to allocate 446.00 MiB (GPU 0; 6.00 GiB total capacity; 3.97 GiB already allocated; 0 bytes free; 4.97 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
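
The error message itself points at max_split_size_mb and PYTORCH_CUDA_ALLOC_CONF, so one thing that may be worth trying is configuring the PyTorch caching allocator and, optionally, capping how much GPU memory the process may take. The following is only a rough sketch under stated assumptions: the helper names configure_cuda_allocator and cap_gpu_memory and the values 128 and 0.8 are made up for illustration and are not FunASR or modelscope settings. The only PyTorch pieces relied on are the PYTORCH_CUDA_ALLOC_CONF environment variable and torch.cuda.set_per_process_memory_fraction.

```python
import os


def configure_cuda_allocator(max_split_size_mb: int = 128) -> str:
    """Set PYTORCH_CUDA_ALLOC_CONF for the current process.

    To be safe, call this before importing torch, so the setting is
    in place before the CUDA caching allocator is first used.
    The default of 128 is an arbitrary illustrative value.
    """
    value = f"max_split_size_mb:{max_split_size_mb}"
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = value
    return value


def cap_gpu_memory(fraction: float = 0.8, device: int = 0) -> bool:
    """Cap the share of GPU memory this process may allocate.

    Returns True if the cap was applied, False if torch or CUDA is
    not available. The default of 0.8 is an arbitrary illustrative value.
    """
    try:
        import torch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    torch.cuda.set_per_process_memory_fraction(fraction, device)
    return True


# Hypothetical usage at the very top of the inference script,
# before the ASR pipeline is created:
configure_cuda_allocator(128)
capped = cap_gpu_memory(0.8)
print("memory cap applied:", capped)
```

Neither setting reduces how much memory the model needs for a given batch. The cap only makes the process hit an out-of-memory error at a lower limit instead of taking the whole card, and max_split_size_mb can only help if fragmentation is part of the problem (here 4.97 GiB is reserved against 3.97 GiB allocated). If the failing batch simply does not fit in 6 GiB, smaller batches or shorter input would still be needed; the "batch_size_token: 6000" line in the log suggests a per-batch token budget, but how to change it is not shown in this log.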