Exception while running funasr-runtime-sdk-gpu-0.1.1

liurongjie174 commented 2 months ago

Notice: In order to resolve issues more efficiently, please raise issues following the template and fill in the details.

🐛 Bug

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

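When multiple clients simultaneously send offline audio files to funasr-runtime-sdk-gpu-0.1.1 for recognition, the fun_asr_server process crashes very easily; see the screenshot for details. At the time of the crash, GPU memory usage still looks normal: only a little over 4 GB of the 24 GB is in use.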

Code sample
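
The report does not include a script, so the following is only a minimal sketch of the load pattern described in the steps above: several clients running in parallel, each iterating over the same list of recordings. The names `run_client`, `stress`, and `recognize` are hypothetical and not part of FunASR; `recognize` stands in for whatever function sends one file to the server and returns the transcript.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple

# (client_id, file path, error description)
Failure = Tuple[int, str, str]


def run_client(client_id: int, files: Iterable[str],
               recognize: Callable[[str], str]) -> List[Failure]:
    """Send every file in turn and record the requests that raised."""
    failures: List[Failure] = []
    for path in files:
        try:
            recognize(path)  # hypothetical: sends one file to the server
        except Exception as exc:
            # A server crash would typically surface here as a
            # connection or timeout error raised by the client.
            failures.append((client_id, path, repr(exc)))
    return failures


def stress(files: List[str], recognize: Callable[[str], str],
           num_clients: int = 2) -> List[Failure]:
    """Run num_clients workers in parallel, each iterating over all files."""
    with ThreadPoolExecutor(max_workers=num_clients) as pool:
        futures = [pool.submit(run_client, i, files, recognize)
                   for i in range(num_clients)]
        # Flatten the per-client failure lists into one list.
        return [failure for fut in futures for failure in fut.result()]
```

With `recognize` wired to a real client, calling `stress(wav_paths, recognize, num_clients=2)` over 100 or more recordings would approximate the scenario in this report; the returned list shows which requests failed and with what error.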

Expected behavior

Environment

OS (e.g., Linux):
FunASR Version (e.g., 1.0.0):
ModelScope Version (e.g., 1.11.0):
PyTorch Version (e.g., 2.0.0):
How you installed funasr (pip, source):
Python version:
GPU (e.g., V100M32)
CUDA/cuDNN version (e.g., cuda11.7):
Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
Any other relevant information:

Additional context

xiaoheiNLP commented 2 months ago

You want to run on the GPU, but your image is the CPU version.

liurongjie174 commented 2 months ago

I started the service with

    docker run --gpus=all -p 10095:10095 -it --privileged=true \
      -v $PWD/funasr-runtime-resources/models:/workspace/models \
      registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.1

and

    bash run_server.sh \
      --download-model-dir /workspace/models \
      --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
      --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch \
      --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
      --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
      --itn-dir thuduj12/fst_itn_zh \
      --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &

Is this the CPU version?

liurongjie174 commented 2 months ago

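I have confirmed that this is the GPU image. When multiple clients (two or more) send requests at the same time, each iterating over 100+ recordings, fun_asr_server may run for a while and then stop responding entirely. GPU memory usage has by then dropped to 1 MB, so the service has in fact gone down. The problem is easy to reproduce.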
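
To tell a server that has exited from one that is merely slow, a plain TCP probe of the published port can be run alongside the clients. This is a sketch only: `server_is_listening` is a hypothetical helper, and the default port 10095 is taken from the `-p 10095:10095` mapping in the `docker run` command above.

```python
import socket


def server_is_listening(host: str = "127.0.0.1", port: int = 10095,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (including timeouts and
        # "connection refused") when nothing accepts on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A `False` result means nothing is accepting connections on the port, which is consistent with the process having died. A `True` result only shows that the port is open; a hung process can still accept TCP connections, so this check does not prove the service is healthy.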

modelscope / FunASR