modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

ONNX offline demo runs very slowly on GPU, about 5x slower than on CPU #1753

Closed Ignalxy closed 4 months ago

Ignalxy commented 4 months ago

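The ONNX offline demo runs very slowly on GPU, about 5x slower than on CPU.

With the PyTorch demo, the GPU is faster than the CPU, as expected. With the ONNX demo, however, running inference on a 120 s audio clip with device_id=0 is 5x slower than on the CPU. The C++ version shows the same behavior and is even slower than the Python demo.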

Code

from funasr_onnx import Paraformer
from pathlib import Path

model_dir = "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
# model_dir = "damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
model = Paraformer(model_dir, batch_size=1, quantize=False)
# model = Paraformer(model_dir, batch_size=1, device_id=0)  # gpu

# when using paraformer-large-vad-punc model, you can set plot_timestamp_to="./xx.png" to get figure of alignment besides timestamps
# model = Paraformer(model_dir, batch_size=1, plot_timestamp_to="test.png")

wav_path = ['/work/lxy/audio/segment_0-120.wav']

result = model(wav_path)
print(result)
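
To make the CPU/GPU comparison above easier to reproduce, here is a minimal timing sketch. It is not part of the original report: `benchmark` and `real_time_factor` are hypothetical helper names, and the warm-up step is an assumption (the first GPU call often includes one-off setup such as CUDA context creation, which can distort a single-run measurement). It only assumes the model object is callable on a list of wav paths, as in the snippet above.

```python
import time
from statistics import mean


def benchmark(model, wav_paths, warmup=1, runs=3):
    """Return the mean wall-clock seconds per call of model(wav_paths).

    Hypothetical helper, not part of funasr_onnx.
    """
    # Warm-up calls are excluded from timing (assumption: the first call
    # may include one-off initialization costs, especially on GPU).
    for _ in range(warmup):
        model(wav_paths)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        model(wav_paths)
        timings.append(time.perf_counter() - start)
    return mean(timings)


def real_time_factor(seconds_per_call, audio_seconds):
    """Processing time divided by audio duration; lower is faster."""
    return seconds_per_call / audio_seconds


# Usage with the models from the snippet above (left commented out because
# it needs funasr_onnx and the audio file):
# cpu_model = Paraformer(model_dir, batch_size=1, quantize=False)
# gpu_model = Paraformer(model_dir, batch_size=1, device_id=0)
# for name, m in [("cpu", cpu_model), ("gpu", gpu_model)]:
#     secs = benchmark(m, wav_path)
#     print(name, secs, real_time_factor(secs, 120.0))
```

Reporting the averaged time and the real-time factor for both devices, together with the environment details, would show whether the slowdown persists after warm-up or comes mostly from the first call.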

What's your environment?

lyblsgo commented 4 months ago

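The ONNX model is well optimized for CPU but not for GPU, so we do not recommend deploying the ONNX model on GPU. We will soon release a TorchScript-based GPU service deployment; you can try again once it is available.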