Description
I'm using onnx-tensorrt to convert an ONNX ASR model (SenseVoiceSmall) to a TensorRT engine. The first call to engine.run works and produces results, but every subsequent call fails with the following error:
Output shape: (-1, -1, 25055)
Found dynamic inputs! Deferring engine build to run stage
using trt
['<|zh|><|NEUTRAL|><|Speech|><|woitn|>谷歌不仅会以优厚薪酬招募顶尖人才']
num: 64, time, 22.713175535202026, avg: 22.713175535202026, rtf: 4.107265015407238
using trt
Traceback (most recent call last):
File "/root/rtf.py", line 307, in <module>
result = model(wav_path)
File "/root/rtf.py", line 182, in __call__
ctc_logits, encoder_out_lens = self.infer(
File "/root/rtf.py", line 251, in infer
outputs = self.engine.run([feats, feats_len, language, textnorm])
File "/root/anaconda3/lib/python3.10/site-packages/onnx_tensorrt-10.2.0-py3.10.egg/onnx_tensorrt/backend.py", line 156, in run
File "/root/anaconda3/lib/python3.10/site-packages/onnx_tensorrt-10.2.0-py3.10.egg/onnx_tensorrt/backend.py", line 134, in _build_engine
File "/root/anaconda3/lib/python3.10/site-packages/onnx_tensorrt-10.2.0-py3.10.egg/onnx_tensorrt/backend.py", line 144, in _deserialize
AttributeError: parser
I suspect this issue is related to the use of dynamic inputs. Does onnx-tensorrt support dynamic input shapes, and if so, how can I resolve this?
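For context, the workaround I'm currently considering is to bypass the deferred build entirely: parse the ONNX model with the TensorRT Python API, register an explicit optimization profile for the dynamic axes, and build the engine once up front. This is only a sketch; the input names ("speech", "speech_lengths") and the shape ranges are assumptions that would need to be checked against the actual export:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(0)
parser = trt.OnnxParser(network, TRT_LOGGER)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse model.onnx")

config = builder.create_builder_config()
profile = builder.create_optimization_profile()
# min/opt/max shapes for the dynamic axes; the dims below are placeholders,
# and only inputs with dynamic dimensions need profile entries
profile.set_shape("speech", (1, 16, 560), (1, 500, 560), (1, 3000, 560))
profile.set_shape("speech_lengths", (1,), (1,), (1,))
config.add_optimization_profile(profile)

# build once, then reuse the deserialized engine for every call
serialized = builder.build_serialized_network(network, config)
runtime = trt.Runtime(TRT_LOGGER)
engine = runtime.deserialize_cuda_engine(serialized)

Is building once with a profile like this the intended path, or is there something in onnx-tensorrt itself that makes repeated engine.run calls work with dynamic inputs?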
Environment
TensorRT Version: 10.2.0.19-1+cuda11.8
ONNX-TensorRT Version / Branch: 10.2-GA
GPU Type: A800
Nvidia Driver Version: 535.54.03
CUDA Version: 11.8
CUDNN Version: 8.9.2
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.10.9
TensorFlow + TF2ONNX Version (if applicable):
PyTorch Version (if applicable): 2.4.0+cu118
Baremetal or Container (if container which image + tag):
Here is the relevant source code (model init and inference):
import os
import onnx

# model init
model_file = os.path.join(model_dir, "model.onnx")
if quantize:
    model_file = os.path.join(model_dir, "model_quant.onnx")
if not os.path.exists(model_file):
    print(".onnx does not exist, begin to export onnx")
    try:
        from funasr import AutoModel
    except ImportError:
        raise ImportError(
            "You are exporting onnx, please install funasr and try again. "
            "To install funasr, you could:\n\npip3 install -U funasr\n\n"
            "For the users in China, you could install with the command:\n\n"
            "pip3 install -U funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple"
        )
    model = AutoModel(model=model_dir)
    model_dir = model.export(type="onnx", quantize=quantize, **kwargs)
if use_trt:
    import onnx_tensorrt.backend as backend
    model = onnx.load(model_file)
    engine = backend.prepare(model, device_id=device_id, verbose=True, **kwargs)
    self.engine = engine
# inference
if self.use_trt:
    print("using trt")
    outputs = self.engine.run([feats, feats_len, language, textnorm])
else:
    outputs = self.ort_infer([feats, feats_len, language, textnorm])
return outputs
...
Steps To Reproduce
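The failure reduces to calling run twice on the same prepared backend. A minimal sketch of the reproduction (the input arrays are placeholders; real shapes, dtypes, and token ids must match the actual export):

import numpy as np
import onnx
import onnx_tensorrt.backend as backend

model = onnx.load("model.onnx")
engine = backend.prepare(model, device='CUDA:0', verbose=True)

# placeholder inputs for SenseVoiceSmall; adjust to the actual model
feats = np.zeros((1, 100, 560), dtype=np.float32)
feats_len = np.array([100], dtype=np.int32)
language = np.array([0], dtype=np.int32)
textnorm = np.array([15], dtype=np.int32)

engine.run([feats, feats_len, language, textnorm])  # first call succeeds
engine.run([feats, feats_len, language, textnorm])  # second call raises AttributeError: parser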