onnx / onnx-tensorrt

ONNX-TensorRT: TensorRT backend for ONNX

AttributeError: parser after First Run, Possibly Due to Dynamic Inputs. #989

Status: Open

liusiqian-tal commented 3 months ago

Description

I'm using onnx-tensorrt to convert an ONNX ASR model (SenseVoiceSmall) to a TensorRT engine. The first call to engine.run works and produces results, but every subsequent call fails with the following error:

Output shape: (-1, -1, 25055)
Found dynamic inputs! Deferring engine build to run stage
using trt
['<|zh|><|NEUTRAL|><|Speech|><|woitn|>谷歌不仅会以优厚薪酬招募顶尖人才']
num: 64, time, 22.713175535202026, avg: 22.713175535202026, rtf: 4.107265015407238
using trt
Traceback (most recent call last):
  File "/root/rtf.py", line 307, in <module>
    result = model(wav_path)
  File "/root/rtf.py", line 182, in __call__
    ctc_logits, encoder_out_lens = self.infer(
  File "/root/rtf.py", line 251, in infer
    outputs = self.engine.run([feats, feats_len, language, textnorm])
  File "/root/anaconda3/lib/python3.10/site-packages/onnx_tensorrt-10.2.0-py3.10.egg/onnx_tensorrt/backend.py", line 156, in run
  File "/root/anaconda3/lib/python3.10/site-packages/onnx_tensorrt-10.2.0-py3.10.egg/onnx_tensorrt/backend.py", line 134, in _build_engine
  File "/root/anaconda3/lib/python3.10/site-packages/onnx_tensorrt-10.2.0-py3.10.egg/onnx_tensorrt/backend.py", line 144, in _deserialize
AttributeError: parser

I suspect this issue might be related to the use of dynamic inputs. Does onnx-tensorrt support dynamic inputs? How can I resolve this?
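For context, here is a minimal sketch of how the engine could be built directly with the TensorRT Python API using an explicit optimization profile for the dynamic inputs, instead of going through onnx_tensorrt.backend. The input name "speech" and the min/opt/max shape ranges are placeholders for illustration only, not the model's real values.

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    builder = trt.Builder(TRT_LOGGER)
    # TensorRT 10 networks are always explicit-batch, so no creation flags are needed.
    network = builder.create_network(0)
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open("model.onnx", "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("failed to parse the ONNX model")

    config = builder.create_builder_config()
    profile = builder.create_optimization_profile()
    # Placeholder input name and shape ranges; replace with the model's actual dynamic inputs.
    profile.set_shape("speech", min=(1, 16, 560), opt=(1, 512, 560), max=(1, 2048, 560))
    config.add_optimization_profile(profile)

    serialized_engine = builder.build_serialized_network(network, config)
    with open("model.engine", "wb") as f:
        f.write(serialized_engine)

With a profile like this the engine is built once up front for the whole shape range, rather than being deferred to the run stage.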

Environment

TensorRT Version: 10.2.0.19-1+cuda11.8
ONNX-TensorRT Version / Branch: 10.2-GA
GPU Type: A800
Nvidia Driver Version: 535.54.03
CUDA Version: 11.8
CUDNN Version: 8.9.2
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.10.9
TensorFlow + TF2ONNX Version (if applicable):
PyTorch Version (if applicable): 2.4.0+cu118
Baremetal or Container (if container which image + tag):

Here is the source code for model initialization and inference:

import os
import onnx

# model init
model_file = os.path.join(model_dir, "model.onnx")
if quantize:
    model_file = os.path.join(model_dir, "model_quant.onnx")
if not os.path.exists(model_file):
    print(".onnx does not exist, begin to export onnx")
    try:
        from funasr import AutoModel
    except ImportError:
        raise ImportError(
            "You are exporting onnx, please install funasr and try again:\n"
            "\npip3 install -U funasr\n"
            "For users in China, you can install with:\n"
            "\npip3 install -U funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple"
        )

    model = AutoModel(model=model_dir)
    model_dir = model.export(type="onnx", quantize=quantize, **kwargs)

if use_trt:
    # Build a TensorRT backend from the exported ONNX model.
    import onnx_tensorrt.backend as backend
    model = onnx.load(model_file)
    engine = backend.prepare(model, device_id=device_id, verbose=True, **kwargs)
    self.engine = engine

# inference
if self.use_trt:
    print("using trt")
    outputs = self.engine.run([feats, feats_len, language, textnorm])
    return outputs
else:
    outputs = self.ort_infer([feats, feats_len, language, textnorm])
    return outputs
...
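As a temporary workaround, re-creating the backend before each call avoids reusing the backend object that raised AttributeError: parser on its second run, at the cost of rebuilding the (deferred) engine every call. This is only a sketch using the same calls as in the snippet above; TrtRunner is a made-up helper name, not part of onnx-tensorrt.

    import onnx
    import onnx_tensorrt.backend as backend

    class TrtRunner:
        """Hypothetical helper: prepare a fresh onnx-tensorrt backend per call."""

        def __init__(self, model_file, device_id=0):
            self.model = onnx.load(model_file)
            self.device_id = device_id

        def run(self, inputs):
            # With dynamic inputs the engine build is deferred to run() anyway,
            # so re-preparing here mainly avoids the stale backend state seen
            # after the first successful run.
            engine = backend.prepare(self.model, device_id=self.device_id, verbose=True)
            return engine.run(inputs)

    # usage (names taken from the snippet above):
    # runner = TrtRunner(model_file, device_id=device_id)
    # outputs = runner.run([feats, feats_len, language, textnorm])

Whether this is acceptable obviously depends on how expensive the per-call engine build is for this model.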

Relevant Files

Steps To Reproduce