onnx / onnx-tensorrt

ONNX-TensorRT: TensorRT backend for ONNX

Dynamic input cannot be continuously predicted #702

Open tink2123 opened 3 years ago

tink2123 commented 3 years ago

Description

Dynamic-shape inputs cannot be predicted continuously.

I have installed onnx-tensorrt and found that when I need to run prediction on dynamic inputs continuously, multiple inputs named [profile k] are created. When I create the TRT engine only once instead, I run into a shape mismatch:

Traceback (most recent call last):
  File "trt_inference/predict_det.py", line 238, in <module>
    dt_boxes, _ = text_detector(img)
  File "trt_inference/predict_det.py", line 180, in __call__
    outputs = self.predictor.run([img])
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/onnx_tensorrt-7.2.2.3.0-py3.7.egg/onnx_tensorrt/backend.py", line 167, in run
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/onnx_tensorrt-7.2.2.3.0-py3.7.egg/onnx_tensorrt/tensorrt_engine.py", line 138, in run
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/onnx_tensorrt-7.2.2.3.0-py3.7.egg/onnx_tensorrt/tensorrt_engine.py", line 87, in check_input_validity
ValueError: Wrong shape for input 0. Expected (1, 3, 1024, 1024), got (1, 3, 640, 1024).

When I set the minimum and maximum range in opt_profile.set_shape, the following error appears instead:

  File "/workspace/trt_inference/utility.py", line 192, in create_predictor
    engine = backend.prepare(model, device='CUDA:0',verbose=False)
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/onnx_tensorrt-7.2.2.3.0-py3.7.egg/onnx_tensorrt/backend.py", line 247, in prepare
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/onnx_tensorrt-7.2.2.3.0-py3.7.egg/onnx_tensorrt/backend.py", line 91, in __init__
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/onnx_tensorrt-7.2.2.3.0-py3.7.egg/onnx_tensorrt/backend.py", line 138, in _build_engine
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/onnx_tensorrt-7.2.2.3.0-py3.7.egg/onnx_tensorrt/tensorrt_engine.py", line 119, in __init__
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/onnx_tensorrt-7.2.2.3.0-py3.7.egg/onnx_tensorrt/tensorrt_engine.py", line 59, in host_buffer
ValueError: negative dimensions are not allowed
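
This second failure is consistent with the backend allocating a host buffer directly from the engine's binding shape while the dynamic dimensions are still reported as -1 (no concrete shape has been bound yet). A minimal sketch of that failure mode, assuming a NumPy allocation like the one in tensorrt_engine.py's host_buffer:

import numpy as np

# With a ranged optimization profile, the engine reports -1 for dynamic
# dimensions until a concrete binding shape is set on the execution
# context, so allocating a buffer from the raw binding shape fails.
binding_shape = (1, 3, -1, -1)  # e.g. what engine.get_binding_shape() may return
buf = np.empty(binding_shape, dtype=np.float32)
# ValueError: negative dimensions are not allowed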

Environment

TensorRT Version: 7.2
ONNX-TensorRT Version / Branch: 7.2.1
GPU Type: T4
Nvidia Driver Version:
CUDA Version: 10.2
CUDNN Version: 8.0.3
Operating System + Version:

Relevant Files

Steps To Reproduce

import onnx
import onnx_tensorrt.backend as backend

model_file_path = model_dir + "/model.onnx"
model = onnx.load(model_file_path)
engine = backend.prepare(model, device='CUDA:0', verbose=False)

...
for img in imgs:
    outputs = engine.run([img])
    print(outputs[0])
kevinch-nv commented 3 years ago

You can build an optimization profile that covers a range of dimensions by giving set_shape distinct min/opt/max shapes. You can see this link for more info.
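A minimal sketch of building such a profile with the TensorRT 7 Python API; the input name "x" and the shape range are placeholders for the real model:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
# ... parse the ONNX model into `network` with trt.OnnxParser here ...

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30  # 1 GiB

profile = builder.create_optimization_profile()
# One call per dynamic input: (min, opt, max) shapes, elementwise min <= opt <= max
profile.set_shape("x", (1, 3, 320, 320), (1, 3, 640, 640), (1, 3, 1024, 1024))
config.add_optimization_profile(profile)

engine = builder.build_engine(network, config)

The engine can then execute any input whose shape lies inside the [min, max] range, as long as the execution context is told the concrete shape before each run.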

tink2123 commented 3 years ago

Thanks for your reply. I have built the optimization profiles. In my understanding, the prediction engine only needs to be created once, and real images with dynamic shapes can then be fed in at run time, but in practice I run into the error below. How can I solve it?

Traceback (most recent call last):
  File "trt_inference/predict_det.py", line 239, in <module>
    dt_boxes, _ = text_detector(img)
  File "trt_inference/predict_det.py", line 181, in __call__
    outputs = self.predictor.run([img])
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/onnx_tensorrt-7.2.2.3.0-py3.7.egg/onnx_tensorrt/backend.py", line 167, in run
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/onnx_tensorrt-7.2.2.3.0-py3.7.egg/onnx_tensorrt/tensorrt_engine.py", line 145, in run
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/pycuda/gpuarray.py", line 323, in set_async
    return self.set(ary, async_=True, stream=stream)
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/pycuda/gpuarray.py", line 309, in set
    raise ValueError("ary and self must be the same size")
ValueError: ary and self must be the same size

build_engine:

    def _build_engine(self, inputs=None, input_shape=(1, 3, 640, 640)):
        """
        Builds a TensorRT engine with a builder config.
        :param inputs: inputs to the model; if not None, we are building the engine at run time
                       and need to register optimization profiles for some inputs
        :type inputs: List of np.ndarray
        """
        config = self.builder.create_builder_config()
        config.max_workspace_size = self.max_workspace_size
        if inputs:
            opt_profile = self.builder.create_optimization_profile()
            # Set an optimization profile for each input binding with dynamic dimensions
            for i in range(self.network.num_inputs):
                inp_tensor = self.network.get_input(i)
                name = inp_tensor.name
                if -1 in inp_tensor.shape:
                    # min = opt = max = input_shape, so the resulting engine
                    # only accepts this one exact shape
                    opt_profile.set_shape(name, input_shape, input_shape, input_shape)
            config.add_optimization_profile(opt_profile)

        trt_engine = self.builder.build_engine(self.network, config)

        if trt_engine is None:
            raise RuntimeError("Failed to build TensorRT engine from network")
        if self.serialize_engine:
            trt_engine = self._serialize_deserialize(trt_engine)
        self.engine = Engine(trt_engine)

run:

    def run(self, inputs, **kwargs):
        """Execute the prepared engine and return the outputs as a named tuple.
        inputs -- Input tensor(s) as a Numpy array or list of Numpy arrays.
        """
        if isinstance(inputs, np.ndarray):
            inputs = [inputs]

        # Build the engine only on the first call; later calls reuse it,
        # so its profile stays pinned to the first input's shape
        if self.dynamic and self.count < 1:
            self._build_engine(inputs=inputs, input_shape=inputs[0].shape)
            self.count += 1
        outputs = self.engine.run(inputs)
        return outputs
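
For completeness: once the engine is built from a ranged profile, the execution context's binding shape has to be updated to the actual input shape before every inference. A minimal sketch with pycuda, assuming a single float32 NCHW input at binding 0 and one output at binding 1 (these names and indices are illustrative, not the backend's API):

import numpy as np
import pycuda.autoinit  # noqa: F401  (initializes a CUDA context)
import pycuda.driver as cuda

# `engine` is a trt.ICudaEngine built with a min/opt/max profile
context = engine.create_execution_context()

def infer(img):
    """Run one inference on a float32 NCHW array of any profiled shape."""
    # Tell TensorRT the concrete input shape for this call; it must lie
    # within the [min, max] range of the optimization profile.
    context.set_binding_shape(0, img.shape)
    out = np.empty(tuple(context.get_binding_shape(1)), dtype=np.float32)

    d_in = cuda.mem_alloc(img.nbytes)
    d_out = cuda.mem_alloc(out.nbytes)
    cuda.memcpy_htod(d_in, np.ascontiguousarray(img))
    context.execute_v2([int(d_in), int(d_out)])
    cuda.memcpy_dtoh(out, d_out)
    return out

Allocating the device buffers once at the profile's max shape and reusing them across calls avoids the per-call mem_alloc shown here.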