NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Poor performance on trt engine compared to ONNX model #2296

Closed. omri-cavnue closed this issue 2 years ago.

omri-cavnue commented 2 years ago

Description

Environment

• Hardware Platform (Jetson / GPU): Jetson Xavier NX
• DeepStream Version: 6.0
• JetPack Version: JetPack 4.6.1 & L4T 32.6.1
• TensorRT Version: 8.0.1.6-1+cuda10.2

I convert my ONNX model to a TensorRT engine using DeepStream and save the engine file. However, the results are much worse than those of the ONNX model. This is an object detection task, and the mAP on the Jetson is significantly lower than when I run the ONNX model on my desktop.
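One way to check whether the engine itself, rather than the DeepStream pipeline, is at fault is to run the same ONNX model under both ONNX Runtime and TensorRT and compare the raw outputs. The thread doesn't do this, but Polygraphy (part of the TensorRT OSS tools) supports it in one command; a sketch, assuming Polygraphy is installed and the model is saved as model.onnx:

polygraphy run model.onnx --trt --onnxrt

If the two backends agree within tolerance, the accuracy drop most likely comes from pre/post-processing in the pipeline rather than from the engine.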

zerollzeng commented 2 years ago

Can you share a reproduction for this issue? Or, if it's mainly a DeepStream problem, it would be better to ask for help on https://forums.developer.nvidia.com/c/accelerated-computing/intelligent-video-analytics/deepstream-sdk/15.

zerollzeng commented 2 years ago

To check the actual performance, you can use trtexec to measure the inference time and see if it matches what you observe in DeepStream:

/usr/src/tensorrt/bin/trtexec --onnx=model.onnx
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --fp16
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --fp16 --int8
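These three runs measure latency at FP32, FP16, and FP16+INT8 respectively. Note that --int8 without a calibration cache uses placeholder dynamic ranges, so it is meaningful for timing but not for accuracy. trtexec can also time the exact engine DeepStream serialized, for example (the engine filename is an assumption; DeepStream names engines by batch size, GPU, and precision):

/usr/src/tensorrt/bin/trtexec --loadEngine=model_b1_gpu0_fp16.engine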
omri-cavnue commented 2 years ago

Hi @zerollzeng, this turned out to be an issue with DeepStream pre/post-processing that I was able to resolve, so we can close this ticket.
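The thread doesn't record which setting was wrong, but pre-processing mismatches of this kind usually come down to the normalization and color-format fields of the gst-nvinfer configuration, which applies y = net-scale-factor * (x - offsets) to each pixel. An illustrative fragment with placeholder values, which must match how the ONNX model was trained:

[property]
# y = net-scale-factor * (x - offsets); placeholder values only
net-scale-factor=0.00392156862745098
offsets=0.0;0.0;0.0
# 0=RGB, 1=BGR, 2=GRAY; a swapped color format silently lowers mAP
model-color-format=0
# 0=FP32, 1=INT8, 2=FP16
network-mode=2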