NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Incorrect output from TensorRT 10.0.1.6 when running a Conv+Clip structure on an NVIDIA L4 GPU #3938

Closed: yikox closed this issue 3 weeks ago

yikox commented 3 months ago

Description

My model produces incorrect results when accelerated with TensorRT. I traced the problem to a calculation error in the conv+clip graph structure. I created a small model containing only a single Conv and a Clip operator that reproduces the issue.

(screenshots attached)

Environment

TensorRT Version: 10.0.1.6

NVIDIA GPU: NVIDIA L4

NVIDIA Driver Version: 535.129.03

CUDA Version: 11.8

CUDNN Version: libcudnn.so.8.9.6

Operating System:

Python Version (if applicable): 3.10.13

PyTorch Version (if applicable): 2.1.2

Relevant Files

ConvClip.onnx.zip

Steps To Reproduce

  1. Compile the ONNX model to a TensorRT engine
  2. Load an image, resize it to a (1, 3, 512, 512) tensor, and normalize it to [-0.5, 0.5] as the model input
  3. Run inference with TensorRT
  4. Add 0.5 to the TensorRT output to map it back to [0, 1]
  5. Save the output image (see the sketch below)
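
A minimal sketch of these steps using the Polygraphy Python API; the input tensor name ("input"), the file paths, and the assumption that the output has the same (1, 3, 512, 512) layout as the input are placeholders, not taken from the attached model:

    # Reproduction sketch (assumptions: input tensor named "input", output shaped like the input).
    import numpy as np
    from PIL import Image
    from polygraphy.backend.trt import EngineFromNetwork, NetworkFromOnnxPath, TrtRunner

    # 1. Compile the ONNX model to a TensorRT engine (FP32 by default).
    build_engine = EngineFromNetwork(NetworkFromOnnxPath("ConvClip.onnx"))

    # 2. Load an image, resize it to 512x512, and normalize it to [-0.5, 0.5].
    img = Image.open("input.png").convert("RGB").resize((512, 512))
    x = np.asarray(img, dtype=np.float32) / 255.0 - 0.5   # HWC in [-0.5, 0.5]
    x = np.transpose(x, (2, 0, 1))[np.newaxis, ...]       # (1, 3, 512, 512)

    # 3./4. Run inference with TensorRT and shift the output back to [0, 1].
    with TrtRunner(build_engine) as runner:
        outputs = runner.infer(feed_dict={"input": x})    # "input" is an assumed name
        y = next(iter(outputs.values())) + 0.5

    # 5. Save the output image (assumes the output is also (1, 3, 512, 512)).
    y = np.clip(np.squeeze(y) * 255.0, 0, 255).astype(np.uint8)
    Image.fromarray(np.transpose(y, (1, 2, 0))).save("output.png")
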
lix19937 commented 3 months ago

Please use Polygraphy to compare FP32/FP16 against onnxruntime.
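
For example, via the Polygraphy Python API (a sketch of the suggested comparison; the CLI equivalent would be `polygraphy run ConvClip.onnx --trt --onnxrt`, optionally with `--fp16`):

    # Run the same ONNX model through ONNX Runtime and TensorRT and compare outputs.
    from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
    from polygraphy.backend.trt import CreateConfig, EngineFromNetwork, NetworkFromOnnxPath, TrtRunner
    from polygraphy.comparator import Comparator

    onnx_path = "ConvClip.onnx"

    # FP32 TensorRT vs. ONNX Runtime; set fp16=True to repeat the check in FP16.
    runners = [
        OnnxrtRunner(SessionFromOnnx(onnx_path)),
        TrtRunner(EngineFromNetwork(NetworkFromOnnxPath(onnx_path), config=CreateConfig(fp16=False))),
    ]
    run_results = Comparator.run(runners)   # both backends see the same (random) input
    assert bool(Comparator.compare_accuracy(run_results))
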

ttyio commented 1 month ago

@yikox have you tried disabling TF32?

       export NVIDIA_TF32_OVERRIDE=0

https://deeprec.readthedocs.io/en/latest/NVIDIA-TF32.html
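
For builds done through the Python API, TF32 can also be cleared on the builder config (a minimal sketch, assuming the engine is built directly from the attached ONNX file with TensorRT 10):

    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)          # explicit batch (the only mode in TRT 10)
    parser = trt.OnnxParser(network, logger)
    with open("ConvClip.onnx", "rb") as f:
        assert parser.parse(f.read())

    config = builder.create_builder_config()
    config.clear_flag(trt.BuilderFlag.TF32)      # TF32 is enabled by default for FP32 kernels
    engine_bytes = builder.build_serialized_network(network, config)
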

thanks!

moraxu commented 3 weeks ago

@yikox, I will be closing this ticket per our policy of closing tickets with no activity for more than 21 days after a reply has been posted. Please open a new ticket if you still need help.