NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Incorrect output from TensorRT 10.0.1.6 when running a Conv+Clip structure on an NVIDIA L4 GPU #3938

Closed: yikox closed this issue 3 weeks ago

yikox commented 3 months ago

Description

My model produces incorrect results when accelerated with TensorRT. I traced the problem to a calculation error in the conv+clip graph structure. I created a small model containing only a single Conv and a Clip operator that reproduces the issue.

(screenshots attached)

Environment

TensorRT Version: 10.0.1.6

NVIDIA GPU: NVIDIA L4

NVIDIA Driver Version: 535.129.03

CUDA Version: 11.8

CUDNN Version: libcudnn.so.8.9.6

Operating System:

Python Version (if applicable): 3.10.13

PyTorch Version (if applicable): 2.1.2

Relevant Files

ConvClip.onnx.zip

Steps To Reproduce

  1. Compile the ONNX model to a TensorRT engine
  2. Load an image, resize it to a (1, 3, 512, 512) tensor, and normalize it to [-0.5, 0.5] as the model input
  3. Run inference with TensorRT
  4. Add 0.5 to the TensorRT output to map it back to [0, 1]
  5. Save the output image (see the sketch below)
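
A minimal sketch of these steps using the Polygraphy Python API; the input tensor name ("input"), the file paths, and the assumption that the output has the same (1, 3, 512, 512) layout as the input are placeholders, not taken from the attached model:

    # Reproduction sketch (assumptions: input tensor named "input", output shaped like the input).
    import numpy as np
    from PIL import Image
    from polygraphy.backend.trt import EngineFromNetwork, NetworkFromOnnxPath, TrtRunner

    # 1. Compile the ONNX model to a TensorRT engine (FP32 by default).
    build_engine = EngineFromNetwork(NetworkFromOnnxPath("ConvClip.onnx"))

    # 2. Load an image, resize it to 512x512, and normalize it to [-0.5, 0.5].
    img = Image.open("input.png").convert("RGB").resize((512, 512))
    x = np.asarray(img, dtype=np.float32) / 255.0 - 0.5   # HWC in [-0.5, 0.5]
    x = np.transpose(x, (2, 0, 1))[np.newaxis, ...]       # (1, 3, 512, 512)

    # 3./4. Run inference with TensorRT and shift the output back to [0, 1].
    with TrtRunner(build_engine) as runner:
        outputs = runner.infer(feed_dict={"input": x})    # "input" is an assumed name
        y = next(iter(outputs.values())) + 0.5

    # 5. Save the output image (assumes the output is also (1, 3, 512, 512)).
    y = np.clip(np.squeeze(y) * 255.0, 0, 255).astype(np.uint8)
    Image.fromarray(np.transpose(y, (1, 2, 0))).save("output.png")
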
lix19937 commented 3 months ago

Please use Polygraphy to compare FP32/FP16 against onnxruntime.
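
For example, via the Polygraphy Python API (a sketch of the suggested comparison; the CLI equivalent would be `polygraphy run ConvClip.onnx --trt --onnxrt`, optionally with `--fp16`):

    # Run the same ONNX model through ONNX Runtime and TensorRT and compare outputs.
    from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
    from polygraphy.backend.trt import CreateConfig, EngineFromNetwork, NetworkFromOnnxPath, TrtRunner
    from polygraphy.comparator import Comparator

    onnx_path = "ConvClip.onnx"

    # FP32 TensorRT vs. ONNX Runtime; set fp16=True to repeat the check in FP16.
    runners = [
        OnnxrtRunner(SessionFromOnnx(onnx_path)),
        TrtRunner(EngineFromNetwork(NetworkFromOnnxPath(onnx_path), config=CreateConfig(fp16=False))),
    ]
    run_results = Comparator.run(runners)   # both backends see the same (random) input
    assert bool(Comparator.compare_accuracy(run_results))
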

ttyio commented 1 month ago

@yikox have you tried disabling TF32?

       export NVIDIA_TF32_OVERRIDE=0

https://deeprec.readthedocs.io/en/latest/NVIDIA-TF32.html
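
For builds done through the Python API, TF32 can also be cleared on the builder config (a minimal sketch, assuming the engine is built directly from the attached ONNX file with TensorRT 10):

    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)          # explicit batch (the only mode in TRT 10)
    parser = trt.OnnxParser(network, logger)
    with open("ConvClip.onnx", "rb") as f:
        assert parser.parse(f.read())

    config = builder.create_builder_config()
    config.clear_flag(trt.BuilderFlag.TF32)      # TF32 is enabled by default for FP32 kernels
    engine_bytes = builder.build_serialized_network(network, config)
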

thanks!

moraxu commented 3 weeks ago

@yikox, I will be closing this ticket per our policy of closing tickets with no activity for more than 21 days after a reply has been posted. Please open a new ticket if you still need help.