NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Abnormal inference results after converting an ONNX model to a TRT model #4125

Open · tricky61 opened 1 week ago

tricky61 commented 1 week ago

I have converted my ONNX model to TensorRT, but the results are quite strange. My model was trained in mixed precision. When I add the following line, the export produces an FP16 ONNX model (with some layer weights still in FP32):

`@torch.autocast(device_type="cuda", enabled=True)`

When I don't add this line, the export produces an FP32 ONNX model. However, neither the FP16 ONNX model nor the FP32 ONNX model gives the right result after conversion to a TRT model. For example, when I convert the FP32 ONNX model to a TRT model, the result is almost the same with or without `--fp32`, and the same holds for the FP16 ONNX to TRT result. So, where is the problem?
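For context, the export setup being described looks roughly like the sketch below. The module, input shape, opset, and file name are placeholder assumptions, not the actual model code; the point is only where the autocast decorator sits relative to the export.

```python
import torch

class ToyGenerator(torch.nn.Module):
    """Stand-in for the real generator; the real forward pass differs."""
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv1d(80, 1, kernel_size=7, padding=3)

    # Dropping this decorator traces a pure-FP32 graph instead.
    @torch.autocast(device_type="cuda", enabled=True)
    def forward(self, mel):
        return self.conv(mel)

model = ToyGenerator().eval().cuda()
mel = torch.randn(1, 80, 256, device="cuda")  # placeholder mel shape

torch.onnx.export(
    model, (mel,), "model.onnx",
    input_names=["mel"], output_names=["audio"],
    opset_version=17,
    dynamic_axes={"mel": {2: "frames"}},
)
```

With the decorator in place, tracing under autocast inserts FP16 casts into the graph while the parameters stay FP32, which is consistent with the "FP16 ONNX model with some FP32 layer weights" behaviour described above.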

lix19937 commented 1 week ago

You can upload the trtexec --verbose build log.
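For reference, builds that capture the full verbose log could look something like this (engine and log file names are placeholders):

```bash
# FP32 build, with TF32 disabled to rule out TF32 rounding effects
trtexec --onnx=model.onnx --noTF32 --verbose --saveEngine=model_fp32.plan > build_fp32.log 2>&1

# FP16 build
trtexec --onnx=model.onnx --fp16 --verbose --saveEngine=model_fp16.plan > build_fp16.log 2>&1
```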

tricky61 commented 1 week ago

> You can upload the trtexec --verbose build log.

Thanks for your reply. Since it is difficult to send files out from my company's computer, I will try to upload the build log later. The model is https://huggingface.co/nvidia/bigvgan_v2_22khz_80band_256x/tree/main (sorry that it is in torch format). I have compared the ops against another model; the only differences are that this model has a new activation, SnakeBeta, and that it uses FP16 mixed precision. Another model with similar ops and FP32 precision works fine, so I don't know whether the problem is caused by the FP16 precision, by SnakeBeta, or by something else.
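To narrow down whether the divergence comes from the ONNX export (e.g. how SnakeBeta traces) or from the TensorRT build, one option is to first compare the exported ONNX model against the PyTorch model with ONNX Runtime. This is only a sketch; the input name `mel`, the input shape, and the single-output assumption may not match the real export.

```python
import numpy as np
import onnxruntime as ort
import torch

def max_abs_diff(model: torch.nn.Module, onnx_path: str) -> float:
    """Run one random input through both the PyTorch model and its ONNX
    export, and return the largest absolute difference."""
    mel = torch.randn(1, 80, 256)  # placeholder mel-spectrogram shape
    with torch.no_grad():
        ref = model.eval().cuda()(mel.cuda()).float().cpu().numpy()
    sess = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    out = sess.run(None, {"mel": mel.numpy()})[0].astype(np.float32)
    return float(np.abs(ref - out).max())
```

If the ONNX output already disagrees with PyTorch, the export is the likely culprit; if it matches, `polygraphy run model.onnx --trt --onnxrt` can compare the TRT engine against ONNX Runtime directly.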

lix19937 commented 1 week ago

> sorry that it is in torch format

Can you upload the ONNX file?