NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

How to improve the accuracy of an FP16 model? #4168

Open EmmaThompson123 opened 3 weeks ago

EmmaThompson123 commented 3 weeks ago

I used Polygraphy to compare the accuracy of the ONNX FP32 model and the TensorRT FP16 engine with the following command:

polygraphy run weights/model.onnx \
    --onnxrt --trt \
    --workspace=4096M \
    --atol 1e-3 --rtol 1e-3 \
    --verbose \
    --onnx-outputs mark all \
    --trt-outputs mark all \
    --input-shapes input1:[4,1,80,16] input2:[4,6,256,256] \
    > result-run-FP32-MarkAll_2.txt

The output log showed that some nodes failed:

...
[I]         Error Metrics: 532
[I]             Minimum Required Tolerance: elemwise error | [abs=0.0023232] OR [rel=2.0504]
[I]             Absolute Difference | Stats: mean=0.00022035, std-dev=0.00018821, var=3.5422e-08, median=0.00017238, min=0 at (0, 23, 3, 30), max=0.0023232 at (1, 42, 29, 13), avg-magnitude=0.00022035
[I]                 ---- Histogram ----
                    Bin Range            |  Num Elems | Visualization
                    (0       , 0.000232) |     984969 | ########################################
                    (0.000232, 0.000465) |     422592 | #################
                    (0.000465, 0.000697) |     124028 | #####
                    (0.000697, 0.000929) |      31855 | #
                    (0.000929, 0.00116 ) |       7443 | 
                    (0.00116 , 0.00139 ) |       1613 | 
                    (0.00139 , 0.00163 ) |        317 | 
                    (0.00163 , 0.00186 ) |         39 | 
                    (0.00186 , 0.00209 ) |          5 | 
                    (0.00209 , 0.00232 ) |          3 | 
[I]             Relative Difference | Stats: mean=0.0031998, std-dev=0.62169, var=0.3865, median=0.00034169, min=0 at (0, 23, 3, 30), max=755.42 at (3, 23, 10, 6), avg-magnitude=0.0031998
[I]                 ---- Histogram ----
                    Bin Range    |  Num Elems | Visualization
                    (0   , 75.5) |    1572863 | ########################################
                    (75.5, 151 ) |          0 | 
                    (151 , 227 ) |          0 | 
                    (227 , 302 ) |          0 | 
                    (302 , 378 ) |          0 | 
                    (378 , 453 ) |          0 | 
                    (453 , 529 ) |          0 | 
                    (529 , 604 ) |          0 | 
                    (604 , 680 ) |          0 | 
                    (680 , 755 ) |          1 | 
[E]         FAILED | Difference exceeds tolerance (rel=0.001, abs=0.001)
...

Visually comparing the images generated by the two models, there does seem to be a slight difference between them. Are there any ways to improve the accuracy of the FP16 TensorRT model?
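For anyone reading the log above, here is a minimal Python sketch of how the two reported metrics relate. It assumes that the relative difference is the absolute difference divided by the magnitude of the FP32 value (the log does not show which output Polygraphy uses as the denominator) and that an element passes when it meets either tolerance, which is how the "OR" in the "Minimum Required Tolerance" line reads. The function names and the sample values are made up for illustration and are not taken from the model.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def abs_rel_diff(reference: float, candidate: float):
    """Return (absolute difference, relative difference) for one element.

    Assumption: the relative difference is measured against `reference`.
    """
    abs_diff = abs(candidate - reference)
    rel_diff = abs_diff / abs(reference) if reference != 0 else float("inf")
    return abs_diff, rel_diff


def elementwise_ok(reference, candidate, atol=1e-3, rtol=1e-3) -> bool:
    """Assumed check: an element passes if it meets either tolerance."""
    abs_diff, rel_diff = abs_rel_diff(reference, candidate)
    return abs_diff <= atol or rel_diff <= rtol


# Illustrative values only (not taken from the log).
samples = [
    (1.0003, to_fp16(1.0003)),  # value near 1: pure FP16 rounding
    (100.05, to_fp16(100.05)),  # large value: abs error grows, rel stays small
    (3e-6, 3e-6 + 0.0022),      # near-zero reference with accumulated error
]
for reference, candidate in samples:
    abs_diff, rel_diff = abs_rel_diff(reference, candidate)
    print(reference, candidate, abs_diff, rel_diff,
          elementwise_ok(reference, candidate))
```

FP16 keeps 11 significant bits, so a single rounding can introduce a relative error of up to about 2^-11 (roughly 4.9e-4). The last sample shows why the relative metric can reach values in the hundreds when the reference is close to zero even though the absolute error stays small.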

lix19937 commented 3 weeks ago

This error difference seems acceptable. You can set some layers to run in FP32.
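A rough sketch of what setting some layers to FP32 could look like with the TensorRT Python API, assuming you build the engine yourself from a parsed network. The helper name and the layer names passed to it are hypothetical; the calls used (`ILayer.precision`, `ILayer.set_output_type`, `BuilderFlag.OBEY_PRECISION_CONSTRAINTS`) come from the TensorRT Python API, but check them against your TensorRT version.

```python
def pin_layers_to_fp32(network, config, layer_names, trt):
    """Ask TensorRT to run the named layers in FP32 inside an FP16 build.

    network     -- a trt.INetworkDefinition (already populated, e.g. by the ONNX parser)
    config      -- a trt.IBuilderConfig
    layer_names -- names of the layers to keep in FP32 (hypothetical here)
    trt         -- the imported tensorrt module, passed in so that this helper
                   has no import-time dependency of its own

    Returns the names of the layers that were actually found and pinned.
    """
    wanted = set(layer_names)

    # Allow FP16 kernels in general, but make the per-layer requests binding.
    config.set_flag(trt.BuilderFlag.FP16)
    config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

    pinned = []
    for index in range(network.num_layers):
        layer = network.get_layer(index)
        if layer.name not in wanted:
            continue
        # Compute in FP32 and keep the layer outputs in FP32 as well.
        layer.precision = trt.float32
        for out in range(layer.num_outputs):
            layer.set_output_type(out, trt.float32)
        pinned.append(layer.name)
    return pinned
```

It would be called as `pin_layers_to_fp32(network, config, ["some_layer_name"], trt)` after `import tensorrt as trt` and before building the engine. A reasonable starting point for the layer list is the earliest outputs that fail in the Polygraphy comparison. Not every layer type accepts a precision override, so expect to adjust the list.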

yuanyao-nv commented 2 weeks ago

What's the accuracy comparison of the FP32 model?