Different outputs when using engine file generated from trtexec

TrojanXu / yolov5-tensorrt

A tensorrt implementation of yolov5: https://github.com/ultralytics/yolov5

Apache License 2.0


Different outputs when using engine file generated from trtexec #7

Closed makaveli10 closed 4 years ago

makaveli10 commented 4 years ago

I used the command below to generate yolov5_1.engine from the simplified ONNX model:

trtexec --onnx=<onnx_file> --explicitBatch --saveEngine=<tensorRT_engine_file> --workspace=<size_in_megabytes> --fp16

This gives me max_diff > 4, which is far too high. And yes, I set using_half=True before running 'python main.py'. But when I use the same ONNX model without explicitly generating the engine file, max_diff is fine, around 0.001!

I think it's weird!
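For readers comparing the two engines, here is a minimal sketch of how a max_diff figure like the one above could be computed with NumPy. It is an assumption about what the check looks like, not the code in main.py; the helper names `max_abs_diff` and `max_rel_diff` and the sample arrays are hypothetical.

```python
import numpy as np


def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two output tensors."""
    ref = np.asarray(reference, dtype=np.float32)
    cand = np.asarray(candidate, dtype=np.float32)
    if ref.shape != cand.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {cand.shape}")
    return float(np.max(np.abs(ref - cand)))


def max_rel_diff(reference, candidate, eps=1e-6):
    """Largest element-wise difference relative to the reference magnitude."""
    ref = np.asarray(reference, dtype=np.float32)
    cand = np.asarray(candidate, dtype=np.float32)
    if ref.shape != cand.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {cand.shape}")
    return float(np.max(np.abs(ref - cand) / (np.abs(ref) + eps)))


if __name__ == "__main__":
    # Hypothetical single detection row: x, y, w, h, confidence.
    ref = np.array([[320.0, 240.0, 64.0, 48.0, 0.9]])
    cand = ref + np.array([[4.0, 0.0, 0.0, 0.0, 0.001]])
    print("max abs diff:", max_abs_diff(ref, cand))
    print("max rel diff:", max_rel_diff(ref, cand))
```

Reporting the relative difference next to the absolute one helps separate a 4-pixel shift on a coordinate in the hundreds from a comparable error on a confidence score, which would be far more serious.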

TrojanXu commented 4 years ago

In my test, I get the same results from a prebuilt engine and a runtime-built engine in FP16 mode, but the diff is around 4, which is similar to your case. I think the max diff comes from the xyhw outputs, and I can't statistically evaluate its overall impact on the output after NMS. From my experiments, different workspace sizes might affect the max diff. FP16 numerical differences are tricky to debug, though.
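One reason box coordinates are a plausible source of the largest absolute error is that FP16 has only 10 explicit mantissa bits, so the gap between adjacent representable values grows with magnitude. The sketch below uses NumPy's `np.spacing` to show this; the sample magnitudes are illustrative assumptions, not values taken from the model, and the helper name `fp16_spacing` is hypothetical.

```python
import numpy as np


def fp16_spacing(value):
    """Gap between `value` and the next larger representable float16."""
    return float(np.spacing(np.float16(value)))


if __name__ == "__main__":
    # Illustrative magnitudes: a confidence score, a small box size,
    # and a pixel coordinate in a typical input resolution.
    for value in (0.5, 64.0, 600.0):
        print(f"value {value:>6}: fp16 spacing {fp16_spacing(value)}")
```

A single rounding step at pixel-scale magnitudes costs at most a fraction of a pixel, so it would not account for a diff of 4 on its own; the sketch only shows why absolute error is naturally larger on coordinates than on scores, with errors presumably accumulating across layers.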

makaveli10 commented 4 years ago

So, the default mode is fp32. I guess you are suggesting that we use fp32 for verifying numerical results?

TrojanXu commented 4 years ago

For FP32, the final result is expected to differ very little, based on the current diff information. For FP16, the numerical difference is relatively large. Still, I can't say this model is useless in FP16 just because the max diff is as large as 4; what we want to know is whether the AP score on the given evaluation dataset is acceptable. I've added this to the README limitations and will try to find out whether it can be improved in the future.
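To illustrate why a max diff of 4 does not by itself decide whether detections are still usable, here is a small sketch that measures how much a 4-pixel shift changes the IoU of a box with its unshifted copy. The box sizes are made-up examples and the `iou` helper is hypothetical; this is not a substitute for measuring AP on the evaluation dataset.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shift_x(box, dx):
    """Return the box moved horizontally by dx pixels."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1, x2 + dx, y2)


if __name__ == "__main__":
    large = (100.0, 100.0, 200.0, 200.0)  # 100 x 100 box (illustrative)
    small = (100.0, 100.0, 116.0, 116.0)  # 16 x 16 box (illustrative)
    for name, box in (("large", large), ("small", small)):
        print(name, "IoU after 4 px shift:", round(iou(box, shift_x(box, 4.0)), 3))
```

Under these assumptions the large box keeps a high overlap while the small box loses much more, which suggests that the effect on AP depends on where the error lands and on object size, so only an evaluation run can settle it.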