wang-xinyu / tensorrtx

Implementation of popular deep learning networks with TensorRT network definition API
MIT License

RetinaFace, INT8 calibration, TensorRT 8.6.1 error: [pluginV2Runner.cpp::execute::265] Error Code 2: Internal Error (Assertion status == kSTATUS_SCUESS failed.) #1456

Closed ohadjerci closed 6 months ago

ohadjerci commented 8 months ago

Env

About this repo

- repo: wang-xinyu/tensorrtx/retinaface
- model: retinaface

Hello,

The FP16 engine works, but with lower accuracy and some warnings on TensorRT 8.6.1.6.

On the other hand, INT8 calibration with TensorRT 8.6.1.6 leads to errors.

First, I tried to generate an engine from a model trained in half precision. Second, I reorganized the code like yolov9/yolov7, but without success.

I'm hoping someone can tell me more about this error message, or point me to documents that explain it. Does it mean that the build failed while processing a plugin, or because of the interpolation scale?

Any suggestion is highly appreciated. Thanks in advance.
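For context on what INT8 calibration actually produces, here is a toy sketch in pure NumPy of symmetric per-tensor scale selection. This is an illustration only, not TensorRT's implementation: the real `IInt8EntropyCalibrator2` picks the clipping threshold by minimizing KL divergence over activation histograms, but the end result is the same kind of scale factor.

```python
import numpy as np

def int8_scale(activations: np.ndarray, percentile: float = 99.99) -> float:
    """Toy symmetric calibration: clip at a high percentile of |activation|
    and map that threshold onto the int8 range [-127, 127]."""
    threshold = np.percentile(np.abs(activations), percentile)
    return float(threshold) / 127.0

def quantize_int8(x: np.ndarray, scale: float) -> np.ndarray:
    """Quantize to int8 using the calibrated per-tensor scale."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

rng = np.random.default_rng(0)
acts = rng.normal(0.0, 1.0, size=100_000).astype(np.float32)
scale = int8_scale(acts)
q = quantize_int8(acts, scale)
deq = q.astype(np.float32) * scale  # dequantize to inspect the error
print("scale:", scale)
print("max abs error:", np.max(np.abs(acts - deq)))
```

If calibration fails partway through (as in the assertion above), some tensors end up with no valid scale, which is fatal for engine building.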

wang-xinyu commented 8 months ago

What did you mean by this? 【The FP16 engine is working but with less performance】

You mean the latency is higher than int8? So you want to use int8?

ohadjerci commented 8 months ago

> What did you mean of this? 【The FP16 engine is working but with less performance】

The engine is built, but with a higher false-negative rate.

> You mean the latency is higher than int8? So that you want to use int8?

The error "Assertion status == kSTATUS_SCUESS failed" prevents the engine from being built at all.

wang-xinyu commented 8 months ago

> The engine is built but with a higher false-negative rate

If the FP16 accuracy is not as expected, then it's meaningless to try INT8.

Can you try FP32? And also try a lower version of TensorRT, e.g. 8.4.

ohadjerci commented 8 months ago

> Can you try fp32? And also try lower version of tensorrt. i.e. 8.4

Yes, INT8 calibration for RetinaFace works with previous TensorRT versions, but I would like to know the reason for the error "Assertion scales.size() == 1 failed" on TensorRT 8.6.1. I tried yolov7 and yolov9 with the same TensorRT version, and INT8 calibration works there.
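A "scales"-related assertion usually points at a resize/interpolation layer. The RetinaFace FPN top-down path upsamples feature maps by a single uniform spatial factor; the sketch below shows, in NumPy, the nearest-neighbor upsample that operation computes (illustrative only, not the repo's plugin or TensorRT's `IResizeLayer`, which expects a consistent set of scale factors).

```python
import numpy as np

def upsample_nearest(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Nearest-neighbor upsampling of an NCHW tensor by one uniform
    spatial scale factor, as an FPN top-down path does."""
    return x.repeat(scale, axis=2).repeat(scale, axis=3)

x = np.arange(4, dtype=np.float32).reshape(1, 1, 2, 2)
y = upsample_nearest(x, 2)
print(y.shape)  # (1, 1, 4, 4)
```

If the network definition feeds the resize a scale per dimension where a single scale is expected (or vice versa), stricter validation in newer TensorRT versions could plausibly trip an assertion like the one quoted above.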

ohadjerci commented 8 months ago

One more piece of information: we can no longer run the RetinaFace code on Ubuntu 22.04, because only TensorRT 8.6.1 is compatible with it.

ohadjerci commented 8 months ago

To reproduce the error, use the Docker image "nvcr.io/nvidia/tensorrt:24.01-py3", install OpenCV, and run the INT8 calibration from https://github.com/wang-xinyu/tensorrtx/tree/master/retinaface

wang-xinyu commented 8 months ago

Hi @ohadjerci The code was developed on TRT 7.x. I guess some operations/layers are deprecated in TRT 8.6. As of now, we don't have a plan to upgrade the code to support TRT 8.6. It would be great if you could debug and solve the issue.

ohadjerci commented 6 months ago

No, the issue is not related to deprecated operations. The solution is to retrain the model with bf16 or fp16 precision and replace the decode plugin with CPU code.

I also recommend using the ONNX parser with TRT 10.
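Moving the decode out of the plugin can be done with a short NumPy routine. The sketch below implements the standard SSD-style box decode that RetinaFace uses; it is a sketch, not the repo's exact plugin logic, and the variances (0.1, 0.2 are the common RetinaFace defaults) and tensor layout must match your training code.

```python
import numpy as np

def decode_boxes(loc: np.ndarray, priors: np.ndarray,
                 variances=(0.1, 0.2)) -> np.ndarray:
    """SSD-style box decoding on the CPU instead of inside a TensorRT
    plugin. `loc` (predicted offsets) and `priors` are (N, 4) arrays in
    (cx, cy, w, h) form; returns (N, 4) boxes as (x1, y1, x2, y2)."""
    centers = priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:]
    sizes = priors[:, 2:] * np.exp(loc[:, 2:] * variances[1])
    return np.concatenate([centers - sizes / 2, centers + sizes / 2], axis=1)

priors = np.array([[0.5, 0.5, 0.2, 0.2]], dtype=np.float32)
loc = np.zeros((1, 4), dtype=np.float32)  # zero offsets -> box equals prior
boxes = decode_boxes(loc, priors)
print(boxes)  # [[0.4 0.4 0.6 0.6]]
```

Running the decode on the CPU also sidesteps plugin precision issues entirely, since the engine then only outputs raw `loc`/`conf` tensors.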