mikel-brostrom closed this issue 1 year ago
Btw, I see quite a lot of people asking about exported model results, so I thought I could post mine here. I exported the models with the decode-in-inference flag in order to minimize post-processing of the model output. I also had to build a multi-backend class that supports inference for all of the exported models, so that the comparison is meaningful: every model runs through exactly the same evaluation pipeline available in this repo. My results are as follows:
| Model | size | mAP<sup>val</sup> 0.5:0.95 | mAP<sup>val</sup> 0.5 |
|---|---|---|---|
| YOLOX-nano PyTorch | 416 | 0.256 | 0.411 |
| YOLOX-nano ONNX | 416 | 0.256 | 0.411 |
| YOLOX-nano TFLite FP32 | 416 | 0.256 | 0.411 |
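The multi-backend class mentioned above could look roughly like the sketch below. This is a hypothetical illustration, not the repo's actual class: the name `MultiBackendDetector`, the suffix-based dispatch, and the `predict` contract are all assumptions about how one might route every exported format through a single evaluation pipeline.

```python
import numpy as np

class MultiBackendDetector:
    """Hypothetical sketch of a wrapper that routes inference for every
    exported YOLOX model through one interface, so a single evaluation
    pipeline can score all backends identically."""

    def __init__(self, weights: str):
        self.backend = self._detect_backend(weights)

    @staticmethod
    def _detect_backend(path: str) -> str:
        # Dispatch purely on the file suffix of the exported weights.
        if path.endswith(".onnx"):
            return "onnx"     # would build onnxruntime.InferenceSession(path)
        if path.endswith(".tflite"):
            return "tflite"   # would build tf.lite.Interpreter(model_path=path)
        if path.endswith((".pt", ".pth")):
            return "pytorch"  # would torch.load(path) and model.eval()
        raise ValueError(f"unsupported weights file: {path}")

    def predict(self, img: np.ndarray) -> np.ndarray:
        # Every backend must honour the same output contract, e.g. a
        # (1, num_predictions, 85) array for COCO, so the mAP evaluation
        # code never needs to know which runtime produced the output.
        raise NotImplementedError("backend-specific inference goes here")
```

The key design point is the shared output contract: as long as each backend branch returns identically shaped, identically scaled predictions, the mAP numbers in the table above are directly comparable.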
For anybody interested in why this is the case, we are discussing this here: https://github.com/PINTO0309/onnx2tf/issues/244
This seems to be a known critical TF issue. Basically, all quantized models break when exported to TFLite via: PyTorch -- (torch.onnx.export) --> ONNX -- (onnx2tf or onnx-tf) --> TFLite. I am not sure whether this only affects models exported through this pipeline or whether it happens in general. Maybe somebody knows?
Since Float32 works fine, it is odd that only the INT8 model breaks, given that the Keras model object used to generate the INT8 model in the backend of the tool is the same. YOLOv8 broke in the same way. Thus, I can presume that it is not an issue with the conversion flow PyTorch -> ONNX -> TFLite.
Thanks for the insights @PINTO0309 :smile:
For the benefit of other engineers, I will also post in this thread the workaround needed to eliminate the accuracy degradation caused by quantization. It seems that we need to significantly rethink the activation function, etc., and define another YOLOX-alpha-like model, which is no longer YOLOX, to make it work. Thus, differences in the conversion route were not related to the accuracy degradation. `SiLU` (Swish) was found to significantly degrade the accuracy of the model during quantization. As an additional research reference, `HardSwish` also seems to cause significant accuracy degradation during quantization, just as `SiLU` (Swish) does.
It is a matter of model structure. The activation function, the kernel size and stride for `Pooling`, and the kernel size and stride for `Conv` should be completely revised. See: https://github.com/PINTO0309/onnx2tf/issues/244#issuecomment-1475128445
e.g. YOLOX-Nano https://github.com/TexasInstruments/edgeai-yolox

| Before | After |
|---|---|
| `Swish`/`SiLU` | `ReLU` |
| `DepthwiseConv2D` | `Conv2D` |
| `MaxPool`, kernel_size=5x5,9x9,13x13 | `MaxPool`, kernel_size=3x3 |
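One way to build intuition for the `SiLU` → `ReLU` swap is a small numpy experiment. The sketch below is illustrative only, under the assumption of per-tensor affine int8 quantization (one scale/zero-point pair per tensor, as TFLite full-integer models use); `quantize_int8` and the input range are my own choices, not onnx2tf internals. It shows that SiLU's small negative lobe (minimum around -0.28) occupies only a sliver of the tensor's dynamic range, so only a handful of the 256 int8 levels are left to represent it.

```python
import numpy as np

def silu(x):
    """SiLU/Swish: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def quantize_int8(x):
    # Per-tensor affine int8 quantization: a single (scale, zero_point)
    # pair maps the whole tensor onto the integers [-128, 127].
    scale = (x.max() - x.min()) / 255.0
    zero_point = np.round(-128 - x.min() / scale)
    q = np.clip(np.round(x / scale + zero_point), -128, 127).astype(np.int32)
    return q, scale, zero_point

x = np.linspace(-8.0, 8.0, 100_000)   # assumed pre-activation range
y = silu(x)
q, scale, zp = quantize_int8(y)

# Count how many distinct int8 levels represent SiLU's negative outputs.
neg_levels = np.unique(q[y < 0]).size
print(f"int8 levels spent on SiLU's negative lobe: {neg_levels} / 256")
```

With ReLU the problem disappears by construction: negative pre-activations all map to exactly zero, so no quantization levels are wasted on a thin tail, which is consistent with the table above.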
### Float32 - YOLOX-Nano
```
(1, 52, 52, 85)
array([[[
    [ 0.971787, 0.811184, 0.550566, ..., -5.962632, -7.403673, -6.735206],
    [ 0.858804, 1.351296, 1.231673, ..., -6.479690, -8.277064, -7.664936],
    [ 0.214827, 1.035119, 1.458006, ..., -6.291425, -8.229385, -7.761562],
    ...,
    [ 0.450116, 1.391900, 1.533354, ..., -5.672194, -7.121591, -6.880231],
    [ 0.593133, 2.112723, 0.968755, ..., -6.150078, -7.370633, -6.874294],
    [ 0.088263, 1.985220, 0.619998, ..., -5.507928, -6.914980, -6.234259]]]]),
```
### INT8 - YOLOX-Nano
```
(1, 52, 52, 85)
array([[[
    [ 0.941908, 0.770652, 0.513768, ..., -5.993958, -7.449634, -6.850238],
    [ 0.856280, 1.284420, 1.198792, ..., -6.507727, -8.391542, -7.792146],
    [ 0.256884, 0.941908, 1.455676, ..., -6.336471, -8.305914, -7.877774],
    ...,
    [ 0.342512, 1.370048, 1.541304, ..., -5.737075, -7.192750, -7.107122],
    [ 0.513768, 2.226327, 1.027536, ..., -6.165215, -7.449634, -7.021494],
    [ 0.085628, 2.055072, 0.685024, ..., -5.480191, -7.021494, -6.422099]]]]),
```
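For a rough sense of how closely the repaired INT8 model tracks Float32, the visible corner values of the two dumps can be diffed directly (only the printed values are used here; the elided middle of each tensor is not reconstructed):

```python
import numpy as np

# Visible corner values copied from the Float32 dump above.
f32 = np.array([
    [0.971787, 0.811184, 0.550566, -5.962632, -7.403673, -6.735206],
    [0.858804, 1.351296, 1.231673, -6.479690, -8.277064, -7.664936],
    [0.214827, 1.035119, 1.458006, -6.291425, -8.229385, -7.761562],
    [0.450116, 1.391900, 1.533354, -5.672194, -7.121591, -6.880231],
    [0.593133, 2.112723, 0.968755, -6.150078, -7.370633, -6.874294],
    [0.088263, 1.985220, 0.619998, -5.507928, -6.914980, -6.234259],
])
# Visible corner values copied from the INT8 dump above.
i8 = np.array([
    [0.941908, 0.770652, 0.513768, -5.993958, -7.449634, -6.850238],
    [0.856280, 1.284420, 1.198792, -6.507727, -8.391542, -7.792146],
    [0.256884, 0.941908, 1.455676, -6.336471, -8.305914, -7.877774],
    [0.342512, 1.370048, 1.541304, -5.737075, -7.192750, -7.107122],
    [0.513768, 2.226327, 1.027536, -6.165215, -7.449634, -7.021494],
    [0.085628, 2.055072, 0.685024, -5.480191, -7.021494, -6.422099],
])

# Worst-case deviation across the visible entries.
print("max abs deviation:", np.abs(f32 - i8).max())
```

On these entries the worst deviation is on the order of a few tenths, i.e. the INT8 output stays close to Float32 instead of collapsing, which is the behavior the restructured model was meant to achieve.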
This got solved here: https://github.com/PINTO0309/onnx2tf/issues/269. Closing this down!
I have managed to generate `dynamic_range_quant`, `full_integer_quant` and `integer_quant` versions of the TFLite model using `onnx2tf`. However, the post-processing fails for some reason: the confidences are so low that none of the predictions passes the filtering. Any idea what the problem could be? The `float16` and `float32` TFLite models work as usual, achieving the results in the table above. Has anybody tried `onnx2tf` and got the quantized models working?
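For anyone debugging the "all confidences tiny" symptom, a minimal numpy sketch of a YOLOX-style confidence filter is below. The `(N, 85)` layout (`cx, cy, w, h, obj`, then 80 class scores) and the `obj * max class` score are the usual YOLOX convention; the function name and threshold are illustrative. Two assumptions worth ruling out before blaming the model: applying a sigmoid in post-processing when the decode-in-inference export already applied it, and reading an int8 output tensor without applying its dequantization scale and zero point; either would shrink every score fed into a filter like this.

```python
import numpy as np

def filter_predictions(preds: np.ndarray, conf_thres: float = 0.25) -> np.ndarray:
    """Keep predictions whose YOLOX-style score passes the threshold.

    preds: (N, 85) array laid out as [cx, cy, w, h, obj, 80 class scores].
    The score of each prediction is objectness * best class probability.
    """
    scores = preds[:, 4] * preds[:, 5:].max(axis=1)
    return preds[scores > conf_thres]

# Tiny deterministic example: only the first row should survive.
preds = np.zeros((3, 85))
preds[0, 4], preds[0, 5] = 0.9, 0.8   # score 0.72 -> kept
preds[1, 4], preds[1, 5] = 0.9, 0.1   # score 0.09 -> dropped
preds[2, 4], preds[2, 5] = 0.2, 0.9   # score 0.18 -> dropped
kept = filter_predictions(preds, conf_thres=0.25)
print(kept.shape)  # (1, 85)
```

Printing `scores.max()` on the real quantized output before filtering tells you quickly whether the scores are genuinely near zero (a dequantization or double-sigmoid issue) or merely below an overly strict threshold.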