tensorflow / models

Models and examples built with TensorFlow

SSD MobileNet V2 TFLite model output without quantization is correct, but post-training quantized model inference output is wrong #10618

Open Ramsonjehu opened 2 years ago

Ramsonjehu commented 2 years ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/tree/master/research/object_detection

2. Describe the bug

I have trained a custom object detection model with the Object Detection API, starting from the pretrained ssd_mobilenet_v2_320x320_coco17_tpu-8 checkpoint. The trained model was exported to a TFLite-friendly intermediate SavedModel, which TFLiteConverter then converts to TFLite format. The converted TFLite model's inference is correct, but if I apply post-training quantization, the inference output is wrong. Download my repository and run the Jupyter notebook to see the issue I'm facing; a sketch of the conversion flow is shown below.
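For reference, a minimal sketch of the conversion flow described above, assuming the TFLite-friendly SavedModel exported by export_tflite_graph_tf2.py sits in a local `saved_model/` directory; the path and the `representative_images` iterable are placeholders, not code from the reporter's repository:

```python
import tensorflow as tf

# Plain float conversion -- the path reported to work correctly.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")  # assumed path
tflite_float_model = converter.convert()

# Post-training quantization -- the path that produces wrong detections.
# A representative dataset of preprocessed training images lets the converter
# calibrate activation ranges; `representative_images` is a hypothetical
# iterable of float32 tensors shaped like the model input.
def representative_dataset():
    for image in representative_images:
        yield [image]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
tflite_quant_model = converter.convert()
```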

3. Steps to reproduce

Run the Jupyter notebook in the repository to reproduce the issue, then compare the non-quantized model output with the quantized model output.

4. Expected behavior

The inference result of the quantized model should be correct.

5. Additional context

Non-quantized model output:

```
tflite_detection_scores: [0.88126606 0.8636228 0.8444784 0.83444893 0.82267493 0.7586309 0.72747076 0.64393556 0.4792308 0.37496698 0.35456866 0.21228562 0.17112313 0.16550326 0.16282031 0.13535884 0.12606364 0.12315518 0.09494544 0.08832499]
tflite_detection_classes: [3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3]
tflite_num_detections: 20
tflite_detection_boxes:
[0.13898835 0.5581504  0.2226857  0.64273024]
[0.37905622 0.51477677 0.57565093 0.6674817 ]
[0.10336263 0.36700267 0.16064209 0.4233514 ]
[0.17918263 0.47067225 0.25840816 0.5481311 ]
[0.19362032 0.26374823 0.28584093 0.33670342]
[0.19328472 0.03578638 0.2740516  0.08890636]
[0.38459224 0.16602626 0.57033443 0.27695608]
[0.08592343 0.43386286 0.14284348 0.48653913]
[0.0707082  0.43492046 0.12961109 0.48729852]
[0.07322308 0.05297361 0.12850046 0.09446552]
[0.05345578 0.34245595 0.10406741 0.3869144 ]
[0.19365428 0.0161516  0.26964656 0.04439866]
[0.03531459 0.19652161 0.08658443 0.23899463]
[0.05851648 0.00280525 0.116818   0.0176958 ]
[0.01804176 0.2531051  0.06837644 0.2983545 ]
[0.01627679 0.0013474  0.06487057 0.03884105]
[0.02952993 0.17788893 0.0826961  0.22692427]
[ 0.08686746 -0.00028265  0.1465877   0.00760585]
[0.02422631 0.22625658 0.07527018 0.2654607 ]
[0.94961053 0.0057383  0.99573547 0.02748624]
```

(screenshot: non-quantized output)

Quantized model output:

```
tflite_detection_scores: [0.6016884 0.33555698 0.248775 0.22563314 0.21116948 0.18802762 0.17645669 0.15331483 0.1417439 0.13017297 0.1243875 0.11860204 0.11860204 0.11281657 0.11281657 0.11281657 0.10992384 0.10992384 0.10992384 0.10992384]
tflite_detection_classes: [2 2 2 2 3 3 3 3 3 3 3 3 2 3 3 2 3 3 3 3]
tflite_num_detections: 20
tflite_detection_boxes:
[-0.00831555  0.06652438  1.0061812   1.0061812 ]
[-0.01247332  0.25778198  0.65277046  0.9812346 ]
[0.07068215 0.4615129  1.0477589  0.9978657 ]
[ 0.41993514 -0.00831555  0.9978657   1.0061812 ]
[0.43240845 0.84818584 0.6694016  0.9978657 ]
[0.6361394 0.9271835 1.0477589 0.9895501]
[0.18709981 0.9354991  0.59040385 0.9937079 ]
[0.1579954  0.87313247 0.49893284 1.0020235 ]
[0.661086  0.8689747 0.9853924 1.0020235]
[0.00831555 0.9521302  0.04573551 0.9937079 ]
[-0.00415777  0.8565014   0.21204646  1.0020235 ]
[0.29104415 0.8648169  0.5654572  1.0020235 ]
[0.3326219  0.38667294 0.9479724  0.9770768 ]
[0.01247332 0.00415777 0.05405106 0.16215317]
[0.33677965 0.9271835  0.6652438  0.9937079 ]
[0.03326219 0.00831555 1.0020235  0.4116196 ]
[0.00831555 0.8648169  0.04989328 1.0020235 ]
[0.         0.9313413  0.42409292 0.9895501 ]
[0.29520193 0.57793057 0.49893284 0.8980791 ]
[0.61950827 0.00415777 1.0311279  0.06652438]
```

(screenshot: quantized output)

6. System information

vyang2968 commented 2 years ago

@Ramsonjehu I would like to help with this issue, as I am trying to achieve the same thing you are, but I have not been able to reproduce it with a custom model based on ssd_mobilenet_v2_320x320_coco17_tpu-8. I am able to train, evaluate, and export with export_tflite_graph_tf2.py, but when I try to do the post-training quantization, I get the following error: `ValueError: Shapes (1, 1, 576, 273) and (1, 1, 576, 9) are incompatible`. It would be great if you could share your training process so that I can replicate your issue.
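A hedged aside, not from the thread: the depth of an SSD class-predictor layer is anchors_per_location × (num_classes + 1), and 273 = 3 × (90 + 1) matches the 90-class COCO checkpoint while 9 = 3 × (2 + 1) matches a 2-class custom config, so this error typically points to restoring the COCO checkpoint into a model configured for a different number of classes. One way to inspect the active config with the Object Detection API's own utilities (the pipeline.config path is an assumption):

```python
from object_detection.utils import config_util

# Load the pipeline config actually used for training/export.
configs = config_util.get_configs_from_pipeline_file("pipeline.config")

# num_classes must match the classes in your label map, and
# fine_tune_checkpoint_type should be "detection" when starting
# from a COCO-pretrained detection checkpoint.
print("num_classes:", configs["model"].ssd.num_classes)
print("fine_tune_checkpoint_type:",
      configs["train_config"].fine_tune_checkpoint_type)
```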

UcefMountacer commented 1 year ago

The output of the quantized model is in INT8 format: values run from 0 to 255 instead of 0.0 to 1.0 (the detection scores, for example). I am myself looking to adapt the post-processing code or find a code snippet.
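Building on this observation (and note that every box coordinate in the quantized output above appears to be a multiple of ≈0.00416, consistent with values that have passed through an int8 scale), here is a minimal sketch of reading a TFLite model's quantization parameters and dequantizing its outputs by hand. The model filename and the zero-filled placeholder input are assumptions:

```python
import numpy as np
import tensorflow as tf

# Load the quantized model; the filename is an assumption.
interpreter = tf.lite.Interpreter(model_path="model_quant.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Quantize the input the same way the converter expects, if the input
# tensor is itself quantized (scale == 0 means "not quantized").
image = np.zeros(input_details[0]["shape"], dtype=np.float32)  # placeholder input
in_scale, in_zero = input_details[0]["quantization"]
if in_scale != 0:
    image = (image / in_scale + in_zero).astype(input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], image)
interpreter.invoke()

# Dequantize each output back to float using its scale and zero point.
outputs = []
for detail in output_details:
    raw = interpreter.get_tensor(detail["index"])
    scale, zero = detail["quantization"]
    outputs.append((raw.astype(np.float32) - zero) * scale if scale != 0 else raw)
```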

saeedadeeb103 commented 1 year ago

> I am able to train, evaluate, and export with export_tflite_graph_tf2.py, but when I try to do the post-training quantization, I get the following error: `ValueError: Shapes (1, 1, 576, 273) and (1, 1, 576, 9) are incompatible`. It would be great if you could share your training process so that I can replicate your issue.

Is there any way you can share your code for post-training quantization?

vyang2968 commented 1 year ago

@saeedadeeb103 I apologize for the late response; here is my Colab notebook: https://colab.research.google.com/drive/1y0MpO6FY9k8Og94bKRO7A-Vz9808pz7x?usp=sharing

stefanoinference commented 4 months ago

@vyang2968 I have the same bad result after quantizing the model. I have requested access to your Colab notebook, if possible. Thanks, Stefano