After TF Lite quantization, the size of the YOLOv4-tiny model is indeed reduced, but the latency increases: up to 2-3x with dynamic-range quantization, and up to 4-5x with full-integer (int8) quantization. I tested it on desktop Linux (x86-64) and a Raspberry Pi 3 (armv7), with the same result on both. Could the problem be that the TF Lite optimizer doesn't support some of YOLOv4-tiny's layers?
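For reference, here is roughly the conversion path I followed, as a minimal sketch using the standard `tf.lite.TFLiteConverter` API. The `yolov4_tiny_saved_model` path, the 416x416 input shape, and the random calibration data are placeholders for my actual setup:

```python
import tensorflow as tf

# Dynamic-range quantization: weights are stored as int8,
# activations remain float at inference time.
converter = tf.lite.TFLiteConverter.from_saved_model("yolov4_tiny_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_dynamic = converter.convert()

# Full-integer (int8) quantization: requires a representative
# dataset so the converter can calibrate activation ranges.
def representative_dataset():
    for _ in range(100):
        # Placeholder calibration input; in practice, real images
        # matching the model's input shape (1, 416, 416, 3).
        yield [tf.random.uniform((1, 416, 416, 3), dtype=tf.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("yolov4_tiny_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_int8 = converter.convert()

with open("yolov4_tiny_int8.tflite", "wb") as f:
    f.write(tflite_int8)
```

Both converted models produce valid detections; only the latency is worse than the float32 baseline.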