
Low precision inference for Darknet models. Convert model to 16b #762

Open Rus-L opened 6 years ago

Rus-L commented 6 years ago

This fork contains interesting results of model conversion: https://github.com/gplhegde/darknet/tree/master/extra Can we expect a similar implementation in your fork? I think it would be very helpful for running YOLOv3 on the Raspberry Pi.

AlexeyAB commented 6 years ago

This extension to the Darknet framework aims at providing low precision inference for Darknet models. Currently the underlying arithmetic is still in floating point. However, the model file can be stored using an 8-bit or 16-bit format, which helps to reduce the model size by a factor of 4 or 2 respectively.

It doesn't speed up detection, because the arithmetic is still in FP32, and it reduces accuracy, as you can see on the charts: https://github.com/gplhegde/darknet/tree/master/extra

It will only reduce the size of the yolov2.weights file from 194 MB to 49 MB; it will decrease accuracy, but it won't reduce RAM usage and won't increase speed.
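To make that distinction concrete, here is a minimal C++ sketch (an assumption about the scheme, not the gplhegde code) of storing weights in INT8 with a per-layer scale and expanding them back to FP32 at load time: the file shrinks by roughly 4x, but the network still allocates and computes in FP32, so RAM and speed are unchanged.

```cpp
#include <cstdint>
#include <cmath>
#include <vector>
#include <algorithm>

// Quantize FP32 weights to INT8 for storage only: the file on disk
// shrinks ~4x, but nothing about the runtime arithmetic changes.
std::vector<int8_t> quantize_for_storage(const std::vector<float>& w, float& scale) {
    float max_abs = 0.f;
    for (float v : w) max_abs = std::max(max_abs, std::fabs(v));
    scale = (max_abs > 0.f) ? max_abs / 127.f : 1.f;   // per-layer scale saved next to the weights
    std::vector<int8_t> q(w.size());
    for (size_t i = 0; i < w.size(); ++i)
        q[i] = static_cast<int8_t>(std::lround(w[i] / scale));
    return q;
}

// At load time the INT8 values are expanded back to FP32,
// so convolutions still run in full floating point.
std::vector<float> dequantize_at_load(const std::vector<int8_t>& q, float scale) {
    std::vector<float> w(q.size());
    for (size_t i = 0; i < q.size(); ++i)
        w[i] = q[i] * scale;
    return w;
}
```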

Because the Raspberry Pi 3 GPU doesn't fully support OpenCL (as far as I know), it will be much faster to run Yolo-v2 / Tiny-yolo-v2 on the CPU using the OpenCV 3.4.0 dnn module, to which I added Darknet support - examples:
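The linked examples are not reproduced here, but a minimal C++ sketch of running a Darknet model through the OpenCV dnn module looks roughly like this (file names and input size are placeholders; decoding of the output is only hinted at):

```cpp
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>

int main() {
    // Load a Darknet cfg/weights pair (paths are placeholders).
    cv::dnn::Net net = cv::dnn::readNetFromDarknet("tiny-yolo-voc.cfg",
                                                   "tiny-yolo-voc.weights");

    // Prepare the input blob: scale to [0,1], resize to the network
    // input resolution, swap BGR to RGB.
    cv::Mat img = cv::imread("dog.jpg");
    cv::Mat blob = cv::dnn::blobFromImage(img, 1 / 255.0, cv::Size(416, 416),
                                          cv::Scalar(), true, false);
    net.setInput(blob);

    // Forward pass on the CPU; for YOLOv2 each output row holds
    // x, y, w, h, objectness and per-class scores.
    cv::Mat detections = net.forward();

    // ... keep rows whose confidence exceeds the threshold and draw boxes ...
    return 0;
}
```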


AlexeyAB commented 6 years ago

@Rus-L I added support for INT8 detection on nVidia GPUs with CC >= 6.1 (Pascal and higher, with DP4A) in this repository: https://github.com/AlexeyAB/yolo2_light

Just use a command like this (Linux and Windows respectively):

./darknet detector demo voc.names yolo-voc.cfg yolo-voc.weights -thresh 0.24 test.mp4 -quantized
yolo_gpu.exe detector demo voc.names yolo-voc.cfg yolo-voc.weights -thresh 0.24 test.mp4 -quantized

or:

./darknet detector test voc.names yolo-voc.cfg yolo-voc.weights -thresh 0.24 dog.jpg -quantized
yolo_gpu.exe detector test voc.names yolo-voc.cfg yolo-voc.weights -thresh 0.24 dog.jpg -quantized
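For context on what the -quantized mode relies on: DP4A is an instruction available on CC >= 6.1 GPUs that computes a four-way INT8 dot product accumulated into INT32. A rough C++ emulation of that arithmetic (illustrative only, not the actual yolo2_light CUDA kernels):

```cpp
#include <cstdint>

// What one DP4A instruction does on CC >= 6.1 GPUs: multiply four
// packed INT8 pairs and accumulate the products into an INT32 sum.
int32_t dp4a_emulated(const int8_t a[4], const int8_t b[4], int32_t acc) {
    for (int i = 0; i < 4; ++i)
        acc += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
    return acc;
}

// An INT8 convolution accumulates these INT32 partial sums and then
// rescales the result back to FP32 using the layer's quantization scales.
float requantize(int32_t acc, float input_scale, float weight_scale) {
    return acc * input_scale * weight_scale;
}
```

Accumulating in INT32 and rescaling once per output keeps the quantization error bounded while letting the inner loop run entirely on 8-bit values, which is where the speedup comes from.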


If you want to use quantization with your own custom cfg-file, then you should copy the input_callibration params from the corresponding cfg-file:
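As a rough illustration of what those calibration parameters do (an assumption about the scheme, not the yolo2_light source): each value acts as a per-layer scale that maps FP32 activations into the INT8 range before the quantized convolution runs.

```cpp
#include <cstdint>
#include <cmath>
#include <algorithm>
#include <vector>

// Hypothetical helper: map FP32 activations into INT8 using a per-layer
// calibration multiplier taken from the cfg file. The name and exact usage
// are illustrative, not the yolo2_light implementation.
std::vector<int8_t> quantize_input(const std::vector<float>& x, float input_calibration) {
    std::vector<int8_t> q(x.size());
    for (size_t i = 0; i < x.size(); ++i) {
        float v = x[i] * input_calibration;          // scale into INT8 range
        v = std::max(-127.f, std::min(127.f, v));    // clamp to avoid overflow
        q[i] = static_cast<int8_t>(std::lround(v));
    }
    return q;
}
```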