Rus-L opened this issue 6 years ago
This extension to the Darknet framework aims at providing low-precision inference for Darknet models. Currently the underlying arithmetic is still in floating point. However, the model file can be stored in 8-bit or 16-bit format, which reduces the model size by a factor of 4 or 2, respectively.
It doesn't speed up detection because the arithmetic is still in FP32, but it decreases accuracy, as you can see in the charts: https://github.com/gplhegde/darknet/tree/master/extra
It will only reduce the size of the yolov2.weights file from 194 MB to 49 MB; it will decrease accuracy, but it won't decrease RAM usage and won't increase speed.
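The ~4x on-disk reduction comes directly from storing each weight in one byte instead of four, while inference still dequantizes back to FP32. A minimal pure-Python sketch of this idea (a hypothetical linear scale scheme, not the exact format this extension uses):

```python
import array
import random

# Hypothetical linear quantization: map FP32 weights to signed 8-bit.
weights = [random.uniform(-1.0, 1.0) for _ in range(1000)]
scale = max(abs(w) for w in weights) / 127.0

q = array.array('b', [round(w / scale) for w in weights])  # 1 byte per weight
f = array.array('f', weights)                              # 4 bytes per weight

ratio = len(f.tobytes()) // len(q.tobytes())
print(ratio)  # 4 -- same factor as 194 MB -> ~49 MB

# Dequantize for FP32 inference: arithmetic stays in float,
# so there is no speedup, only a storage saving plus rounding error.
deq = [qi * scale for qi in q]
max_err = max(abs(w - d) for w, d in zip(weights, deq))
print(max_err <= scale / 2 + 1e-9)  # True: error bounded by half a step
```

The rounding error bounded by half a quantization step is exactly the accuracy loss the charts above measure.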
Because the Raspberry Pi 3 GPU doesn't fully support OpenCL (as far as I know), it will be much faster to run Yolo-v2 / Tiny-yolo-v2 on the CPU via the support I added to the OpenCV 3.4.0 dnn module - examples:
I already added FP16 calculation for the nVidia Volta GPU with Tensor Cores: https://github.com/AlexeyAB/darknet/issues/407#issuecomment-381605047
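FP16 halves storage and memory bandwidth at the cost of a 10-bit mantissa; Tensor Cores exploit this by multiplying in FP16 and accumulating in FP32. A small stdlib-only illustration of the storage size and the rounding error (using Python's `struct` half-precision format, just to show the trade-off, not the CUDA code path):

```python
import struct

x = 0.1
half = struct.pack('e', x)           # 'e' = IEEE 754 half precision, 2 bytes
back = struct.unpack('e', half)[0]   # round-trip through FP16

print(len(half))             # 2 (vs 4 bytes for FP32)
print(abs(back - x) < 1e-3)  # True: small error from the 10-bit mantissa
```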
I will add INT8 quantization and INT16/32 forward inference for detection here: https://github.com/AlexeyAB/darknet/issues/726#issuecomment-385651416
@Rus-L I added support for INT8 detection on nVidia GPUs with CC >= 6.1 (Pascal and higher, with DP4A) in this repository: https://github.com/AlexeyAB/yolo2_light
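DP4A is a Pascal hardware instruction that computes a 4-way INT8 dot product and accumulates into a 32-bit integer, which is why the quantized path can actually be faster than FP32 on these GPUs. A pure-Python sketch of what one DP4A step computes (`dp4a` here is a hypothetical helper mimicking the CUDA `__dp4a` intrinsic, not real device code):

```python
def dp4a(a, b, c):
    """Simulate one DP4A step: dot product of two int8 4-vectors,
    accumulated into the 32-bit integer c."""
    assert len(a) == len(b) == 4
    return c + sum(x * y for x, y in zip(a, b))

# Accumulating in int32 avoids overflow: even the worst case per step,
# 4 * 127 * 127 = 64516, fits easily in 32 bits.
acc = dp4a([1, -2, 3, 4], [5, 6, -7, 8], 0)
print(acc)  # 5 - 12 - 21 + 32 = 4
```

A convolution's inner loop then becomes a chain of such steps, with the int32 result rescaled back to float once per output element.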
Just use one of these commands (Linux or Windows build, respectively):
./darknet detector demo voc.names yolo-voc.cfg yolo-voc.weights -thresh 0.24 test.mp4 -quantized
yolo_gpu.exe detector demo voc.names yolo-voc.cfg yolo-voc.weights -thresh 0.24 test.mp4 -quantized
or
./darknet detector test voc.names yolo-voc.cfg yolo-voc.weights -thresh 0.24 dog.jpg -quantized
yolo_gpu.exe detector test voc.names yolo-voc.cfg yolo-voc.weights -thresh 0.24 dog.jpg -quantized
If you want to use quantization with your custom cfg-file, then you should copy the input_calibration params from the corresponding cfg-file:
yolo-voc.cfg
https://github.com/AlexeyAB/yolo2_light/blob/781983eb4186d83e473c570818e17b0110a309da/bin/yolo-voc.cfg#L17
tiny-yolo-voc.cfg
https://github.com/AlexeyAB/yolo2_light/blob/781983eb4186d83e473c570818e17b0110a309da/bin/tiny-yolo-voc.cfg#L16
yolov3-tiny.cfg
https://github.com/AlexeyAB/yolo2_light/blob/781983eb4186d83e473c570818e17b0110a309da/bin/yolov3-tiny.cfg#L25
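The calibration line lives in the [net] section of the cfg and is a comma-separated list of per-layer scale values. A sketch of its shape (the numbers below are placeholders; copy the actual values from the matching reference cfg linked above):

```ini
[net]
# placeholder values -- take the real ones from the
# corresponding cfg-file in the yolo2_light repository
input_calibration = 16.00, 8.42, 12.77, 7.55
```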
This fork contains interesting results of model conversion: https://github.com/gplhegde/darknet/tree/master/extra Can we expect a similar implementation in your fork? I think it would be very helpful for running YOLOv3 on the Raspberry Pi.