AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Status of yolo2_light #1417

Open tlind opened 6 years ago

tlind commented 6 years ago

Asking here because issue tracking is disabled in the other project: I'd like to use yolo2_light for a robotics research project because it's more light-weight, and I am wondering if its inference performance is on par with this fork of darknet (using yolov3.cfg). Also, what's the license?

AlexeyAB commented 6 years ago

About https://github.com/AlexeyAB/yolo2_light: it looks like it is under the MIT license: https://github.com/AlexeyAB/yolo2_light/blob/master/LICENSE

I am wondering if its inference performance is on par with this fork of darknet (using yolov3.cfg).

In general yes, except:

Also:

yolo2_light - I just added XNOR-net (weights, inputs and calculations: 1-bit instead of 32-bit float) on CPU, in the same way as in the current repository https://github.com/AlexeyAB/darknet. It gives about 4x acceleration on CPU with AVX2 (will be improved), but about -30% precision (mAP). The model should be trained using a cfg-file like this one: https://github.com/AlexeyAB/yolo2_light/blob/master/bin/tiny-yolo-obj_xnor.cfg
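The 1-bit trick above can be sketched in a few lines — this is an illustration of the XNOR-net idea (binarized values packed into integer words, so the float multiply-accumulate becomes XNOR + popcount), not darknet's actual AVX2 code:

```python
# Minimal sketch of an XNOR-net dot product (illustration only, not
# darknet's actual SIMD implementation).
# Values are binarized to {-1, +1}; +1 is stored as bit 1, -1 as bit 0.
# Then dot(a, b) = 2 * popcount(XNOR(a, b)) - n over n packed bits,
# because XNOR counts the positions where the signs agree.

def pack_bits(values):
    """Pack a list of +/-1 values into a single int (bit i = 1 if +1)."""
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word

def xnor_dot(a_bits, b_bits, n):
    """Dot product of two n-long {-1,+1} vectors packed as ints."""
    mask = (1 << n) - 1
    agree = bin(~(a_bits ^ b_bits) & mask).count("1")  # popcount of XNOR
    return 2 * agree - n  # agreements contribute +1, disagreements -1

a = [+1, -1, +1, +1, -1, -1, +1, -1]
b = [+1, +1, -1, +1, -1, +1, +1, -1]
n = len(a)
# Same result as the float dot product, but one XOR + one popcount
# replaces n multiply-adds - this is where the CPU speedup comes from.
assert xnor_dot(pack_bits(a), pack_bits(b), n) == sum(x * y for x, y in zip(a, b))
```

In the real implementation the packed words are 256-bit AVX2 registers and popcount is a hardware instruction, which is why the speedup scales with register width.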

yolo2_light supports Yolo v2 and v3 models.

Also there is a difference in commands - specify a names-file instead of a data-file:


For quantization (calculations on INT8 instead of FLOAT32): ~+30% speedup and about -1% mAP: https://github.com/AlexeyAB/yolo2_light
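The INT8 path works by mapping float values to 8-bit integers with a per-tensor scale. A minimal sketch of symmetric quantization — yolo2_light's actual calibration scheme may differ, and the weight values here are made up:

```python
# Minimal sketch of symmetric per-tensor INT8 quantization
# (illustration only; yolo2_light's actual calibration may differ).

def quantize(values, scale):
    """Map floats to the int8 range [-127, 127] using a per-tensor scale."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def dequantize(qvalues, scale):
    """Recover approximate float values from int8 codes."""
    return [q * scale for q in qvalues]

weights = [0.5, -1.2, 0.03, 0.87]
scale = max(abs(w) for w in weights) / 127.0  # symmetric scale from the max
q = quantize(weights, scale)
restored = dequantize(q, scale)

# The round trip is close but not exact - this small rounding error
# is the source of the ~1% mAP drop mentioned above.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
assert max_err <= scale / 2 + 1e-9
```

Integer multiply-accumulate on int8 is what buys the ~30% speedup on CPUs with wide SIMD integer units.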

mathieuorhan commented 6 years ago

Very interested in this project too, as I need to run my detector on a CPU / small GPU (for autonomous driving). In the original XNOR-Net paper, the authors achieve a huge speed increase for a small decrease in accuracy. Do you think there is much room for improvement in the current yolo2_light, for both 8-bit and 1-bit quantization?

AlexeyAB commented 6 years ago

@mathieuorhan For further optimizations:


Also:


The general direction for achieving optimal accuracy+speed of a CNN is:

So it is better to use 320 layers with 1-bit weights instead of 10 layers with 32-bit float weights: accuracy and speed will be higher, with the same model size.
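The size claim above is simple arithmetic. A sketch with a placeholder per-layer weight count (the 320 vs 10 layer counts come from the comment; the per-layer size is made up for illustration):

```python
# Back-of-the-envelope model-size comparison for the claim above.
# weights_per_layer is a placeholder, not a real network's layer size.
weights_per_layer = 1_000_000

float32_layers = 10
bit1_layers = 320

float32_bits = float32_layers * weights_per_layer * 32  # 32 bits per weight
bit1_bits = bit1_layers * weights_per_layer * 1         # 1 bit per weight

# 320 layers * 1 bit == 10 layers * 32 bits: identical storage,
# but 32x more layers of representational depth.
assert float32_bits == bit1_bits
```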

Also, it looks like Yann LeCun's statement is no longer true for modern networks with many layers and connections, since he wrote it about older networks with a low number of layers.

But Yann LeCun considers that complex tasks need a minimum of 8 bits: "But to get good results on a task like ImageNet you need about 8 bit of precision on the neuron states.": https://www.facebook.com/yann.lecun/posts/10152184295832143

There was also some discussion of optimal 4-bit weights and linear initialization: https://github.com/AlexeyAB/darknet/issues/138

mathieuorhan commented 6 years ago

@AlexeyAB Thank you for the insights, this is very interesting. I'm looking forward to progress and new tests. In the next few weeks I'm going to test different settings with yolo2_light and post feedback.

tlind commented 5 years ago

Great, thanks a lot for the clarification!