AlexeyAB / yolo2_light

Light version of convolutional neural network Yolo v3 & v2 for objects detection with a minimum of dependencies (INT8-inference, BIT1-XNOR-inference)
MIT License

YOLOv2: Weight quantization per-layer vs per-channel #82

Open kojiwoow opened 4 years ago

kojiwoow commented 4 years ago

Has anyone tested per-channel quantization on YOLOv2?

The report http://cs231n.stanford.edu/reports/2017/pdfs/808.pdf shows a more than 15% mAP drop on the YOLOv2 PASCAL VOC 2007 test set when per-layer INT8 quantization is applied to all convolutional layers.

The drop was significant, so we decided to use per-channel weight quantization, as described in https://arxiv.org/pdf/1806.08342.pdf#page=30&zoom=100,0,621
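For reference, the difference between the two schemes can be sketched as follows (a minimal sketch in C, assuming symmetric INT8 quantization; function and variable names are illustrative, not from this repo):

```c
#include <math.h>

/* Per-layer: a single scale for the whole weight tensor,
 * derived from the max absolute value over all weights. */
float per_layer_scale(const float *w, int n) {
    float maxabs = 0.f;
    for (int i = 0; i < n; ++i) {
        float a = fabsf(w[i]);
        if (a > maxabs) maxabs = a;
    }
    return maxabs / 127.f;  /* symmetric INT8 range [-127, 127] */
}

/* Per-channel: one scale per output channel (filter), so an outlier
 * in one filter no longer inflates the scale of every other filter.
 * w is laid out as [channels][per_channel]; scales holds 'channels' floats. */
void per_channel_scales(const float *w, int channels, int per_channel,
                        float *scales) {
    for (int c = 0; c < channels; ++c) {
        scales[c] = per_layer_scale(w + c * per_channel, per_channel);
    }
}
```

The per-channel variant is what the second paper reports gains from on classification networks.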

For the experiment, we stored the calculated scale factor `l->input_quant_multipler` for each channel, and then de-quantized the output activation values with those per-channel scales.

However, compared to per-layer quantization, the gain was only 0.1% mAP.

I wonder whether our experiment was flawed, or whether others have seen similar results, since the second paper I posted shows clearly better results for per-channel than for per-layer quantization on classification networks (ResNet, MobileNet).

The test environment:

Thank you.

gr-rahimi commented 4 years ago

Hi @kojiwoow,

Were you able to find the reason for the low improvement with per-channel quantization? Did you find any other quantization technique that helps improve the accuracy?

I would appreciate it if you could share some of your experience.

@AlexeyAB, do you have any suggestions for improving the accuracy that we could try? I see that in YOLOv3 you skip quantizing some layers (such as the first layer). When we quantized them, we saw a huge accuracy drop. Do you have any suggestions for compensating for that accuracy loss?
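For concreteness, the mixed-precision approach mentioned above (keeping quantization-sensitive layers such as the first conv in FP32) could be sketched as a per-layer policy like this; the function and its policy are hypothetical, not code from this repo:

```c
/* Hypothetical mixed-precision policy: decide per layer whether to
 * run INT8 or fall back to FP32. Here only the first conv layer is
 * kept in FP32, mirroring the observation that quantizing it causes
 * a large accuracy drop; a real policy might exclude more layers. */
int should_quantize(int layer_index) {
    return layer_index != 0;  /* layer 0 stays FP32 */
}
```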

Thanks.