anson0910 / CNN_face_detection

Implementation based on the paper Li et al., “A Convolutional Neural Network Cascade for Face Detection,” CVPR 2015

Question about calibration data #19

Closed: sparrow0629 closed this issue 8 years ago

sparrow0629 commented 8 years ago

Hi. In your code, when you prepare calibration data, you just use the formula from the paper. For example, when the label is 0 (s=0.83, x=-0.17, y=-0.17), the window moves right and down. Now suppose that at detection time the current window sits a little right and down of the correct window, and the calibration net outputs label 0. Obviously you cannot apply the same parameters 0.83, -0.17, -0.17, because the window would move even further right and down.

anson0910 commented 8 years ago

Hi, label 0 corresponds to s=0.83, x=-0.17, y=-0.17, but as seen in calibration_AFLW.py lines 72-79, the calibration is applied as an inverse function when preparing the calibration data.

You can see this in Section 3.4.1 of the original paper: "Specifically, for the n-th pattern [s_n, x_n, y_n], we apply [1/s_n, −x_n, −y_n] (following Equation 1) to adjust the face annotation bounding box, crop and resize into proper input sizes (12×12, 24×24 and 48×48)."

sparrow0629 commented 8 years ago

Thanks, I found that inverse function. For example, take a face whose rect is at x:100, y:100 with size 200. Preparing the calibration data according to the formula, with parameters s=1/0.83, x=0.17, y=0.17, gives a rect at x:72, y:72 with size 166. If you then detect this face at (72,72) with size 166, run the calibration net, and apply s=0.83, x=-0.17, y=-0.17, the calibrated position is x:106, y:106 with size 200, not the original (100,100). Is that a bug?
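The numbers above can be reproduced with a minimal sketch. It assumes Equation 1 of the paper maps a window (x, y, w, h) to (x - x_n*w/s_n, y - y_n*h/s_n, w/s_n, h/s_n), which is consistent with the values in this comment. The helper name `apply_pattern` is hypothetical and does not come from the repository.

```python
def apply_pattern(box, pattern):
    """Adjust a square window (x, y, size) by a pattern (s_n, x_n, y_n).

    Assumed form of Equation 1:
    (x - x_n*size/s_n, y - y_n*size/s_n, size/s_n).
    """
    x, y, size = box
    s_n, x_n, y_n = pattern
    return (x - x_n * size / s_n, y - y_n * size / s_n, size / s_n)


face = (100.0, 100.0, 200.0)   # annotated face: x, y, size
label0 = (0.83, -0.17, -0.17)  # pattern for label 0

# Data preparation with the pattern quoted from the paper: [1/s_n, -x_n, -y_n].
naive_inverse = (1 / label0[0], -label0[1], -label0[2])
shifted = apply_pattern(face, naive_inverse)

# Detection time: the calibration net predicts label 0 for the shifted window.
recovered = apply_pattern(shifted, label0)

print([round(v, 2) for v in shifted])
print([round(v, 2) for v in recovered])
```

Without intermediate rounding, the shifted window lands near x = 71.78 with size 166, and the "recovered" window near x = 105.78 with size 200, so the round trip misses the annotated position by about 6 pixels.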

anson0910 commented 8 years ago

Oh, sorry. That is indeed a bug. It should be as follows (I have already committed the fix):

s_n = 1 / cur_scale
x_n = -cur_x / cur_scale
y_n = -cur_y / cur_scale

I just copied that from the paper without verifying it. Many thanks for pointing it out!
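A quick round-trip check of the corrected inverse, as a sketch only. It assumes Equation 1 has the form (x - x_n*w/s_n, y - y_n*h/s_n, w/s_n, h/s_n) and that the pattern grid is the paper's 5 scales by 3 x-offsets by 3 y-offsets. The helpers `apply_pattern` and `inverse_pattern` are hypothetical names, not functions from calibration_AFLW.py.

```python
from itertools import product


def apply_pattern(box, pattern):
    """Adjust a square window (x, y, size) by a pattern (s_n, x_n, y_n)."""
    x, y, size = box
    s_n, x_n, y_n = pattern
    return (x - x_n * size / s_n, y - y_n * size / s_n, size / s_n)


def inverse_pattern(pattern):
    """Corrected inverse: [1/s, -x/s, -y/s]."""
    s, x, y = pattern
    return (1 / s, -x / s, -y / s)


face = (100.0, 100.0, 200.0)  # annotated face: x, y, size

# Assumed pattern grid: 5 scales x 3 x-offsets x 3 y-offsets = 45 patterns.
scales = (0.83, 0.91, 1.0, 1.10, 1.21)
offsets = (-0.17, 0.0, 0.17)

max_error = 0.0
for pattern in product(scales, offsets, offsets):
    # Shift the annotation with the inverse pattern (data preparation) ...
    shifted = apply_pattern(face, inverse_pattern(pattern))
    # ... then calibrate with the pattern itself (detection time).
    recovered = apply_pattern(shifted, pattern)
    max_error = max(max_error, *(abs(r - f) for r, f in zip(recovered, face)))

label0_shifted = apply_pattern(face, inverse_pattern((0.83, -0.17, -0.17)))
print(label0_shifted)
print(max_error)
```

For label 0 the training crop now sits near (66, 66) with size 166, and calibrating it returns the annotated rect up to floating-point error.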

hiyijian commented 8 years ago

Hi @anson0910, I think the current implementation is still not right. Look at x_temp: it is actually equal to x + ((cur_x / cur_scale) * w * cur_scale) = x + cur_x * w, so the new x is independent of the scale factor.

hiyijian commented 8 years ago

I am sorry, the modification is right. I will re-train the calibration nets and report my ROCs. Thank you, guys.
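One way to see why a scale-independent offset is still correct, assuming Equation 1 has the form $(x - x_n w / s_n,\; y - y_n h / s_n,\; w / s_n,\; h / s_n)$. Here $c_s$ and $c_x$ stand for cur_scale and cur_x, $(x', w')$ is the shifted training window, and $(x'', w'')$ is the window after calibration; these symbols are introduced only for this sketch.

```latex
\begin{aligned}
x'  &= x - \left(-\frac{c_x}{c_s}\right) \frac{w}{1/c_s} = x + c_x w, \\
w'  &= \frac{w}{1/c_s} = c_s w, \\
x'' &= x' - c_x \frac{w'}{c_s} = x + c_x w - c_x w = x, \\
w'' &= \frac{w'}{c_s} = w.
\end{aligned}
```

The scale factor cancels in the offset because the shifted window's width already carries it, so calibration recovers the annotated window exactly. The same argument applies to y.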

hiyijian commented 8 years ago

Hi guys. I used 100,000 positive samples and 800,000 background images, and added some noise (blur/salt/lighting) to these training images. I achieved 82% recall at 100 false positives on FDDB. See ROC.

anson0910 commented 8 years ago

Great!