Open tangyudi opened 8 years ago
Hi, TangYudi. I'm Korean, so I don't use QQ or WeChat. If you like, you can send me your questions by e-mail, and I will answer as far as I can.
Hi, thank you. I have a few questions:
1. The paper says: "On a single CPU core, it takes 12-net less than 36 ms to densely scan an image of size 800 × 600 for 40 × 40 faces with 4-pixel spacing, which generates 2,494 detection windows." I do not understand how the figure of 2,494 is calculated.
2. The image pyramid is resized by 12/F to form the input image for the 12-net. What does 12/F mean, and how many scales does the pyramid usually have?
3. I fed a 466 x 699 image to the network. After resize_image it is 139 x 209, and for (out = net_12c_full_conv.blobs['prob'].data[0][1, :, :]) out.shape is 64 x 99. Does each confidence value in the 64 x 99 map give the probability of a face? If so, how can a single point represent a rectangle?
4. I use 10,000 (1W) face images and 10,000 background images without faces to train the 12-net. Is that enough?
I am new to face detection. It is really nice of you to help me. Could you tell me your e-mail address?
As I described for (1), before sliding the windows you need to decimate (downsample) the input image by 12 / F. Unless you decimate, no window is detected at the first pyramid level, so the calculation becomes meaningless. As for the number of pyramid levels, the author describes it on his Q&A page:
http://personal.stevens.edu/~hli18/papers/faq_CVPR2015_CasCNN.html
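To make the 12 / F step concrete, here is a rough Python sketch of the arithmetic. It is not taken from the paper or the repository. It assumes that F is the face size being searched for (40 in the quoted sentence), that the 4-pixel spacing is applied in the resized image, and that no padding is used. The function names are made up for illustration. Under those assumptions the numbers from questions 1 and 3 come out as stated.

```python
NET_INPUT = 12  # the 12-net looks at 12 x 12 windows


def resized_size(width, height, face_size, net_input=NET_INPUT):
    """Size of the image after scaling by net_input / face_size (i.e. 12 / F)."""
    return width * net_input // face_size, height * net_input // face_size


def grid_shape(width, height, stride, net_input=NET_INPUT):
    """Rows and columns of windows when sliding over an already resized image.

    Assumes no padding, so only windows that fit entirely are counted.
    """
    cols = (width - net_input) // stride + 1
    rows = (height - net_input) // stride + 1
    return rows, cols


def window_to_box(row, col, stride, face_size, net_input=NET_INPUT):
    """Map one point of the output map back to (x, y, w, h) in the original image.

    Hypothetical helper: the point (row, col) stands for the 12 x 12 window
    whose top-left corner is (col * stride, row * stride) in the resized image.
    """
    x = col * stride * face_size / net_input
    y = row * stride * face_size / net_input
    return x, y, face_size, face_size


# Question 1: 800 x 600 image, 40 x 40 faces, 4-pixel spacing.
w, h = resized_size(800, 600, face_size=40)   # 240 x 180
rows, cols = grid_shape(w, h, stride=4)       # 43 rows, 58 columns
print(w, h, rows, cols, rows * cols)          # 240 180 43 58 2494

# Question 3: a 139 x 209 resized image with an effective stride of 2.
print(grid_shape(209, 139, stride=2))         # (64, 99)
```

The second call suggests why one point can stand for a rectangle: each value in the 64 x 99 map would be the score of one 12 x 12 window in the resized image, and `window_to_box` scales that window back to the original image. The stride of 2 is inferred from the shapes reported in question 3, so treat it as an assumption about this particular network definition.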
Are you Chinese? I am also studying this paper now. Do you have QQ or WeChat? I would like to ask some questions about the code and the paper.