sfzhang15 / SFD

S³FD: Single Shot Scale-invariant Face Detector, ICCV, 2017

'max_im_shrink' in wider face test #73

fengyuentau closed this issue 5 years ago

fengyuentau commented 5 years ago

I read the WIDER FACE test code in wider_test.py, where max_im_shrink is calculated as:

max_im_shrink = (0x7fffffff / 577.0 / (image.shape[0] * image.shape[1])) ** 0.5 # the max size of input image for caffe

I am confused by the number 577.0. From the comment on the line above, https://github.com/sfzhang15/SFD/issues/21#issue-306633279 and https://github.com/BVLC/caffe/issues/3084#issue-107192046, I gather that it is related to a limit in Caffe. However, none of these explain why 577.0 in particular is used. Could you explain where 577.0 comes from?
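To make the question concrete, here is a minimal sketch of what the formula computes. It reads 0x7fffffff as the largest signed 32-bit integer and treats 577.0 as an opaque per-pixel factor; the function and parameter names (`max_im_shrink` as a function, `per_pixel_factor`, `scaled_cost`) are my own and do not appear in wider_test.py.

```python
INT32_MAX = 0x7fffffff  # 2147483647, the largest signed 32-bit integer


def max_im_shrink(height, width, per_pixel_factor=577.0):
    """Largest linear scale s such that
    per_pixel_factor * (s * height) * (s * width) <= INT32_MAX.

    per_pixel_factor is a hypothetical name for the 577.0 constant.
    """
    return (INT32_MAX / per_pixel_factor / (height * width)) ** 0.5


def scaled_cost(height, width, shrink, per_pixel_factor=577.0):
    """Quantity that the formula keeps at or below INT32_MAX."""
    return per_pixel_factor * (shrink * height) * (shrink * width)


if __name__ == "__main__":
    h, w = 768, 1024  # example image size, chosen only for illustration
    s = max_im_shrink(h, w)
    print("max_im_shrink:", s)
    print("cost at that scale:", scaled_cost(h, w, s))
    print("INT32_MAX:", INT32_MAX)
```

So the square root converts a budget on the pixel count into a budget on the linear scale factor, and a smaller factor in place of 577.0 would allow a larger scale. What this sketch cannot show is why the per-pixel factor is 577.0.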

I also found that the same limit is used in the evaluation code of recent face detection models, even when the framework is not Caffe:

  1. https://github.com/TencentYoutuResearch/FaceDetection-DSFD/blob/0ac2b78ebed58ede3239b55a8634e71ff4203770/widerface_val.py#L282 (200.0 is used instead of 577.0)
  2. https://github.com/PaddlePaddle/models/blob/554d8864eb0ec9416c06357a7bca1789fc750f78/PaddleCV/face_detection/widerface_eval.py#L280
  3. https://github.com/liuwei16/CSP/blob/785bc4c5f956860116d8d51754fd76202afe4bcb/test_wider_ms.py#L144

sfzhang15 commented 5 years ago

@fengyuentau The value 577.0 was determined experimentally in Caffe; other frameworks may not have the same limitation. When using PyTorch, we use another limit, determined by the available GPU memory.

fengyuentau commented 5 years ago

I thought the number might be related to the model structure. Anyway, thanks for your answer.