ROI_XFORM_SAMPLING_RATIO is the number of points sampled inside each bin, and by default it is 2 (so 2x2 points sampled per bin). ROI_XFORM_SAMPLING_RATIO=0 means an adaptive ratio is used; see: https://github.com/caffe2/caffe2/blob/master/caffe2/operators/roi_align_op.cu#L110
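To make the adaptive case concrete, here is a minimal sketch of how the per-bin sample count is chosen along one axis, based on my reading of the linked kernel. The function name `samples_per_bin` and its arguments are hypothetical and only for illustration; the RoI size is assumed to be measured in feature-map units.

```python
import math

def samples_per_bin(roi_size, pooled_size, sampling_ratio):
    """Number of sample points per bin along one axis (illustrative only)."""
    if sampling_ratio > 0:
        # Fixed ratio: e.g. 2 gives a 2x2 grid of points in every bin.
        return sampling_ratio
    # Adaptive ratio: roughly one sample per feature-map cell covered by a bin.
    return math.ceil(roi_size / pooled_size)

fixed = samples_per_bin(50.0, 14, 2)      # fixed ratio, independent of RoI size
adaptive = samples_per_bin(50.0, 14, 0)   # adaptive ratio, grows with RoI size
small = samples_per_bin(10.0, 14, 0)      # small RoIs still get one sample
```

With a ratio of 0, larger RoIs get more sample points per bin, while a positive ratio keeps the count constant.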
Regarding R-50-C4, it uses the original MSRA ResNet-50 model (and weights), where the stride=2 op in a residual block is placed in the first 1x1 layer (instead of the 3x3 layer, as in most later work). So the 14x14 RoIAlign output is subsampled to 7x7 almost immediately by the following residual block. This is similar to using a 7x7 RoIAlign and setting the stride of the following residual block to 1.
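The size arithmetic behind this can be checked with the standard convolution output-size formula. This is only a sketch: `conv_out_size` is a hypothetical helper, and the layer list is a simplified view of the first two convolutions in the residual block (1x1, then 3x3 with padding 1), not the actual Detectron code.

```python
def conv_out_size(size, kernel, stride, pad=0):
    """Spatial output size of a convolution along one axis."""
    return (size + 2 * pad - kernel) // stride + 1

# R-50-C4 as configured: 14x14 RoIAlign, stride 2 in the first 1x1 conv.
a = conv_out_size(14, kernel=1, stride=2)        # 1x1 conv with stride 2
a = conv_out_size(a, kernel=3, stride=1, pad=1)  # 3x3 conv keeps the size

# Alternative described above: 7x7 RoIAlign, block stride set to 1.
b = conv_out_size(7, kernel=1, stride=1)
b = conv_out_size(b, kernel=3, stride=1, pad=1)
```

Both paths end at the same 7x7 resolution. The difference is that the 14x14 variant samples the features more densely before the strided 1x1 layer discards every other position.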
What's the meaning of C4 here?
Thanks for the great code. I found a few minor issues in the configuration file for R-50-C4.
After reading the config files, I found R-50-C4 was trained using most of the Fast RCNN default settings (https://github.com/facebookresearch/Detectron/blob/master/configs/12_2017_baselines/e2e_faster_rcnn_R-50-C4_1x.yaml#L17).
However, the default configuration sets FAST_RCNN.ROI_XFORM_SAMPLING_RATIO to 0 (https://github.com/facebookresearch/Detectron/blob/master/lib/core/config.py#L648). I guess it might be a mistake?
The spatial resolution after RoIAlign is 7x7 in the Mask R-CNN paper, but the R-50-C4 config file sets it to 14x14. Is this another mistake?