Closed rxqy closed 6 years ago
@rxqy Using dilated conv can enlarge the receptive field, that's true. But in this code, FPN_FEAT_STRIDES
is only used to generate the anchors on the feature map; see line 72 of proposal_layer_fpn.py. FPN_FEAT_STRIDES
just indicates the scale (not the receptive field) between the original image and the feature map. Hope this helps.
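A rough sketch of what "used to generate the anchors" means here. This is not the code from proposal_layer_fpn.py; `anchor_centers` and the 64x64 image size are made up for illustration. The only point is that the stride maps a feature-map cell back to a pixel position in the input image:

```python
def anchor_centers(feat_h, feat_w, feat_stride):
    """Return the (x, y) image position of every feature-map cell, row by row.

    Hypothetical helper: the stride is used purely as a coordinate scale
    between the feature map and the input image.
    """
    return [
        (x * feat_stride, y * feat_stride)
        for y in range(feat_h)
        for x in range(feat_w)
    ]


# Assumed example: a 64x64 image gives a 4x4 feature map at stride 16.
centers = anchor_centers(4, 4, 16)
print(centers[0], centers[1], centers[-1])
```

If a level were labelled with stride 32 while its feature map is really 16x smaller than the image, the anchors would be spread over twice the image extent and would no longer line up with the features they are scored from.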
@guoruoqian Thx! Problem solved.
In your implementation, the feat stride is set to [4, 8, 16, 16, 16] in line 187 of trainval.py:
'FPN_FEAT_STRIDES', '[4, 8, 16, 16, 16]'
It worked, and it's indeed the case in the original paper, because the feature maps are downsampled 4x, 8x, 16x, 16x, 16x (only dilated, no pooling). However, from my understanding, as we are using dilated conv, the receptive fields of the 3rd, 4th and 5th layers are enlarged, so I think the stride should be set to
[4, 8, 16, 32, 64]
i.e., one point in the 4th layer represents 2x more information than the 3rd layer (so 32x?). Maybe I'm not understanding dilated conv in the right way? Correct me if I'm wrong.
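For anyone comparing the two quantities, here is a toy calculation using the standard receptive-field recurrence. The layer stack is an assumption chosen only to reach stride 16, not the real backbone, and `stride_and_receptive_field` is a hypothetical helper:

```python
def stride_and_receptive_field(layers):
    """layers: list of (kernel, stride, dilation) tuples, applied in order.

    Returns (cumulative stride, receptive field), both in input pixels.
    """
    jump, rf = 1, 1
    for kernel, stride, dilation in layers:
        effective_kernel = dilation * (kernel - 1) + 1
        rf += (effective_kernel - 1) * jump  # dilation widens the field
        jump *= stride                       # only the stride moves the grid
    return jump, rf


# Toy stack (assumption): four stride-2 3x3 convs give a stride of 16.
base = [(3, 2, 1)] * 4
plain = base + [(3, 1, 1)]    # extra 3x3 conv, no dilation
dilated = base + [(3, 1, 2)]  # extra 3x3 conv, dilation 2
strided = base + [(3, 2, 1)]  # extra 3x3 conv, stride 2

for name, stack in [("plain", plain), ("dilated", dilated), ("strided", strided)]:
    print(name, stride_and_receptive_field(stack))
```

In this toy stack the dilated layer sees a wider patch than the plain one, but both keep a cumulative stride of 16; only the strided layer moves it to 32. Since anchor placement depends on the stride, that is consistent with keeping [4, 8, 16, 16, 16].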