knorth55 / chainer-light-head-rcnn

[This project has moved to ChainerCV] Chainer Implementation of Light Head RCNN
MIT License

about the spatial_scale=1/16 in psroi_align #8

Closed niujiaopeng closed 5 years ago

niujiaopeng commented 5 years ago

Suppose I have an image that is scaled to h=800, w=1200 after preprocessing. We use feature_stride = 16 because the RPN feature map is 50x75. But why does psroi_align still use spatial_scale=1/16? I think it should be 1/32: a proposal's (x1, y1, x2, y2) is located in the 800x1200 image, so if we want to crop from the thinner feature map whose size is 25x38, I think the coordinates should be divided by 32. But I found you use 16. Could you tell me why, please?

knorth55 commented 5 years ago

I don't get your point. Can you explain in more detail?

niujiaopeng commented 5 years ago

Thank you for your reply. Here is your code in psroi_max_align_2d:

```
roi_start_h = bottom_rois[n, 0] * spatial_scale
roi_start_w = bottom_rois[n, 1] * spatial_scale
roi_end_h = bottom_rois[n, 2] * spatial_scale
roi_end_w = bottom_rois[n, 3] * spatial_scale
```

Here spatial_scale is 1/16. If an image is 800x1200, block3 (the RPN feature) of ResNet101 is 50x75 (feature stride 16), and block4 (the stage-2 feature) of ResNet101 is 25x38 (feature stride 32). So if we want to crop the feature in the 25x38 layer, I think the spatial_scale should be 1/32? Looking forward to your reply.
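To make the arithmetic in question concrete, here is a small illustrative sketch (plain Python, not the project's CUDA kernel) of how ROI corners are mapped from image coordinates onto a feature map by spatial_scale; the proposal coordinates are made up, and the image size follows the 800x1200 example above.

```python
def to_feature_coords(roi, spatial_scale):
    """Map ROI corners (y1, x1, y2, x2) from image space to feature-map space."""
    return [c * spatial_scale for c in roi]

# A hypothetical proposal in 800x1200 input-image coordinates.
roi = [160.0, 320.0, 480.0, 960.0]

# With spatial_scale = 1/16 the ROI lands on a 50x75 feature map.
print(to_feature_coords(roi, 1.0 / 16.0))   # [10.0, 20.0, 30.0, 60.0]

# With spatial_scale = 1/32 it would land on a 25x38 map instead.
print(to_feature_coords(roi, 1.0 / 32.0))   # [5.0, 10.0, 15.0, 30.0]
```

So the choice of spatial_scale encodes the total stride of the feature map that the ROI pooling actually reads from, which is exactly what the question hinges on.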

knorth55 commented 5 years ago

That is because we use a dilated ResNet (not the original ResNet) for block4, so block4 keeps the stride-16 resolution. This is the same as the original implementation: https://github.com/zengarden/light_head_rcnn/blob/master/experiments/lizeming/light_head_rcnn.ori_res101.coco.ps_roialign/network_desp.py#L197-L199
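A minimal sketch (plain Python, illustrative only) of why the dilated block4 keeps spatial_scale at 1/16: the original ResNet applies another stride-2 downsampling in block4, giving a total stride of 32, while the dilated variant replaces that stride with dilation, so the total stride stays 16.

```python
def total_stride(block_strides):
    """Cumulative downsampling factor after a sequence of stages."""
    s = 1
    for stride in block_strides:
        s *= stride
    return s

# Original ResNet101: stem (conv1 + pool, stride 4), then blocks 1-4
# with strides 1, 2, 2, 2 -> total stride 32 at block4.
original = total_stride([4, 1, 2, 2, 2])   # 32

# Dilated variant: block4 uses stride 1 (with dilation) instead of
# stride 2 -> total stride stays 16, same as block3.
dilated = total_stride([4, 1, 2, 2, 1])    # 16

h, w = 800, 1200
print(h // dilated, w // dilated)   # 50 75 -> matches spatial_scale = 1/16
```

With the dilated block4 the stage-2 feature is 50x75 rather than 25x38, so dividing the proposal coordinates by 16 is correct.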

niujiaopeng commented 5 years ago

So the implementation is like DetNet, which also uses dilated convolutions in its last block to keep the resolution of the feature map. And about the pretrained weights: do you just use the pretrained weights of the original ResNet, or do you construct a new net like this and train it on ImageNet?

knorth55 commented 5 years ago

For the pretrained weights, we use the original ResNet weights, and we update them during training.
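This works because dilation changes only where a 3x3 kernel samples, not its parameter shape, so the original ImageNet weights drop straight into the dilated block4. A small sketch (plain Python, with illustrative channel counts) of that shape argument:

```python
def conv_weight_shape(out_ch, in_ch, ksize):
    """Weight tensor shape of a 2D conv: dilation does not appear here."""
    return (out_ch, in_ch, ksize, ksize)

def effective_ksize(ksize, dilate):
    """Spatial extent the kernel covers once dilation is applied."""
    return ksize + (ksize - 1) * (dilate - 1)

# A 3x3 conv in block4 (e.g. 512 -> 512 channels in a ResNet101 bottleneck):
plain   = conv_weight_shape(512, 512, 3)   # standard conv, dilate=1
dilated = conv_weight_shape(512, 512, 3)   # dilated conv, dilate=2

# Identical shapes -> the original pretrained tensor loads unchanged.
assert plain == dilated

print(effective_ksize(3, 2))   # 5: wider receptive field, same 9 weights
```

So the dilated network can start from the standard ImageNet checkpoint and simply fine-tune those weights during detection training, as described above.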