Hi, I have a question about something that confuses me. I am using the resnet50-fpn backbone.

The image is scaled before it is sent to the network, but I found that the bounding box coordinates are not scaled. Why? And what about the segmentation coordinates?

In loader.py the call path is RoiDataLoader -> get_minibatch -> add_rpn_blobs. The relevant code is:

```
def get_minibatch(roidb):
    """Given a roidb, construct a minibatch sampled from it."""
    # We collect blobs from each image onto a list and then concat them into a
    # single tensor, hence we initialize each blob to an empty list
    blobs = {k: [] for k in get_minibatch_blob_names()}

    # Get the input image blob
    im_blob, im_scales = _get_image_blob(roidb)
    blobs['data'] = im_blob
    print('in get minibatch..blobs_data', blobs['data'])

    if cfg.RPN.RPN_ON:
        # RPN-only or end-to-end Faster/Mask R-CNN
        valid = roi_data.rpn.add_rpn_blobs(blobs, im_scales, roidb)
    elif cfg.RETINANET.RETINANET_ON:
        raise NotImplementedError
    else:
        # Fast R-CNN like models trained on precomputed proposals
        valid = roi_data.fast_rcnn.add_fast_rcnn_blobs(blobs, im_scales, roidb)
    print('blobs in get minibatch', blobs)
    return blobs, valid


def add_rpn_blobs(blobs, im_scales, roidb):
    """Add blobs needed training RPN-only and end-to-end Faster R-CNN models."""
    if cfg.FPN.FPN_ON and cfg.FPN.MULTILEVEL_RPN:
        # RPN applied to many feature levels, as in the FPN paper
        k_max = cfg.FPN.RPN_MAX_LEVEL  # 6
        k_min = cfg.FPN.RPN_MIN_LEVEL  # 2
        foas = []
        for lvl in range(k_min, k_max + 1):  # 2 3 4 5 6
            field_stride = 2.**lvl  # 4 8 16 32 64
            anchor_sizes = (cfg.FPN.RPN_ANCHOR_START_SIZE * 2.**(lvl - k_min), )  # 32 64 128 256 512
            anchor_aspect_ratios = cfg.FPN.RPN_ASPECT_RATIOS  # 0.5 1 2
            print('in add rpn blobs..')
            foa = data_utils.get_field_of_anchors(
                field_stride, anchor_sizes, anchor_aspect_ratios
            )
            # print('foa:',foa)
            foas.append(foa)
        all_anchors = np.concatenate([f.field_of_anchors for f in foas])
    else:
        foa = data_utils.get_field_of_anchors(
            cfg.RPN.STRIDE, cfg.RPN.SIZES, cfg.RPN.ASPECT_RATIOS
        )
        all_anchors = foa.field_of_anchors

    for im_i, entry in enumerate(roidb):
        scale = im_scales[im_i]
        im_height = np.round(entry['height'] * scale)
        im_width = np.round(entry['width'] * scale)
        gt_inds = np.where(
            (entry['gt_classes'] > 0) & (entry['is_crowd'] == 0)
        )[0]
        gt_rois = entry['boxes'][gt_inds, :]  # ?????????
        # TODO(rbg): gt_boxes is poorly named;
        # should be something like 'gt_rois_info'
        gt_boxes = blob_utils.zeros((len(gt_inds), 6))
        gt_boxes[:, 0] = im_i  # batch inds
        gt_boxes[:, 1:5] = gt_rois
        gt_boxes[:, 5] = entry['gt_classes'][gt_inds]
        im_info = np.array([[im_height, im_width, scale]], dtype=np.float32)
        blobs['im_info'].append(im_info)

        # Add RPN targets
        if cfg.FPN.FPN_ON and cfg.FPN.MULTILEVEL_RPN:
            # RPN applied to many feature levels, as in the FPN paper
            rpn_blobs = _get_rpn_blobs(
                im_height, im_width, foas, all_anchors, gt_rois
            )
            for i, lvl in enumerate(range(k_min, k_max + 1)):
                for k, v in rpn_blobs[i].items():
                    blobs[k + '_fpn' + str(lvl)].append(v)
        else:
            # Classical RPN, applied to a single feature level
            rpn_blobs = _get_rpn_blobs(
                im_height, im_width, [foa], all_anchors, gt_rois
            )
            for k, v in rpn_blobs.items():
                blobs[k].append(v)

    for k, v in blobs.items():
        if isinstance(v, list) and len(v) > 0:
            blobs[k] = np.concatenate(v)

    valid_keys = [
        'has_visible_keypoints', 'boxes', 'segms', 'seg_areas', 'gt_classes',
        'gt_overlaps', 'is_crowd', 'box_to_gt_ind_map', 'gt_keypoints'
    ]
    minimal_roidb = [{} for _ in range(len(roidb))]
    for i, e in enumerate(roidb):
        for k in valid_keys:
            if k in e:
                minimal_roidb[i][k] = e[k]
    # blobs['roidb'] = blob_utils.serialize(minimal_roidb)
    blobs['roidb'] = minimal_roidb

    # Always return valid=True, since RPN minibatches are valid by design
    return True
```
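For reference, the per-level strides and anchor sizes noted in the inline comments of add_rpn_blobs can be reproduced with a small standalone sketch. The three constants below are assumptions taken from those comments (levels 2 to 6, start size 32), not values read from the real `cfg` object, and `fpn_level_table` is a hypothetical helper name that does not exist in the repo. It uses integers where the original uses floats (`2.**lvl`).

```python
# Assumed config values, copied from the inline comments in the pasted code
# (hypothetical stand-ins for cfg.FPN.RPN_MIN_LEVEL and friends).
RPN_MIN_LEVEL = 2
RPN_MAX_LEVEL = 6
RPN_ANCHOR_START_SIZE = 32


def fpn_level_table(k_min, k_max, start_size):
    """Return a list of (level, field_stride, anchor_size) per FPN level."""
    table = []
    for lvl in range(k_min, k_max + 1):
        field_stride = 2 ** lvl                       # stride of this feature level
        anchor_size = start_size * 2 ** (lvl - k_min)  # anchor size doubles per level
        table.append((lvl, field_stride, anchor_size))
    return table


levels = fpn_level_table(RPN_MIN_LEVEL, RPN_MAX_LEVEL, RPN_ANCHOR_START_SIZE)
for lvl, stride, size in levels:
    print(f"level {lvl}: stride {stride}, anchor size {size}")
```

Under these assumptions each level uses a single anchor size, combined with the aspect ratios from `cfg.FPN.RPN_ASPECT_RATIOS`.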

I found that the bbox coordinates were not changed after the image was scaled, and I don't know why they were not scaled.

Actual results

im_info contains the scaled image height, width, and im_scale, but the box coordinates fall outside the scaled image:

`'im_info': array([[ 1.33300000e+03, 1.00000000e+03, 4.16666657e-01]], dtype=float32), 'roidb': [{'has_visible_keypoints': False, 'boxes': array([[ 1276., 1212., 1911., 1387.], [ 1292., 776., 1933., 994.], [ 1290., 996., 1939., 1195.]], dtype=float32), 'segms': [[[1277, 1213, 1912, 1213, 1912, 1388, 1277, 1388]], [[1293, 823, 1934, 777, 1932, 956, 1297, 995]], [[1291, 1025, 1940, 997, 1936, 1172, 1293, 1196]]], 'seg_areas': array([ 111125. , 112011.5, 111732. ], dtype=float32),`
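To make the mismatch concrete, here is a minimal sanity check on the printed numbers, written in plain Python rather than taken from the repo. It assumes that `im_info` is `[height, width, scale]` (as built in add_rpn_blobs), that boxes are `[x1, y1, x2, y2]`, and that each polygon in `segms` is a flat `[x1, y1, x2, y2, ...]` list. `scale_coords` and `boxes_inside` are hypothetical helper names.

```python
# Values copied from the printed blobs in this issue.
im_info = [1333.0, 1000.0, 0.416666657]
boxes = [
    [1276.0, 1212.0, 1911.0, 1387.0],
    [1292.0, 776.0, 1933.0, 994.0],
    [1290.0, 996.0, 1939.0, 1195.0],
]
segm = [1277, 1213, 1912, 1213, 1912, 1388, 1277, 1388]

# Assumption: im_info is [height, width, scale].
im_height, im_width, scale = im_info


def scale_coords(coords, factor):
    """Multiply every coordinate in a flat list by the same factor."""
    return [c * factor for c in coords]


def boxes_inside(boxes, height, width):
    """True if every [x1, y1, x2, y2] box lies within a height x width image."""
    return all(
        0 <= x1 <= x2 <= width and 0 <= y1 <= y2 <= height
        for x1, y1, x2, y2 in boxes
    )


scaled_boxes = [scale_coords(b, scale) for b in boxes]
scaled_segm = scale_coords(segm, scale)

# Approximate size of the image before resizing, recovered from im_info.
orig_height, orig_width = im_height / scale, im_width / scale

print("raw boxes fit the scaled image:   ", boxes_inside(boxes, im_height, im_width))
print("scaled boxes fit the scaled image:", boxes_inside(scaled_boxes, im_height, im_width))
print("raw boxes fit the original image: ", boxes_inside(boxes, orig_height, orig_width))
```

Under these assumptions, the values stored in `blobs['roidb']` are consistent with coordinates in the original, unscaled image, and multiplying by `im_scale` brings both the boxes and the polygon inside the scaled image. This only checks the printed values; it does not show where the training code applies the scale.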

System information

Operating system: linux
CUDA version: 5
cuDNN version: 6
GPU models (for all devices if they are not all the same): 2
python version: 3.6
pytorch version: 0.4

roytseng-tw / Detectron.pytorch

why boxes coordinates were not scaled in blobs? #176