zfchenUnique opened this issue 7 years ago
It supports batch_size > 1. You can comment out the if statement in roi_pooling_cuda.c and rebuild the extension.
@longcw @JeffCHEN2017 does this project support batch_size larger than 1?
Yes, it does, as long as you comment out the check in roi_pooling_cuda.c. Please read the code for details. I have been using it this way for a while, and so far so good.
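For what it's worth, the RoI pooling computation itself is batch-capable once that check is removed, because each RoI row carries a batch index in its first column. Here is a minimal NumPy sketch of the forward pass (the function name, pooled size, and rounding details are mine, not the repo's):

```python
import numpy as np

def roi_pool(features, rois, pooled_h=2, pooled_w=2, spatial_scale=1.0):
    """RoI max pooling over a batched feature map.

    features: (N, C, H, W); rois: (R, 5) with rows of
    (batch_index, x1, y1, x2, y2) in input-image coordinates.
    The batch_index column is what lets one call serve N > 1.
    """
    n, c, h, w = features.shape
    out = np.zeros((len(rois), c, pooled_h, pooled_w), features.dtype)
    for r, (b, x1, y1, x2, y2) in enumerate(rois):
        b = int(b)
        # Scale the RoI into feature-map coordinates and clip it.
        x1, y1, x2, y2 = [int(round(v * spatial_scale)) for v in (x1, y1, x2, y2)]
        x1, y1 = max(x1, 0), max(y1, 0)
        x2 = min(max(x2, x1 + 1), w)
        y2 = min(max(y2, y1 + 1), h)
        bin_h = (y2 - y1) / pooled_h
        bin_w = (x2 - x1) / pooled_w
        for ph in range(pooled_h):
            for pw in range(pooled_w):
                # Each output cell takes the max over its sub-window.
                ys = y1 + int(np.floor(ph * bin_h))
                ye = y1 + int(np.ceil((ph + 1) * bin_h))
                xs = x1 + int(np.floor(pw * bin_w))
                xe = x1 + int(np.ceil((pw + 1) * bin_w))
                out[r, :, ph, pw] = features[b, :, ys:ye, xs:xe].max(axis=(1, 2))
    return out
```

Note that this only shows the op is well defined for N > 1; the surrounding training code still has single-image assumptions, as the errors below show.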
@JeffCHEN2017, @longcw, can you please elaborate on how you managed to train with a batch size larger than one? As you suggested, I rebuilt roi_pooling_cuda.c with the relevant lines commented out, then set IMS_PER_BATCH: 4 in experiments/cfgs/faster_rcnn_end2end.yml. When I start training, I get the following:
File "train.py", line 115, in <module>
blobs = data_layer.forward()
File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/roi_data_layer/layer.py", line 74, in forward
blobs = self._get_next_minibatch()
File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/roi_data_layer/layer.py", line 70, in _get_next_minibatch
return get_minibatch(minibatch_db, self._num_classes)
File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/roi_data_layer/minibatch.py", line 39, in get_minibatch
assert len(im_scales) == 1, "Single batch only"
AssertionError: Single batch only
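That assert exists because the data layer builds its image blob and im_info for exactly one image per minibatch. Supporting IMS_PER_BATCH > 1 means padding differently sized images into one blob and keeping a per-image im_info row, roughly like this hypothetical helper (a sketch, not code from the repo):

```python
import numpy as np

def images_to_blob(images):
    """Pad a list of (H, W, 3) images into one (N, max_H, max_W, 3) blob.

    Each image keeps its own (height, width, scale) row in im_info so
    downstream layers can tell real pixels from padding. This is roughly
    what a multi-image data layer must do before the assert can be dropped.
    """
    max_h = max(im.shape[0] for im in images)
    max_w = max(im.shape[1] for im in images)
    blob = np.zeros((len(images), max_h, max_w, 3), dtype=np.float32)
    im_info = np.zeros((len(images), 3), dtype=np.float32)
    for i, im in enumerate(images):
        # Copy each image into the top-left corner; the rest stays zero.
        blob[i, :im.shape[0], :im.shape[1], :] = im
        im_info[i] = (im.shape[0], im.shape[1], 1.0)  # scale of 1.0 assumed here
    return blob, im_info
```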
Then I commented out the relevant assert lines and ran into another error:
File "train.py", line 123, in <module>
net(im_data, im_info, gt_boxes, gt_ishard, dontcare_areas)
File "/home/sam/.virtualenvs/cv/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/faster_rcnn.py", line 215, in forward
features, rois = self.rpn(im_data, im_info, gt_boxes, gt_ishard, dontcare_areas)
File "/home/sam/.virtualenvs/cv/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/faster_rcnn.py", line 71, in forward
cfg_key, self._feat_stride, self.anchor_scales)
File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/faster_rcnn.py", line 122, in proposal_layer
x = proposal_layer_py(rpn_cls_prob_reshape, rpn_bbox_pred, im_info, cfg_key, _feat_stride, anchor_scales)
File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/rpn_msr/proposal_layer.py", line 131, in proposal_layer
proposals = bbox_transform_inv(anchors, bbox_deltas)
File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/fast_rcnn/bbox_transform.py", line 59, in bbox_transform_inv
pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]
ValueError: operands could not be broadcast together with shapes (74592,1) (18648,1)
It seems like the boxes can't be loaded? Have you experienced anything like this? If so, how did you manage to solve it?
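The two shapes differ by exactly the batch size (4 × 18648 = 74592): the proposal layer still generates anchors for a single image, while the reshaped RPN outputs now cover the whole batch. A hedged sketch of the arithmetic and the usual per-image workaround (variable names are mine, not the repo's):

```python
import numpy as np

# With IMS_PER_BATCH = 4 the proposal layer still builds anchors for one
# image, but the flattened RPN deltas span the whole batch, so the two
# operands in bbox_transform_inv no longer line up:
num_anchors_per_image = 18648       # H * W * A for one feature map
batch_size = 4
assert batch_size * num_anchors_per_image == 74592  # shape from the traceback

# A common workaround (a sketch, not the repo's code): apply the deltas
# per image so each step works on matching shapes, then tag each
# proposal with its batch index afterwards.
deltas = np.zeros((batch_size, num_anchors_per_image, 4))
anchors = np.zeros((num_anchors_per_image, 4))
for b in range(batch_size):
    proposals_b = anchors + deltas[b]   # per-image shapes match again
    assert proposals_b.shape == anchors.shape
```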
FWIW, there is also a subtle bug in the CUDA backward code for RoI pooling that manifests only when the batch size is > 1:
This line https://github.com/longcw/faster_rcnn_pytorch/blob/4fda7a4b89cf71fc3905bd484b1dc82dbc6150d1/faster_rcnn/roi_pooling/src/cuda/roi_pooling_kernel.cu#L170 should end with == (c * height + h) * width + w instead of == index, or else gradients will only be propagated into the first element of the batch. I discovered this issue while using the RoI pooling layer in another project.
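The effect of that comparison is easy to reproduce outside CUDA. Below is a toy NumPy model of the backward routing (not the kernel itself), assuming each RoI's argmax was recorded at offset 0 within its own image's (C, H, W) slab:

```python
import numpy as np

# Argmax indices are stored relative to one image's (C, H, W) slab, but
# the buggy condition compares them against the global (N, C, H, W)
# linear index, which can only match for batch element 0.
N, C, H, W = 2, 1, 2, 2
argmax = np.zeros(N, dtype=int)      # each RoI's max sat at per-image offset 0
grad_buggy = np.zeros((N, C, H, W))
grad_fixed = np.zeros((N, C, H, W))

for n in range(N):
    for c in range(C):
        for h in range(H):
            for w in range(W):
                index = ((n * C + c) * H + h) * W + w   # global linear index
                offset = (c * H + h) * W + w            # per-image offset
                if argmax[n] == index:                  # buggy comparison
                    grad_buggy[n, c, h, w] += 1.0
                if argmax[n] == offset:                 # fixed comparison
                    grad_fixed[n, c, h, w] += 1.0

# The buggy condition routes gradient only into batch element 0.
assert grad_buggy[1].sum() == 0.0
assert grad_fixed[1].sum() == 1.0
```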
From the source code (roi_pooling_cuda.c) and my own quick experiments, it seems that the RoI pooling layer only supports a batch size of one. Does anyone know why?