longcw / faster_rcnn_pytorch

Faster RCNN with PyTorch
MIT License

How to extract features from a given ROI? #39

Open dichen-cd opened 7 years ago

dichen-cd commented 7 years ago

Hi there, I'm having trouble extracting features from a given roi. The code I wrote is:

    def _im_exfeat(self, image, roi):
        """
        Extract fc7 features for a single roi.

        image: (ndarray) (H x W x 3)
        roi: (ndarray) (1 x 4) [x1, y1, x2, y2]
        """
        im_data, im_scales = self.get_image_blob(image)
        # Rescale the roi to the resized input image (choice 2 below), then
        # prepend the batch index 0 to get [batch_idx, x1, y1, x2, y2].
        roi = np.hstack([np.zeros((1, 1)), roi * im_scales[0]])
        roi = network.np_to_variable(roi, is_cuda=True)

        im_data = network.np_to_variable(im_data, is_cuda=True)
        im_data = im_data.permute(0, 3, 1, 2)  # NHWC -> NCHW
        features = self.rpn.features(im_data)
        pooled_features = self.roi_pool(features, roi)

        # Flatten the pooled features before the fully connected layers.
        x = pooled_features.view(pooled_features.size()[0], -1)
        x = self.fc6(x)
        x = self.fc7(x)

        return x

It is a method inside the FasterRCNN class. What I'm not sure about is which version of the roi to pass in. There are three choices:

  1. The original roi, corresponding to the original image
  2. The rescaled roi, corresponding to the resized input image (as shown in the code)
  3. The projected roi, corresponding to the VGG feature map (conv5_3), whose stride is 16.
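
To make the three choices concrete, here is a minimal sketch of how the same box looks in each coordinate system. The box values, the `im_scale` of 1.5, and the helper names `rescale_roi` and `project_roi` are made up for illustration; only the stride of 16 comes from the description above.

    # Hypothetical helpers, not part of the repository.
    def rescale_roi(roi, im_scale):
        """Choice 2: map [x1, y1, x2, y2] from the original image to the resized input."""
        return [c * im_scale for c in roi]

    def project_roi(roi_scaled, feat_stride=16):
        """Choice 3: map a box on the resized input onto the feature map grid."""
        return [c / feat_stride for c in roi_scaled]

    original_roi = [100.0, 50.0, 300.0, 250.0]   # choice 1 (assumed example box)
    im_scale = 1.5                               # assumed resize factor
    rescaled_roi = rescale_roi(original_roi, im_scale)
    projected_roi = project_roi(rescaled_roi)

    print(original_roi)    # [100.0, 50.0, 300.0, 250.0]
    print(rescaled_roi)    # [150.0, 75.0, 450.0, 375.0]
    print(projected_roi)   # [9.375, 4.6875, 28.125, 23.4375]

Many RoI pooling layers take a spatial scale argument (often 1/16 for VGG16) and do the projection of choice 3 internally. If that holds for the layer used here, choice 2 would be the expected input, but that is an assumption worth checking against how `self.roi_pool` is constructed.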

Since the roi-pooling-related code has no detailed comments, I'm not sure which one to use. I hope you can give me a hint.

Thank you.