matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

How to achieve multiple scales augmentation for maskrcnn #2266

Open lyw615 opened 4 years ago

lyw615 commented 4 years ago

Multi-scale scaling of the input data produces data at more sizes, similar to an image pyramid. For example, resizing my image from [512*512] to [324*324] and [1024*1024] creates more data. FPN fuses multi-level features faster, because it operates on feature maps. So does FPN replace multi-scale data augmentation, meaning there is no need to use this augmentation method for Mask R-CNN?
In addition, does Mask R-CNN require a fixed input image size?

konstantin-frolov commented 4 years ago

You can use imgaug.augmenters.Affine(scale=(min, max))

lyw615 commented 4 years ago

> You can use imgaug.augmenters.Affine(scale=(min, max))

But in utils.py, the image is resized to [max_dim*max_dim], which puts every image back at the same scale. For example, if I scale an image up from [512*512] to [1024*1024], the [1024*1024] image is resized to [512*512] again. The following code is around line 450 of utils.py:

if mode == "square":
    # Get new height and width
    h, w = image.shape[:2]
    top_pad = (max_dim - h) // 2
    bottom_pad = max_dim - h - top_pad
    left_pad = (max_dim - w) // 2
    right_pad = max_dim - w - left_pad
    padding = [(top_pad, bottom_pad), (left_pad, right_pad), (0, 0)]
    image = np.pad(image, padding, mode='constant', constant_values=0)
    window = (top_pad, left_pad, h + top_pad, w + left_pad)

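
To make the concern concrete, here is a minimal NumPy-only sketch of a "square" resize. It is a simplified stand-in, not the library's actual resize function: the name `resize_square`, the nearest-neighbor resampling, and the rule "always scale the longer side to max_dim" are assumptions made for illustration. Only the padding arithmetic is taken from the snippet quoted above.

```python
import numpy as np

def resize_square(image, max_dim):
    """Hypothetical helper: scale the longer side to max_dim, then zero-pad to a square."""
    h, w = image.shape[:2]
    scale = max_dim / max(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    # Nearest-neighbor resampling by index lookup (assumed stand-in for a real resize).
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    image = image[rows][:, cols]
    # Same padding arithmetic as the quoted "square" branch.
    top_pad = (max_dim - new_h) // 2
    bottom_pad = max_dim - new_h - top_pad
    left_pad = (max_dim - new_w) // 2
    right_pad = max_dim - new_w - left_pad
    padding = [(top_pad, bottom_pad), (left_pad, right_pad), (0, 0)]
    image = np.pad(image, padding, mode="constant", constant_values=0)
    window = (top_pad, left_pad, new_h + top_pad, new_w + left_pad)
    return image, window, scale

# Inputs of different sizes all leave the function with the same shape.
for side in (324, 512, 1024):
    resized, window, scale = resize_square(np.ones((side, side, 3), dtype=np.uint8), 512)
    print(side, resized.shape, window, scale)
```

Under these assumptions, resizing an image on disk before this step only changes the scale factor applied here; the array that reaches the network has the same shape either way.
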
konstantin-frolov commented 4 years ago

You are talking about augmentation. For on-the-fly augmentation, use imgaug.augmenters.Affine(scale=(min, max)). Mask R-CNN simply resizes images because the network input size is fixed.
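
A rough sketch of why a scale augmentation still works with a fixed input size: the content is zoomed inside a canvas whose shape does not change, so objects appear at different scales even though every array has the same dimensions. The code below is a NumPy-only illustration of that idea, not imgaug's implementation; `scale_content` and `random_scale` are hypothetical names, and the nearest-neighbor mapping is an assumption. To my knowledge, imgaug.augmenters.Affine keeps the output shape equal to the input shape by default, and this repository accepts such an augmenter through the `augmentation` argument of `model.train`, but check both against the versions you use.

```python
import numpy as np

def scale_content(image, factor):
    """Hypothetical helper: zoom content about the image center, keeping the array shape.

    Uses nearest-neighbor inverse mapping; output pixels with no source pixel stay 0.
    """
    h, w = image.shape[:2]
    ys, xs = np.arange(h), np.arange(w)
    # Inverse map: for each output pixel, find the source pixel it comes from.
    src_y = np.round((ys - (h - 1) / 2) / factor + (h - 1) / 2).astype(int)
    src_x = np.round((xs - (w - 1) / 2) / factor + (w - 1) / 2).astype(int)
    valid_y = (src_y >= 0) & (src_y < h)
    valid_x = (src_x >= 0) & (src_x < w)
    out = np.zeros_like(image)
    out[np.ix_(valid_y, valid_x)] = image[np.ix_(src_y[valid_y], src_x[valid_x])]
    return out

def random_scale(image, scale_range, rng):
    """Pick a random factor in scale_range, mirroring the scale=(min, max) idea."""
    factor = rng.uniform(*scale_range)
    return scale_content(image, factor), factor

rng = np.random.default_rng(0)
image = np.zeros((512, 512, 3), dtype=np.uint8)
image[192:320, 192:320] = 255  # a square "object" in the center
augmented, factor = random_scale(image, (0.5, 1.5), rng)
print(augmented.shape, round(factor, 2))
```

The same factor would have to be applied to the instance masks so that they stay aligned with the image.
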