lyw615 opened this issue 4 years ago
You can use imgaug.augmenters.Affine(scale=(min, max))
But in utils.py, the image will be resized to [max_dim, max_dim], which forces every image to the same scale. For example, if you scale an image up from [512×512] to [1024×1024], the [1024×1024] image will just be resized back to [512×512] again. The following code is at line 450 of utils.py:
```python
if mode == "square":
    # Get new height and width
    h, w = image.shape[:2]
    top_pad = (max_dim - h) // 2
    bottom_pad = max_dim - h - top_pad
    left_pad = (max_dim - w) // 2
    right_pad = max_dim - w - left_pad
    padding = [(top_pad, bottom_pad), (left_pad, right_pad), (0, 0)]
    image = np.pad(image, padding, mode='constant', constant_values=0)
    window = (top_pad, left_pad, h + top_pad, w + left_pad)
```
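The padding step above can be sketched in isolation. This reproduces only the pad-to-square branch, not the scaling that precedes it in resize_image; `pad_to_square` is a hypothetical helper name:

```python
import numpy as np

def pad_to_square(image, max_dim):
    # Center the image on a max_dim x max_dim black canvas,
    # mirroring the "square" branch quoted from utils.py.
    h, w = image.shape[:2]
    top_pad = (max_dim - h) // 2
    bottom_pad = max_dim - h - top_pad
    left_pad = (max_dim - w) // 2
    right_pad = max_dim - w - left_pad
    padding = [(top_pad, bottom_pad), (left_pad, right_pad), (0, 0)]
    return np.pad(image, padding, mode='constant', constant_values=0)

small = np.ones((300, 400, 3), dtype=np.uint8)
out = pad_to_square(small, 512)
print(out.shape)  # (512, 512, 3) -- every input ends up at the same size
```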
You are talking about augmentation. For augmentation "on the fly", use imgaug.augmenters.Affine(scale=(min, max)). MRCNN simply resizes images because the network input size is fixed.
Multi-scale scaling of the input data produces data at more sizes, similar to an image pyramid. For example, resizing my image from [512×512] to [324×324] and [1024×1024] creates more data. FPN, on the other hand, fuses multi-level features at higher speed because it operates on feature maps. So does FPN replace the multi-scale data augmentation method, meaning there is no need to use this augmentation for Mask R-CNN?
In addition, does Mask R-CNN require a fixed input image size?
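For reference, the Matterport implementation controls this through config fields (field names from mrcnn/config.py; the values below are quoted from memory and may differ per release). With "square" mode every input ends up at IMAGE_MAX_DIM × IMAGE_MAX_DIM, while "none" keeps sizes variable, which works because the backbone is fully convolutional:

```python
# Hedged sketch of the resize-related config fields as a plain dict.
resize_config = {
    "IMAGE_RESIZE_MODE": "square",  # alternatives include "none", "pad64", "crop"
    "IMAGE_MIN_DIM": 800,
    "IMAGE_MAX_DIM": 1024,
}
print(resize_config["IMAGE_RESIZE_MODE"])
```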