jwwangchn opened this issue 6 years ago:

The COCO dataset labels objects with both bounding boxes and instance segmentation masks. I want to apply rotation augmentation to the image, the bounding boxes, and the instance segmentation masks together. How can I do this? Thanks.
There are functions in the library to augment images, segmentation maps and bounding boxes. You will have to convert the segmentation maps to `imgaug.SegmentationMapOnImage` and the bounding boxes to `imgaug.BoundingBoxesOnImage`.

Rotation can be done via `Affine(rotate=(lower bound in degrees, upper bound in degrees))`. You have to call `to_deterministic()` on the augmenter before augmenting images, segmentation maps and bounding boxes to achieve the same rotation (in degrees) over all input types per image.
The rough outline of the training loop is something like:
```python
rot = iaa.Affine(rotate=(-30, 30))
for batch in batches:
    rot_det = rot.to_deterministic()
    images = batch.images    # should return e.g. a list of uint8 numpy arrays of shape (H, W, C)
    segmaps = batch.segmaps  # should return a list of imgaug.SegmentationMapOnImage
    bbs = batch.bbs          # should return a list of imgaug.BoundingBoxesOnImage
    images_aug = rot_det.augment_images(images)
    segmaps_aug = rot_det.augment_segmentation_maps(segmaps)
    bbs_aug = rot_det.augment_bounding_boxes(bbs)
    # train command here
```
@aleju Thanks for your answer, but in the COCO dataset the instance segmentation mask is in polygon format, which represents the mask by points located on its boundary rather than as a segmentation map. I think this format is similar to a bounding box, which is also represented by four points. So I want to know: how do I rotate a mask in this format?
The library does not yet support direct augmentation of polygons, as that is fairly hard to implement.
You can transform the polygon to a list of `imgaug.Keypoint` objects, then use `augmenter.augment_keypoints()` to transform these and at the end recreate your polygon (that's how bounding box augmentation works). That will work with e.g. `Affine` and I think `PerspectiveTransform` (and obviously with all augmenters not affecting the geometry of images). But it might not work for `PiecewiseAffine` or `ElasticTransformation`, because the new locations of the keypoints might be such that you can no longer create a valid polygon from them (i.e. one that does not intersect itself).
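For concreteness, a minimal sketch of that keypoint round-trip (the flat `[x1, y1, x2, y2, ...]` COCO-style list, the image shape and the rotation range are placeholders, not part of the answer above):

```python
import imgaug as ia
from imgaug import augmenters as iaa

# hypothetical COCO-style segmentation polygon: flat [x1, y1, x2, y2, ...] list
coco_poly = [10, 10, 120, 15, 110, 90, 20, 80]

# turn the polygon points into keypoints on the corresponding image
kps = ia.KeypointsOnImage(
    [ia.Keypoint(x=x, y=y) for x, y in zip(coco_poly[0::2], coco_poly[1::2])],
    shape=(128, 128, 3))

rot = iaa.Affine(rotate=(-30, 30)).to_deterministic()
kps_aug = rot.augment_keypoints([kps])[0]

# recreate the flat polygon from the augmented keypoints
coco_poly_aug = [c for kp in kps_aug.keypoints for c in (kp.x, kp.y)]
```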
The alternative to the keypoint-based augmentation is to convert the polygons to segmentation maps or masks. You can convert the COCO polygons to `imgaug.Polygon([(x, y), (x, y), ...])` and then use that object's `Polygon.draw_on_image(np.zeros((*image.shape[0:2], 3), dtype=np.uint8), color=[255, 255, 255], color_perimeter=[255, 255, 255])` to create a mask for that polygon with the same size as the image given by `image`. Then you can create a segmentation map object via `imgaug.SegmentationMapOnImage(mask / 255.0, shape=image.shape)`. The segmentation map can then be augmented via `augmenter.augment_segmentation_maps(segmaps)`. Recreating a polygon from the augmented mask wouldn't be completely trivial, but depending on how your model works you might be operating on a mask/map level anyway and hence not even need that.
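A sketch of that mask-based route, following the calls quoted above (the polygon coordinates, the dummy `image` and the rotation augmenter are placeholders):

```python
import numpy as np
import imgaug as ia
from imgaug import augmenters as iaa

image = np.zeros((128, 128, 3), dtype=np.uint8)  # stand-in for your real image

# hypothetical polygon from a COCO annotation, as (x, y) points
polygon = ia.Polygon([(10, 10), (120, 15), (110, 90), (20, 80)])

# draw the polygon as a white-on-black mask of the same size as the image
mask = polygon.draw_on_image(
    np.zeros((*image.shape[0:2], 3), dtype=np.uint8),
    color=[255, 255, 255], color_perimeter=[255, 255, 255])

# one channel suffices for a single-class segmentation map
segmap = ia.SegmentationMapOnImage(mask[:, :, 0] / 255.0, shape=image.shape)

rot = iaa.Affine(rotate=(-30, 30)).to_deterministic()
image_aug = rot.augment_image(image)
segmap_aug = rot.augment_segmentation_maps([segmap])[0]
```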
@aleju How about the VOC dataset, which has only bounding boxes?
Do you mean bounding boxes? In that case, see here for documentation on that.
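A minimal sketch with the same deterministic-augmenter pattern (the box coordinates and the dummy image are placeholders):

```python
import numpy as np
import imgaug as ia
from imgaug import augmenters as iaa

image = np.zeros((128, 128, 3), dtype=np.uint8)  # stand-in for your real image

# hypothetical VOC-style box given as (xmin, ymin, xmax, ymax)
bbs = ia.BoundingBoxesOnImage(
    [ia.BoundingBox(x1=20, y1=30, x2=80, y2=90)],
    shape=image.shape)

rot = iaa.Affine(rotate=(-30, 30)).to_deterministic()
image_aug = rot.augment_image(image)
bbs_aug = rot.augment_bounding_boxes([bbs])[0]
```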
I've created a script to augment and rotate both the images and their masks; is that what you're looking for?

If so, my directories are structured as:

```
image_name/images/image_name.png
image_name/masks/mask_name.png
```

This script creates three new rotations and three new augmentations for each input image. Each output image receives up to three augmentations, as applying more than this tends to introduce noise. For rotations, the masks are adjusted accordingly; for augmentations, the masks are left unmodified. You can easily adjust as needed.

I have not yet parallelized or optimized this, but I'm happy to hear suggestions.

It should be easy to modify the directory structure in the following code to match your own:
```python
from imgaug import augmenters as iaa
import numpy as np
import imageio
import os
import argparse

# To run:
# python3.6 augment.py -id /path/to/img/folder -od /path_to_output_dir > ./augmentation.log &

parser = argparse.ArgumentParser(description='Create an augmented data set.')
parser.add_argument('-id', '--input_dir', type=str, required=True,
                    help='Base directory of the images to be augmented.')
parser.add_argument('-od', '--output_dir', type=str, required=True,
                    help='Base directory of the output.')
args = parser.parse_args()

# store images and their related masks as a dictionary
img_dict = {}
for image_name in os.listdir(args.input_dir):
    # absolute path to the image
    img = "{}.png".format(os.path.join(os.path.abspath(args.input_dir),
                                       image_name, "images", image_name))
    # list of mask files for this image
    mask_folder = os.path.join(os.path.abspath(args.input_dir), image_name, "masks")
    mask_file_list = [os.path.join(mask_folder, mask)
                      for mask in os.listdir(mask_folder)]
    img_dict[image_name] = {
        "image": img,
        "masks": mask_file_list
    }

# create the augmentation sequences
rotators = iaa.SomeOf((1, 3), [
    iaa.Fliplr(1),  # flip/mirror input images horizontally
    iaa.Flipud(1),  # flip/mirror input images vertically
    iaa.Rot90(1)    # rotate input images by 90 degrees
], random_order=True)

transformers = iaa.SomeOf((1, 3), [
    iaa.Superpixels(p_replace=0.5, n_segments=64),  # create superpixel representation
    iaa.GaussianBlur(sigma=(0.0, 5.0)),
    iaa.AverageBlur(k=(2, 7)),  # blur image using local means with kernel sizes between 2 and 7
    iaa.MedianBlur(k=(3, 11)),  # blur image using local medians with kernel sizes between 3 and 11
    iaa.ElasticTransformation(alpha=(0, 5.0), sigma=0.25),  # distort pixels
    # either change the brightness of the whole image (sometimes
    # per channel) or change the brightness of subareas
    iaa.OneOf([
        iaa.Multiply((0.5, 1.5), per_channel=0.5),
        iaa.FrequencyNoiseAlpha(
            exponent=(-4, 0),
            first=iaa.Multiply((0.5, 1.5), per_channel=True),
            second=iaa.ContrastNormalization((0.5, 2.0))
        )
    ])
], random_order=True)

# go through all the images and create a set of augmented images and masks for each
for key, val in img_dict.items():
    # read in the image and its masks
    base_image = imageio.imread(val["image"]).astype(np.uint8)
    base_masks = [imageio.imread(mask).astype(np.uint8) for mask in val["masks"]]

    # loop through and rotate each image three times
    counter = 0
    while counter < 3:
        # apply the same random rotation to the image and its masks
        rotator = rotators.to_deterministic()
        image_aug = rotator.augment_image(base_image)
        masks_aug = rotator.augment_images(base_masks)
        counter += 1
        # create the output directories
        output_dir = "{}/{}_rot{}".format(args.output_dir, key, counter)
        out_img_dir = "{}/{}".format(output_dir, "images")
        out_mask_dir = "{}/{}".format(output_dir, "masks")
        os.makedirs(out_img_dir, exist_ok=True)
        os.makedirs(out_mask_dir, exist_ok=True)
        # save the augmented image and its rotated masks
        imageio.imwrite("{}/{}_rot{}.png".format(out_img_dir, key, counter), image_aug)
        for i, mask in enumerate(masks_aug):
            imageio.imwrite("{}/{}_rot{}_{}.png".format(out_mask_dir, key, counter, i), mask)

    # loop through and augment each image three times, leaving the masks untouched
    counter2 = 0
    while counter2 < 3:
        # only the image is augmented, so no deterministic copy is needed
        image_aug = transformers.augment_image(base_image)
        counter2 += 1
        # create the output directories
        output_dir2 = "{}/{}_aug{}".format(args.output_dir, key, counter2)
        out_img_dir2 = "{}/{}".format(output_dir2, "images")
        out_mask_dir2 = "{}/{}".format(output_dir2, "masks")
        os.makedirs(out_img_dir2, exist_ok=True)
        os.makedirs(out_mask_dir2, exist_ok=True)
        # save the augmented image and copy the original masks unchanged
        imageio.imwrite("{}/{}_aug{}.png".format(out_img_dir2, key, counter2), image_aug)
        for i, mask in enumerate(base_masks):
            imageio.imwrite("{}/{}_aug{}_{}.png".format(out_mask_dir2, key, counter2, i), mask)
```
Hey @summerela, two questions:
1) Is there a reason why you do not apply the rotations and transforms in one go?
2) Do you have a reference for the introduction of noise with more than 3 augmentations?
Hello there!
I do the rotations and transforms in separate steps because I want to rotate the masks along with the images, but I do not want to apply the transformations to the original masks (a sketch of how the two steps could still be combined in one pass follows below). I'm sure I could have done this more elegantly. I have also been looking into whether I should still do on-the-fly augmentation during training to help prevent overfitting.
I should clarify: no more than 3 augmentations per single transform of an image. Three is a rule of thumb that I have seen in numerous articles and software documentation. The basic premise is not to change the original image so much that it no longer represents what you're trying to classify. For example, if you scramble a picture of a dog so much that it looks just like background, you will actually hamper your model's ability to distinguish between a dog and the background.
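For what it's worth, the two sequences could in principle be combined into a single pass by using imgaug's hooks mechanism to deactivate the non-geometric augmenters while the masks are processed. A sketch, assuming the `rotators`/`transformers` sequences from the script above (the explicit `name` argument, the shortened transformer list and the dummy arrays are assumptions for illustration):

```python
import numpy as np
import imgaug as ia
from imgaug import augmenters as iaa

rotators = iaa.SomeOf((1, 3), [iaa.Fliplr(1), iaa.Flipud(1), iaa.Rot90(1)],
                      random_order=True)
transformers = iaa.SomeOf((1, 3), [iaa.GaussianBlur(sigma=(0.0, 5.0)),
                                   iaa.AverageBlur(k=(2, 7))],
                          random_order=True, name="transformers")
seq = iaa.Sequential([rotators, transformers])

def activator_masks(images, augmenter, parents, default):
    # skip the whole non-geometric branch when augmenting masks
    return False if augmenter.name == "transformers" else default

base_image = np.zeros((128, 128, 3), dtype=np.uint8)   # stand-ins for real data
base_masks = [np.zeros((128, 128, 3), dtype=np.uint8)]

seq_det = seq.to_deterministic()
image_aug = seq_det.augment_image(base_image)
masks_aug = seq_det.augment_images(base_masks,
                                   hooks=ia.HooksImages(activator=activator_masks))
```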