matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Multiple image augmentation for training dataset #768

Open ronykalfarisi opened 6 years ago

ronykalfarisi commented 6 years ago

Dear all and @waleedka, I've been using this repo to detect and create masks for crack damage on bridge structures. My training dataset has 850 images and overall I got decent results. As you can see in the two images below, the model can detect and segment horizontal and vertical cracks well, but it fails to detect diagonal cracks. So I'm thinking I could solve this problem if I had more images.

[images: detected horizontal and vertical cracks]

I noticed that we can use image augmentation during training via the augmentation keyword, as follows:

model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=30,
            layers='all', 
            augmentation = imgaug.augmenters.Sequential([ 
                imgaug.augmenters.Fliplr(1), 
                imgaug.augmenters.Flipud(1), 
                imgaug.augmenters.Affine(rotate=(-45, 45)), 
                imgaug.augmenters.Affine(rotate=(-90, 90)), 
                imgaug.augmenters.Affine(scale=(0.5, 1.5))]))

However, from what I understand, these augmentations are applied consecutively to each image: for each image, a left-right flip is applied, followed by an up-down flip, then a rotation sampled from (-45, 45), then another rotation sampled from (-90, 90), and finally a scaling with a factor sampled from (0.5, 1.5).

So my question is: is there a way to apply each augmentation separately for each image? What I mean is that I want each augmentation to generate one extra image (and mask) alongside the original. If this can be achieved, applying 5 augmentations would give 6x the total number of images, so the whole dataset would contain 5100 images.
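
For illustration, pre-generating such copies offline with imgaug could look roughly like the sketch below (the folder layout, file naming, and the imageio dependency are assumptions, not something this repo provides):

import os
import imageio
import imgaug.augmenters as iaa

# One augmenter per desired extra copy.
augmenters = [
    iaa.Fliplr(1.0),
    iaa.Flipud(1.0),
    iaa.Affine(rotate=(-45, 45)),
    iaa.Affine(rotate=(-90, 90)),
    iaa.Affine(scale=(0.5, 1.5)),
]

IMAGE_DIR = "dataset/images"   # hypothetical layout: masks share file names
MASK_DIR = "dataset/masks"

for fname in os.listdir(IMAGE_DIR):
    image = imageio.imread(os.path.join(IMAGE_DIR, fname))
    mask = imageio.imread(os.path.join(MASK_DIR, fname))
    for i, aug in enumerate(augmenters):
        det = aug.to_deterministic()           # same random transform for image and mask
        image_aug = det.augment_image(image)
        mask_aug = det.augment_image(mask)     # binary masks may need re-thresholding afterwards
        imageio.imwrite(os.path.join(IMAGE_DIR, "aug%d_%s" % (i, fname)), image_aug)
        imageio.imwrite(os.path.join(MASK_DIR, "aug%d_%s" % (i, fname)), mask_aug)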

Thank you, I really appreciate the help.

zungam commented 6 years ago

Is there a way to apply each augmentation separately for each image?

Don't use .Sequential([...]), as it means "apply all of these augmentations one after another".

I want each augmentation to generate one extra data (and mask) alongside with the original.

I couldn't personally get around the problem, so I made sure the dataset was used six times; in practice that means epoch = at least six times the size of the dataset (more could be used). I also set shuffle = False and gave the augmentation a 5/6 probability of happening. It does not create exactly 5 extra copies of each image, but statistically it does the same over a large dataset.

imgaug.augmenters.Sometimes(5/6, imgaug.augmenters.OneOf([
    imgaug.augmenters.Fliplr(1),
    imgaug.augmenters.Flipud(1),
    imgaug.augmenters.Affine(rotate=(-45, 45)),
    imgaug.augmenters.Affine(rotate=(-90, 90)),
    imgaug.augmenters.Affine(scale=(0.5, 1.5))
]))

Note: what's also good about this is that the augmentation applied is random, which makes for a safer gradient descent. Training 6 times in a row on the same image would, I believe, make your training slower, as gradient descent takes bigger leaps in the direction of each batch and thus zig-zags down to a local minimum.
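
Plugged into the train call from the original post, the whole thing might look roughly like this (a sketch; model, config, and the datasets come from your own training script, and epochs=30 / layers='all' are just the values from the original post):

import imgaug.augmenters

augmentation = imgaug.augmenters.Sometimes(5/6, imgaug.augmenters.OneOf([
    imgaug.augmenters.Fliplr(1),
    imgaug.augmenters.Flipud(1),
    imgaug.augmenters.Affine(rotate=(-45, 45)),
    imgaug.augmenters.Affine(rotate=(-90, 90)),
    imgaug.augmenters.Affine(scale=(0.5, 1.5))]))

model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=30,
            layers='all',
            augmentation=augmentation)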

ronykalfarisi commented 6 years ago

Dear Magnus (@zungam), thank you so much for your suggestion. I'll give it a try and let you know the result when it's done. However, there are a couple of things I'd like to clarify with you.

  1. As I understand it, one epoch is one forward and backward pass through all the images in the dataset. So when you said "epoch = at least six times the size of the dataset", does that mean epoch = 6*850 (since I have 850 images in the dataset)?

  2. You said that "Training 6 times in a row on the same image would make your training slower, as gradient descent takes bigger leaps". While this is true, it is not my intention. What I really want is, for example, image1 + flipLR(image1) + flipUD(image1) + rotate45(image1) + rotate90(image1) + scale(image1). I believe those 6 images are not really the same.

It would be nice if we could achieve this easily, for example by feeding a list of augmentations to the augmentation keyword when we call the train method.

zungam commented 6 years ago
  1. Yes, this is what I meant.
  2. They are not the same, but they are so similar that I believe they will push the gradient in a similar direction. Using flipLR and rotate would not do this, however, so you are right.

If you really want to try this, you could go into load_image_gt and data_generator and make them repeat the same image 6 times in a row. You have to be creative to achieve this. Perhaps something like:

Initialize state = 0 before the while-loop of data_generator (alongside image_index), then inside the loop:

state += 1
if state == 6:
    state = 0
    image_index = (image_index + 1) % len(image_ids)

instead of

image_index = (image_index + 1) % len(image_ids)

Then you can define 6 types of augmentation in load_image_gt and make it switch which augmentation it uses as a function of the state it is in (0, 1, 2, 3, 4 or 5), as in the sketch below.
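
A hypothetical sketch of the switching side in load_image_gt (AUG_MODES and pick_augmentation are made-up names, not repo code):

import imgaug.augmenters as iaa

# One mode per state; state 0 keeps the image unaugmented.
AUG_MODES = [
    None,
    iaa.Fliplr(1),
    iaa.Flipud(1),
    iaa.Affine(rotate=(-45, 45)),
    iaa.Affine(rotate=(-90, 90)),
    iaa.Affine(scale=(0.5, 1.5)),
]

def pick_augmentation(state):
    # load_image_gt would call this with the current state and apply the
    # returned augmenter (if any) to both the image and its masks.
    return AUG_MODES[state % len(AUG_MODES)]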

ronykalfarisi commented 6 years ago

Thanks @zungam, I'll give this a try.

waspinator commented 6 years ago

@ronykalfarisi I'm facing a similar issue trying to train on plant stems. Did you manage to find a solution to diagonal objects?

The difference in my dataset is that all the stems are already diagonal in my training dataset, but it still can't detect them. I have a feeling that maybe it's because the detection box area is much larger than the actual object inside.

ronykalfarisi commented 6 years ago

Hi @waspinator, I haven't found an effective solution for thin diagonal objects yet. However, I got relatively better results when I trained the network with a higher mask loss weight. I suggest you include some horizontal and vertical objects in your training dataset as well, to make it more general.

fastlater commented 6 years ago

@ronykalfarisi how many cracks do you have in your dataset? Fewer than 2000? I'm curious how you got such high accuracy with only 850 images. I always work with little data (original + augmentation) and it is hard for me to get good results. Any advice would be appreciated.

ronykalfarisi commented 6 years ago

Hi @fastlater, my dataset is only 893 images (850 for training, 43 for testing). I used resnet101 and increased the mask loss weight to 10 since I have an unbalanced dataset. Hope this helps; if you need more help, you'll need to describe your problem first.

fastlater commented 6 years ago

@ronykalfarisi Well, I have only 75 images (65 for training + 5 for validation + 5 for testing). From those 65 training images I only have 1000, and I thought maybe that was why I was not getting good results. But you are proving the opposite with your small dataset. Do you think increasing the mask weight will help in my case too?

ronykalfarisi commented 6 years ago

@fastlater, I believe you need more images. I tried Faster R-CNN before and it worked with 300 images; I haven't tried with less data, though.

fastlater commented 6 years ago

If I set STEPS_PER_EPOCH larger than my original dataset, does the script generate extra augmented images to complete the steps? Does the model take some augmented images automatically each epoch?

ronykalfarisi commented 6 years ago

In my experience, increasing STEPS_PER_EPOCH doesn't affect performance, so it's better to leave it at the default (100 in my case). If you use the augmentation option, it depends on the probability: you can control in the script how often augmentation is applied, but I think the default is 0.5, in other words it only augments about half of your dataset.
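
For context, how often augmentation fires is controlled by the augmenters themselves; a small imgaug sketch (the 0.5 mentioned above is the kind of per-augmenter probability used in e.g. Fliplr(0.5)):

import imgaug.augmenters as iaa

flip_half = iaa.Fliplr(0.5)          # flips roughly half of the images it sees

# Applies the wrapped rotation to roughly 80% of images instead.
rotate_often = iaa.Sometimes(0.8, iaa.Affine(rotate=(-45, 45)))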

patrickcgray commented 5 years ago

Hey @ronykalfarisi three questions for you if you have a moment:

Thanks!!

ronykalfarisi commented 5 years ago

@patrickcgray,

  1. what change?
  2. Actually, I didn't do what zungam suggested since it sounded very weird to me. I used the TensorFlow implementation instead.
  3. It's an imbalanced dataset, since the number of pixels inside the ground-truth bounding box is much smaller than the number of background pixels. Yes, that's exactly what I did. It helped because wrong pixels are penalized more heavily when the model makes mistakes; in other words, we tell the algorithm to put more stress on the mask loss.
patrickcgray commented 5 years ago

@ronykalfarisi

  1. I meant the augmentation code that @zungam suggested. I was curious if that improved your model.
  2. Okay that makes sense. I wish I knew what he meant. I assume he meant steps_per_epoch
  3. Okay thanks for the clarification, I think I need to do the same.
xDzai94 commented 5 years ago

Hi @ronykalfarisi, have you found any way to solve your problem of applying multiple image augmentations to the training dataset? If you have, I would truly appreciate it if you could share the solution. Thanks in advance.

soheilsadeghi90 commented 5 years ago

Hi @ronykalfarisi! I'm working on a dataset of nearly the same size as yours and dealing with thin and long masks (horizontal, vertical, and diagonal). I was wondering how you modified the configs to get good performance. You already mentioned that the mrcnn_mask_loss weight is effective, but what about other config elements (the unmold_mask threshold or RPN_ANCHOR_RATIOS, for example)? I would appreciate it if you could advise me on that.

ronykalfarisi commented 5 years ago

@soheilsadeghi90 & @xDzai94, I believe I described what I did in my previous comments. I moved to the TensorFlow implementation of Mask R-CNN; it has multiple data augmentations.

jordanvandijk9 commented 5 years ago

@ronykalfarisi What exactly do you mean by "I moved to the TensorFlow implementation of Mask R-CNN"? I believe I am also using the TensorFlow implementation. However, looking at the code, it only applies one (combination of) data augmentation(s) to each single image. I cannot increase the total number of images by using this data augmentation function.

Am I just not seeing something (e.g. missing a functionality in the TensorFlow implementation), or is the TensorFlow implementation something totally different from the standard way of using Mask-RCNN?

ronykalfarisi commented 5 years ago

@jordanvandijk9, I'm sorry, what I meant was the "TensorFlow team" implementation. They have several implementations in their "research" folder, and one of them is the Object Detection API.

Amrimn commented 4 years ago

Hi @fastlater, my dataset is only 893 images (850 for training, 43 for testing). I used resnet101 and increased the mask loss weight to 10 since I have an unbalanced dataset. Hope this helps; if you need more help, you'll need to describe your problem first.

Hi @ronykalfarisi, what do you mean by "I increased the mask weight to 10"? Here is the mask loss I have:

mask_loss = KL.Lambda(lambda x: mrcnn_mask_loss_graph(*x), name="mrcnn_mask_loss")(
    [target_mask, target_class_ids, mrcnn_mask])

How can I increase it to 10?

ronykalfarisi commented 4 years ago

Hi @Amrimn, in config.py inside the mrcnn folder, you'll find something like this:

LOSS_WEIGHTS = {
        "rpn_class_loss": 1.,
        "rpn_bbox_loss": 1.,
        "mrcnn_class_loss": 1.,
        "mrcnn_bbox_loss": 1.,
        "mrcnn_mask_loss": 1.
    }

Replace the numbers as you like.
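
Alternatively, a sketch of overriding it from your own training script instead of editing config.py (the class name here is hypothetical, following the repo's usual pattern of subclassing Config):

from mrcnn.config import Config

class CrackConfig(Config):
    NAME = "crack"          # hypothetical name
    # Same keys as in config.py, with the mask loss weighted 10x as discussed above.
    LOSS_WEIGHTS = {
        "rpn_class_loss": 1.,
        "rpn_bbox_loss": 1.,
        "mrcnn_class_loss": 1.,
        "mrcnn_bbox_loss": 1.,
        "mrcnn_mask_loss": 10.
    }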

rubeea commented 4 years ago

@soheilsadeghi90 & @xDzai94, I believe I described what I did in my previous comments. I moved to the TensorFlow implementation of Mask R-CNN; it has multiple data augmentations.

Hi, I am researching a similar problem. I would highly appreciate it if you could provide the link to the TensorFlow team implementation of Mask R-CNN that improved accuracy in your case.

Thanks :)

ronykalfarisi commented 4 years ago

@rubeea , this is the one I was using https://github.com/tensorflow/models/tree/master/research/object_detection

kapil0kumar commented 4 years ago

Hi @zungam, can you explain why shuffle has to be set to False here?

1chimaruGin commented 4 years ago

Dear @ronykalfarisi

Can I see your repo for this case?

ronykalfarisi commented 4 years ago

@1chimaruGin, sorry bro, I used it for work so I can't share it. However, I used the repo from the TensorFlow research team with several modifications.

Altimis commented 4 years ago

Hi, thank you for suggesting the TensorFlow team implementation of Mask R-CNN. Can you confirm that this implementation is more accurate than the Matterport one?

Adithia99 commented 3 years ago

Does anybody here know how to do augmentation for Mask R-CNN so that we get the augmented images and masks (labels) saved in a folder, not just applied during training?

rubeea commented 3 years ago

I didn't quite understand your question. You want to do augmentation for the training data?

Adithia99 commented 3 years ago

Both, the training and the validation case.

rubeea commented 3 years ago

You can use the Augmentor Python API to augment training and validation images along with their respective masks. Link below: https://github.com/mdbloice/Augmentor
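
A minimal Augmentor sketch (the directory names are assumptions; ground_truth() makes the same transforms apply to the masks):

import Augmentor

# Hypothetical layout: images in train/images, matching masks in train/masks.
p = Augmentor.Pipeline("train/images", output_directory="augmented")
p.ground_truth("train/masks")
p.flip_left_right(probability=0.5)
p.rotate(probability=0.7, max_left_rotation=25, max_right_rotation=25)
p.zoom(probability=0.5, min_factor=1.1, max_factor=1.5)
p.sample(500)   # writes 500 augmented image/mask pairs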

Adithia99 commented 3 years ago

Thanks bro, but I need the .json annotation files as well, not just the masks.

rubeea commented 3 years ago

If you want to label your own data, you can use any appropriate online annotation tool such as https://labelbox.com/, which will give you all the images and the respective annotations in a JSON file.

raficabral commented 2 years ago

Hey guys, I have a question regarding data augmentation using imgaug. If I use @zungam's syntax, Sometimes(5/6, OneOf([...])), my interpretation (am I right?) is that augmentation will be applied to only 5/6 of my dataset and only one transformation will be applied (OneOf). Like @ronykalfarisi, I want to apply the five transformations to my whole dataset (5x more data). How can I do that?