mdbloice / Augmentor

Image augmentation library in Python for machine learning.
https://augmentor.readthedocs.io/en/stable
MIT License

Multiple mask augmentation: semi non-identical augmentations? #158

Closed maartenterpstra closed 5 years ago

maartenterpstra commented 5 years ago

So I'm trying to use the Augmentor DataPipeline for my dataset. My dataset consists of two images and a corresponding vector field. I can apply the same augmentation to all three at the same time, and that's working great. But for the vector field some extra work needs to be done.

For example, suppose I do a random vertical flip. For the images this is no problem, but for the vector field the y-component also needs to be negated (i.e. rotated 180 degrees) in order to remain consistent.
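To make the vertical-flip case concrete, here is a minimal sketch in plain Python, with nested lists standing in for image arrays. The function names and the layout (one 2D grid per component) are assumptions for illustration and are not part of Augmentor. Flipping the rows moves each vector to its mirrored position, and negating the y-component makes it point the right way again.

```python
def flip_rows(channel):
    """Flip a 2D grid (a list of rows) top to bottom."""
    return channel[::-1]


def flip_flow_top_bottom(flow_x, flow_y):
    """Vertically flip a vector field stored as two 2D component grids.

    Both components move to their mirrored row; the y-component is also
    negated, because "up" and "down" swap under a vertical flip.
    """
    flipped_x = flip_rows(flow_x)
    flipped_y = [[-v for v in row] for row in flip_rows(flow_y)]
    return flipped_x, flipped_y


# A 2x2 field: the top row has a non-zero y-component, the bottom row is zero.
flow_x = [[1.0, 2.0], [0.0, 0.0]]
flow_y = [[3.0, 3.0], [0.0, 0.0]]
new_x, new_y = flip_flow_top_bottom(flow_x, flow_y)
```

Applying the function twice gives back the original field, which is a quick sanity check for this kind of operation.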

Can I make functions that get executed on part of a sample based on the random choice of the augmentation operations?

And as a next step, can I also perform some functions only on the images and not on the vector field?

mdbloice commented 5 years ago

Hi @mlterpstra-umc, thanks for opening the issue and sorry for the delay; work responsibilities have kept me away from this project for several weeks now.

So, just to clarify what you meant - you are passing a number of images, but you are also passing a vector field that you are basically treating as an image?

Unfortunately Augmentor doesn't support vector fields, so you will get inconsistent results if you use it in the way I think you are describing.

As for whether you can apply functions to only a subset of the images passed to each operation: no, that is not possible, as I couldn't really think of a reason why you might want to do that. If Augmentor did support vector fields, then I suppose that would make for a use case, but until now I hadn't thought of that.

Sorry, I can't think of a way to make Augmentor do what you currently require!

M.

maartenterpstra commented 5 years ago

Hi @mdbloice

No worries, thanks for the reply. And of course thank you for this project, it's been a breeze to work with 😊.

So yeah, indeed I'm passing in different images that mean different things, e.g. images or vector fields.

I also found out that Augmentor doesn't support vector fields but I made some modifications in my own fork so that I can work with it.

Basically I wrap my data in images and have extended the pipeline to accept one operation per 'image class'. For example, what I want is to pass two images and the vector field between them into Augmentor. That means there are four different 'image classes', i.e. image1, image2, vector_x, vector_y. I currently have something that can do this:

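import Augmentor

# MultiOpDataPipeline, XFlowFlip, YFlowFlip and ZoomFlowField are my own
# additions; they are not part of upstream Augmentor.
p = Augmentor.MultiOpDataPipeline()

# Operations for the two ordinary images.
image_ops = [
    Augmentor.Operations.Flip(0.5, 'LEFT_RIGHT'),
    Augmentor.Operations.Flip(0.5, 'TOP_BOTTOM'),
    Augmentor.Operations.RandomContrast(0.15, 0.8, 1),
    Augmentor.Operations.RandomColor(0.15, 0.7, 1.3),
    Augmentor.Operations.RandomBrightness(0.3, 0.8, 1.2),
    Augmentor.Operations.Zoom(0.3, 0.5, 1.5)
]

# x-component of the flow: the left/right flip needs a custom operation.
x_flow_ops = [
    XFlowFlip.XFlowFlip(0.5),
    Augmentor.Operations.Flip(0.5, 'TOP_BOTTOM'),
    None,  # no RandomContrast for vector fields
    None,  # no RandomColor
    None,  # no RandomBrightness
    ZoomFlowField.ZoomFlowField(0.3, 0.5, 1.5)
]

# y-component of the flow: the top/bottom flip needs a custom operation.
y_flow_ops = [
    Augmentor.Operations.Flip(0.5, 'LEFT_RIGHT'),
    YFlowFlip.YFlowFlip(0.5),
    None,  # no RandomContrast for vector fields
    None,  # no RandomColor
    None,  # no RandomBrightness
    ZoomFlowField.ZoomFlowField(0.3, 0.5, 1.5)
]

# One row per operation, one column per image class:
# image1, image2, vector_x, vector_y.
ops = list(zip(image_ops, image_ops, x_flow_ops, y_flow_ops))
p.add_operations(ops)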

Every row in the operations matrix holds one operation per image class. The pipeline decides once per row, based on the first operation in that row, whether the row should be applied (this implies that the probability is the same for every image class, which is to be expected in my case) and then applies the corresponding operation to each image class consistently.
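The row-wise behaviour described here could be sketched roughly as follows. This is not the code from the fork: `apply_rows` and the small list operations are hypothetical stand-ins, and the sketch assumes each row carries a single probability shared by all image classes.

```python
import random


def apply_rows(sample, rows, rng):
    """Apply a matrix of operations to a multi-part sample.

    sample: list with one entry per image class.
    rows:   list of (probability, ops) pairs, where ops holds one callable
            (or None to skip) per image class.
    rng:    random.Random instance used for the per-row decision.
    """
    for probability, ops in rows:
        # One draw per row: either every class gets its operation or none does.
        if rng.random() >= probability:
            continue
        sample = [part if op is None else op(part) for part, op in zip(sample, ops)]
    return sample


def negate(values):
    return [-v for v in values]


def reverse(values):
    return values[::-1]


# Two "image classes": a plain signal, and a flow component that must also
# be negated whenever it is reversed.
rows = [
    (1.0, [reverse, lambda v: negate(reverse(v))]),  # always applied
    (0.0, [negate, None]),                           # never applied
]
result = apply_rows([[1, 2, 3], [4, 5, 6]], rows, random.Random(0))
```

Because the decision is made once per row, the parts of a sample can never end up with mismatched flips.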

This way I can deal with different types of images, e.g. images and vectors. A left/right flip is fine for the y-component of the vectors, but the x-component needs some extra work, so it gets a different operation. RandomContrast makes sense for images but not for vector fields, so those get a None operation. Zooming is similar: vector fields need a little extra work, so they get a different operation. But we need to make sure that the same zoom factor is applied, so I made sure the random state is the same before each operation, such that the same value is drawn for the same random parameter each time.
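One way to get the same random parameter for every image class is to snapshot the generator state and restore it before each operation. The sketch below shows the idea with Python's `random` module. The `zoom_factor` helper is hypothetical and only draws the factor; it does not resize anything, and the fork may do this differently.

```python
import random


def zoom_factor(min_factor, max_factor):
    """Draw a zoom factor from the module-level generator (illustrative helper)."""
    return random.uniform(min_factor, max_factor)


def draw_for_each_class(n_classes, min_factor, max_factor):
    """Return one factor per image class, all drawn from the same random state."""
    state = random.getstate()      # snapshot before the first operation
    factors = []
    for _ in range(n_classes):
        random.setstate(state)     # rewind so every class sees the same draw
        factors.append(zoom_factor(min_factor, max_factor))
    return factors


# Four image classes: image1, image2, vector_x, vector_y.
factors = draw_for_each_class(4, 0.5, 1.5)
```

All four entries in `factors` are identical, and the generator ends up advanced by exactly one draw, as if a single operation had run.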

You can check out the code in my fork (https://github.com/mlterpstra-umc/Augmentor/blob/master/Augmentor/Pipeline.py). It's not pretty yet with some copy-and-pasting but it currently does the job.

Let me know what you think of this and whether you also want this in Augmentor.

mdbloice commented 5 years ago

@mlterpstra-umc OK, what you've done sounds pretty interesting. I will take a look at the code and maybe we can think about merging it in. I wonder whether there would be much interest in such a feature, or do you think what you are doing is very niche? M.

maartenterpstra commented 5 years ago

Hey @mdbloice, thanks for your interest!

I think the way I used it here is pretty niche, but I'm currently also using it for image-to-image tasks rather than image classification and I think it can also be useful for that case.

For example, suppose you want to do image restoration. You'll still want to do flips and zooms to both input and target image, but apply your noise or corruption only to the input.
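A rough sketch of that restoration case, using hypothetical helper names and plain lists instead of images: the geometric step is decided once and shared by input and target, while the noise touches only the input.

```python
import random


def augment_pair(clean, rng, flip_probability=0.5, noise_scale=0.1):
    """Build an (input, target) pair from one clean 1D signal.

    The flip is shared by both; Gaussian noise is added to the input only.
    """
    target = list(clean)
    if rng.random() < flip_probability:
        target = target[::-1]  # decided once, so input and target share the flip
    # The input is derived from the (possibly flipped) target, then corrupted.
    noisy = [v + rng.gauss(0.0, noise_scale) for v in target]
    return noisy, target


rng = random.Random(42)
noisy, target = augment_pair([0.0, 0.5, 1.0], rng, flip_probability=1.0)
```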

It might also be useful in the scenario you have sketched in the Readme.md, where you do image segmentation with a mask. You might want to apply extra augmentation to the images that you don't want to apply to the mask (to make the model more robust, for example), whereas other operations must be applied to both.