aleju / imgaug

Image augmentation for machine learning experiments.
http://imgaug.readthedocs.io
MIT License

How to find out used augmenters and parameters #275

Open ajuric opened 5 years ago

ajuric commented 5 years ago

When I do the augmentation, I first create a Sequential object, define some augmenters inside it, and then augment the images. The Sequential object could look like this:

    import imgaug.augmenters as iaa

    seq = iaa.Sequential([
        iaa.SomeOf((0, 2), [
            iaa.Dropout((0.00, 0.01)),
            iaa.Affine(rotate=(-10, 10)),
            iaa.Fliplr(p=1),
        ], random_order=True),
        iaa.ChannelShuffle(p=0.1)
    ], random_order=False)

So at most 2 of the 3 augmenters Dropout, Affine and Fliplr are chosen for each image, and then ChannelShuffle is applied with probability 0.1.

I would like to know the following:

1) Is it possible to find out which of those 3 augmenters was chosen for each image augmentation (when calling the augment_image method)?
2) Is it possible to find out which parameter value was selected for a specific augmenter (when calling the augment_image method)? E.g. Dropout was defined with a parameter that is sampled uniformly from [0, 0.01], so before Dropout is applied this parameter takes some specific value from [0, 0.01].

Short question: Is it possible to find out used augmenters and their parameters when calling augment_image method?

Motivation for this is to find the parameters which create disturbed images when augmenting (images which are "corrupted" due to too much augmentation).

aleju commented 5 years ago

This is a common request, but there is no standard way yet to achieve this. You might be able to hack something using hooks, but this would only get you information about which augmenters were executed and which images were changed in any way.

For the specific case of SomeOf, it has a method _get_augmenter_active(nb_rows, random_state), which you could call with the number of images and a copy of the augmenter's random state. It gives you a binary N_images x N_augmenters matrix, where each value denotes whether the specific augmenter is active for the image (1) or not (0). It should match the matrix that the augmenter will sample.

For dropout it's harder, as it is a parameter within a parameter - a Uniform wrapped by a Bernoulli. The only way I see would be to define a stochastic parameter that feeds through inputs and outputs to/from an underlying parameter and simultaneously logs the received outputs to an internal variable, so that they could be accessed and read out later on. Then you could do something like Dropout(p=Bernoulli(LogResultsOf(Uniform(a, b)))).
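
The pass-through-and-log parameter described above could look roughly like the sketch below. It does not use imgaug itself: UniformStandIn is a hypothetical stand-in class, and only the method name draw_samples(size, random_state) is borrowed from imgaug's StochasticParameter. A real version would presumably need to subclass StochasticParameter so that augmenters accept it, which is not shown here.

```python
import numpy as np


class UniformStandIn:
    """Hypothetical stand-in for a uniform parameter (NOT imgaug's Uniform)."""

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def draw_samples(self, size, random_state=None):
        # Assumption: random_state is a numpy Generator; create one if missing.
        rng = random_state if random_state is not None else np.random.default_rng()
        return rng.uniform(self.a, self.b, size=size)


class LogResultsOf:
    """Forward draws to a wrapped parameter and remember every result."""

    def __init__(self, other_param):
        self.other_param = other_param
        self.log = []  # one entry per draw_samples() call

    def draw_samples(self, size, random_state=None):
        samples = self.other_param.draw_samples(size, random_state=random_state)
        self.log.append(np.copy(samples))  # copy so later edits cannot alter the log
        return samples


logged = LogResultsOf(UniformStandIn(0.0, 0.01))
rng = np.random.default_rng(0)
first = logged.draw_samples((3,), random_state=rng)
second = logged.draw_samples((2,), random_state=rng)
# logged.log now holds both sets of sampled values, in call order.
```

Keeping the log on the wrapper rather than on the augmenter means the sampled values can be read out after augmentation without touching the augmenter's own code.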

CMCDragonkai commented 5 years ago

This would be really useful for figuring out why this is happening: https://github.com/aleju/imgaug/issues/109

AlexanderKazakov commented 5 years ago

A dirty but fast and useful solution is to add something like print('samples drawn: ' + str(samples)) to the end of StochasticParameter.draw_samples and print('augmenter: ' + self.name) to the beginning of Augmenter.augment_images.

This logs all the choices and parameters.
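
If editing the installed library is not an option, the same kind of logging can be attached from outside by wrapping the method at runtime. The sketch below shows only the wrapping pattern on a hypothetical DemoParameter class; log_method_results is an illustrative helper, not part of imgaug, and whether patching imgaug's classes this way captures every draw has not been verified here.

```python
import functools


def log_method_results(cls, method_name, records):
    """Replace cls.method_name with a wrapper that appends each result to records."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        records.append((type(self).__name__, result))
        return result

    setattr(cls, method_name, wrapper)
    return original  # returned so the patch can be undone later


class DemoParameter:
    """Hypothetical stand-in for a parameter class; not part of imgaug."""

    def draw_samples(self, size):
        return [0.5] * size


records = []
original = log_method_results(DemoParameter, "draw_samples", records)
DemoParameter().draw_samples(2)                    # logged
setattr(DemoParameter, "draw_samples", original)   # undo the patch
DemoParameter().draw_samples(1)                    # no longer logged
```

Collecting results in a list instead of printing them makes it possible to match the sampled values to specific images afterwards.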

cortical-iv commented 5 years ago

I would just like to know whether no augmenter at all was applied in my sequence, where each augmenter has a probability of 0.5. In that case I simply won't use the non-augmented data.

Update: this is a hack but I just started checking directly:

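    import numpy as np

    # images: the original images; seq: the augmenter sequence
    for image in images:
        image_aug = seq.augment_image(image)
        if np.array_equal(image_aug, image):
            print("No changes to original.")
            continue  # do not use image_aug
        # do fun stuff, use the augmented image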

It is surprisingly efficient because NumPy does the comparison.
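
The same check can be run over a whole batch at once. The following is a minimal sketch on toy arrays; changed_mask is a hypothetical helper name, and the zero-filled arrays stand in for real images and their augmented versions.

```python
import numpy as np


def changed_mask(images, images_aug):
    """Return a list of bools: True where the augmented image differs from the original."""
    return [not np.array_equal(orig, aug) for orig, aug in zip(images, images_aug)]


# Toy data standing in for real images and their augmented versions.
images = [np.zeros((4, 4, 3), dtype=np.uint8) for _ in range(3)]
images_aug = [img.copy() for img in images]
images_aug[1][0, 0, 0] = 255  # pretend only the second image was altered

mask = changed_mask(images, images_aug)
kept = [aug for aug, changed in zip(images_aug, mask) if changed]
```

Note that this detects images whose pixels did not change, which is not quite the same as "no augmenter was applied": a flip of a symmetric image or a rotation by 0 degrees also leaves the array identical.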

But frankly this is a tricky issue and brings up interesting questions about the sample space someone should study.

TyrionChou commented 4 years ago

Hi, does anyone have a sample of how to use _get_augmenter_active(nb_rows, random_state)?