msegala / Kaggle-National_Data_Science_Bowl

About real-time data augmentation #1

Open pengpaiSH opened 9 years ago

pengpaiSH commented 9 years ago

@msegala Hi msegala (or Mike @kaggle), first of all, thank you for sharing your code. I have been struggling with data augmentation for days. I noticed that the augmentation part of your code is also forked from https://github.com/benanne/kaggle-galaxies/blob/master/realtime_augmentation.py. However, I am still having trouble following your implementation. Could you please explain it in more detail? Thank you.

msegala commented 9 years ago

Sure, I will do this today or tomorrow, but to quickly address this, here is a bit of info. All you really need to worry about is the function `random_perturbation_transform`. At first I was doing random augmentations of shift, rotation, shear, and zoom, but I then opted to use only a small set of rotations plus flipping. See the inline comments for details.

    def random_perturbation_transform(zoom_range, rotation_range, shear_range, translation_range, do_flip=True):
        # random shift [-10, 10] - shift no longer needs to be integer!
        shift_x = np.random.uniform(*translation_range)
        shift_y = np.random.uniform(*translation_range)
        translation = (shift_x, shift_y)

        # random rotation [0, 360]
        rotation = np.random.uniform(*rotation_range) # there is no post-augmentation, so full rotations here!
        # random shear [0, 20]
        shear = np.random.uniform(*shear_range)

        # random zoom [0.9, 1.1]
        # zoom = np.random.uniform(*zoom_range)
        log_zoom_range = [np.log(z) for z in zoom_range]
        zoom = np.exp(np.random.uniform(*log_zoom_range)) # for a zoom factor this sampling approach makes more sense.
        # the range should be multiplicatively symmetric, so [1/1.1, 1.1] instead of [0.9, 1.1] makes more sense.

        #### RESET AUGMENTATION AND NO LONGER DO RANDOM AUGMENTATION BUT RATHER A SMALLER SET
        translation = (0,0)
        rotation = 0.0
        shear = 0.0
        zoom = 1.0

        #### ONLY PERFORM 6 POSSIBLE ROTATIONS
        # pick one of the six angles (in degrees) uniformly at random
        rotation = [0.0, 45.0, 90.0, 135.0, 180.0, 270.0][np.random.randint(6)]

        ## PERFORM FLIPPING OF THE IMAGE
        if do_flip and (np.random.randint(2) > 0): # flip half of the time
            shear += 180
            rotation += 180
            # shear by 180 degrees is equivalent to rotation by 180 degrees + flip.
            # So after that we rotate it another 180 degrees to get just the flip.            

        # debug output (uncomment to inspect the parameters actually used):
        # print("translation =", translation)
        # print("rotation =", rotation)
        # print("shear =", shear)
        # print("zoom =", zoom)

        return build_augmentation_transform(zoom, rotation, shear, translation)
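
The flip comment in the code above can be checked numerically. This is only a sketch under stated assumptions: `affine_matrix` is a hypothetical helper (not part of the repository), and it assumes the linear part of the transform is parameterized as `[[s*cos(r), -s*sin(r + shear)], [s*sin(r), s*cos(r + shear)]]` with `s = 1/zoom`, which appears to match the `skimage.transform.AffineTransform` convention that `build_augmentation_transform` in benanne's script relies on (newer scikit-image releases may define shear differently). Translation and recentring are left out.

```python
import numpy as np

def affine_matrix(rotation_deg=0.0, shear_deg=0.0, zoom=1.0):
    """2x2 linear part of the affine transform (hypothetical helper).

    Assumed parameterization:
        [[s*cos(r), -s*sin(r + sh)],
         [s*sin(r),  s*cos(r + sh)]]   with s = 1/zoom
    """
    r = np.deg2rad(rotation_deg)
    sh = np.deg2rad(shear_deg)
    s = 1.0 / zoom
    return np.array([
        [s * np.cos(r), -s * np.sin(r + sh)],
        [s * np.sin(r),  s * np.cos(r + sh)],
    ])

mirror_x = np.diag([-1.0, 1.0])            # pure flip of the x axis
rot180 = affine_matrix(rotation_deg=180.0)  # pure 180-degree rotation

# A shear of 180 degrees alone equals a 180-degree rotation combined with a flip.
shear_only = affine_matrix(rotation_deg=0.0, shear_deg=180.0)

# Adding another 180 degrees of rotation leaves just the flip.
flip_only = affine_matrix(rotation_deg=180.0, shear_deg=180.0)

# The same holds for any base angle, e.g. the 45-degree case from the function.
flipped_45 = affine_matrix(rotation_deg=45.0 + 180.0, shear_deg=180.0)
rot45 = affine_matrix(rotation_deg=45.0)
```

Under this assumption, `shear_only` matches `rot180 @ mirror_x`, `flip_only` matches `mirror_x`, and `flipped_45` matches `rot45 @ mirror_x`, so `shear += 180; rotation += 180` adds a mirror to whichever rotation was picked.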

pengpaiSH commented 9 years ago

Then I will wait for your details. My main question is: once I have loaded the training dataset X_train (which should be a numpy.array), how can I get an augmented dataset X_augment using your functions? It's late here. Good night, see you tomorrow!

pengpaiSH commented 9 years ago

@msegala I am still waiting for your further explanation :)

msegala commented 9 years ago

I don't understand your concern. The class clearly returns Xb and yb, which are both numpy arrays. Please elaborate on your concerns, as I have already commented the main function of the script in the post above.
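
For anyone with the same question about going from `X_train` to an augmented array, here is a minimal sketch of the idea. It is not the repository's implementation: `augment_batch` is a hypothetical function, it assumes square images stored as a `(n, height, width)` array, and it swaps in plain NumPy operations (`np.rot90`, `np.fliplr`), so it only covers rotations by multiples of 90 degrees plus flips. The 45 and 135 degree rotations used in the function above need an interpolating warp and are left out here.

```python
import numpy as np

def augment_batch(Xb, yb, rng=None):
    """Return a randomly rotated/flipped copy of a batch (hypothetical helper).

    Xb: array of shape (n, height, width) with square images (assumption).
    yb: labels, returned unchanged because rotations and flips preserve them.
    """
    rng = np.random.default_rng() if rng is None else rng
    Xb_aug = np.empty_like(Xb)
    for i, img in enumerate(Xb):
        k = rng.integers(4)        # number of 90-degree turns: 0, 1, 2, or 3
        out = np.rot90(img, k)
        if rng.integers(2) > 0:    # flip half of the time
            out = np.fliplr(out)
        Xb_aug[i] = out
    return Xb_aug, yb

# Usage with toy data standing in for the real training set.
X_train = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
y_train = np.array([0, 1])
X_augment, y_augment = augment_batch(X_train, y_train, rng=np.random.default_rng(0))
```

Calling this once per mini-batch during training, rather than once up front, is what makes the augmentation "real-time": every epoch sees a different random variant of each image.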