Be able to use more samples for data augmentation

david-vazquez / mcv-m5

Master in Computer Vision - M5 Visual recognition


Be able to use more samples for data augmentation #14

Closed ArantxaCasanova closed 7 years ago

ArantxaCasanova commented 7 years ago

I have noticed that, even when we use data augmentation, the number of training samples is still the same as the number of training samples in the database.

In order to change that, I propose to add these changes to the configuration files:

data_augmentation = False  # Whether data augmentation is used
data_augmentation_train_samples = 30000  # Number of samples per epoch with data augmentation

These options let us choose whether to use data augmentation and how many samples to use per epoch.

IMPORTANT: we would have to change all configuration files to include these lines. Otherwise, execution may fail with errors.

bluque commented 7 years ago

This possible error can be avoided by wrapping the two lines added in configuration.py in a "try:" block. If the variable data_augmentation_train_samples does not exist, everything keeps working as it did until now, without raising an exception.
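
A minimal sketch of that fallback, not the actual code in configuration.py. It assumes the configuration is available as an object `cf` with attribute access; the helper name `train_samples_per_epoch` and the `SimpleNamespace` stand-ins are hypothetical and only serve the illustration.

```python
from types import SimpleNamespace


def train_samples_per_epoch(cf, n_dataset_samples):
    """Return how many training samples to draw per epoch.

    `cf` is assumed to be a config object with attribute access
    (hypothetical stand-in for the loaded configuration file).
    """
    try:
        # Options proposed in this issue; older configuration
        # files do not define them.
        if cf.data_augmentation:
            return cf.data_augmentation_train_samples
    except AttributeError:
        # Missing option: keep the previous behaviour.
        pass
    return n_dataset_samples


# Hypothetical configs: one without the new options, one with them.
old_cf = SimpleNamespace()
new_cf = SimpleNamespace(data_augmentation=True,
                         data_augmentation_train_samples=30000)

print(train_samples_per_epoch(old_cf, 1000))
print(train_samples_per_epoch(new_cf, 1000))
```

With this shape, configuration files that were written before the change keep using the dataset size, so they would not need to be edited.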

david-vazquez commented 7 years ago

Hi,

I think that this change is not needed.

When you enable data augmentation, you do not need to generate more samples than there are in the original training set. In each epoch, the data augmentation applies a different transformation to each sample.
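
A toy illustration of this point, assuming a random horizontal flip as the only transformation and tiny nested lists as "images". The names `random_flip` and `run_epoch` are hypothetical and are not part of the project code.

```python
import random


def random_flip(image, rng):
    """Horizontally flip a 2-D list image with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image


def run_epoch(dataset, rng):
    """Return exactly one (possibly transformed) variant per sample."""
    return [random_flip(image, rng) for image in dataset]


# Three tiny one-row "images" (made-up data for the illustration).
dataset = [[[1, 2, 3]], [[4, 5, 6]], [[7, 8, 9]]]
rng = random.Random(0)

# Every epoch has len(dataset) samples, but the transformation is
# drawn again for each sample in each epoch.
epochs = [run_epoch(dataset, rng) for _ in range(5)]
for number, epoch in enumerate(epochs, start=1):
    print(number, len(epoch), epoch)
```

The number of samples per epoch never changes here; only the variants do, so over several epochs the network is shown more distinct inputs than the dataset contains.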

Why would you like to have more samples per epoch?

ArantxaCasanova commented 7 years ago

I thought of it as a way to increase the number of training images, following the same procedure we used in M3, and not just to add more variations to the input data.

xianlopez commented 7 years ago

Arantxa, I think that if you just increase the number of epochs, you get the same result.
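
The arithmetic behind this can be sketched as follows. The dataset size of 10000 and the 10 epochs are made-up numbers; only the 30000 samples per epoch comes from the proposal above, and `equivalent_epochs` is a hypothetical helper.

```python
def equivalent_epochs(n_dataset_samples, samples_per_epoch, n_epochs):
    """Epochs over the plain dataset that show at least as many
    augmented samples as `n_epochs` epochs of `samples_per_epoch`."""
    total_samples = samples_per_epoch * n_epochs
    # Ceiling division with integers only.
    return -(-total_samples // n_dataset_samples)


# Hypothetical dataset of 10000 images: 10 epochs of 30000 augmented
# samples show as many samples as 30 epochs over the dataset.
print(equivalent_epochs(10000, 30000, 10))  # prints 30
```

One caveat, stated as an assumption about the training loop: anything tied to epoch boundaries, such as validation or checkpointing, would run more often with the larger epoch count.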

david-vazquez commented 7 years ago

I agree with Xian.