keras-team / keras-applications

Reference implementations of popular deep learning models.

Data augmentation with noise resulted in unreasonable performance improvement on ResNet50 #134

Open szlge opened 4 years ago

szlge commented 4 years ago

I trained a ResNet50 on the GTSRB data set (http://benchmark.ini.rub.de/?section=gtsrb&subsection=news). I then "augmented" the same training set with randomly generated images of the appropriate dimensions and repeated the training procedure. The "noise-augmented" run showed a +20% accuracy improvement on the same validation data set. This is clearly not reasonable: adding noise images should, if anything, make accuracy worse. The effect holds regardless of the number of epochs, and I have also observed it on other data sets, such as CIFAR-10.

[Image: case01]

[Image: case02]
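
For readers trying to reproduce the report, the noise-augmentation step described above might look roughly like the sketch below. This is not the original poster's code. The function name `augment_with_noise`, the uniform pixel noise, and especially the uniformly random labels for the noise images are assumptions; the thread does not say how the noise was generated or labelled.

```python
import numpy as np

def augment_with_noise(x_train, y_train, n_noise, num_classes, seed=0):
    """Append n_noise uniformly random images to a training set.

    Assumptions (not stated in the thread): pixels are uint8 values drawn
    uniformly from [0, 255], and each noise image gets a uniformly random
    integer class label.
    """
    rng = np.random.default_rng(seed)
    # Noise images share the spatial shape and channel count of the real images.
    noise_x = rng.integers(0, 256, size=(n_noise,) + x_train.shape[1:], dtype=np.uint8)
    noise_y = rng.integers(0, num_classes, size=n_noise)
    x_aug = np.concatenate([x_train, noise_x.astype(x_train.dtype)], axis=0)
    y_aug = np.concatenate([y_train, noise_y.astype(y_train.dtype)], axis=0)
    # Shuffle so real and noise samples are interleaved.
    perm = rng.permutation(len(x_aug))
    return x_aug[perm], y_aug[perm]

# Toy stand-in for real data: 10 all-black 32x32 RGB images, 43 classes as in GTSRB.
x = np.zeros((10, 32, 32, 3), dtype=np.uint8)
y = np.arange(10) % 43
x_aug, y_aug = augment_with_noise(x, y, n_noise=5, num_classes=43)
```

When comparing the two runs, it is worth checking that preprocessing, the validation split, and the number of gradient steps per epoch are identical, since the augmented set is larger and each epoch therefore performs more updates.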

ketyi commented 4 years ago

Gergo, could you try evaluating the model on the noise_array training data alone? How does that go?

szlge commented 4 years ago

In that case I end up with an accuracy close to 0, as expected.

ketyi commented 4 years ago

At least that's good.
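
As a rough illustration of the noise-only check discussed above, the sketch below computes top-1 accuracy from an array of class scores and compares it with chance level. The helper `subset_accuracy` and the random scores standing in for the model's predictions on `noise_array` are hypothetical. It also assumes the noise images carry uniformly random labels, which the thread does not confirm. Under that assumption, a classifier over the 43 GTSRB classes should score around 1/43, or about 2.3%, on noise, which is consistent with "close to 0".

```python
import numpy as np

def subset_accuracy(scores, y_true):
    """Top-1 accuracy from an (n, num_classes) score array and integer labels."""
    return float(np.mean(np.argmax(scores, axis=1) == y_true))

num_classes = 43  # GTSRB has 43 traffic-sign classes
rng = np.random.default_rng(0)

# Hypothetical stand-ins: random scores in place of model predictions on the
# noise images, and random labels in place of the labels given to the noise.
fake_scores = rng.random((10000, num_classes))
fake_labels = rng.integers(0, num_classes, size=10000)

noise_acc = subset_accuracy(fake_scores, fake_labels)
chance = 1.0 / num_classes
print(f"noise-only accuracy: {noise_acc:.4f}, chance level: {chance:.4f}")
```

Computing the same metric separately on the real and the noise portions of the training set shows which portion the model is actually fitting.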