richardaecn / class-balanced-loss

Class-Balanced Loss Based on Effective Number of Samples. CVPR 2019
MIT License

About data augmentation #4

Closed guantinglin closed 4 years ago

guantinglin commented 5 years ago

Hi, may I ask a question about the description in the paper?

As you mentioned in the paper, in the last paragraph of Section 3.1:

"the stronger the data augmentation is, the smaller the N will be. The small neighboring region of a sample is a way to capture all near-duplicates and instances that can be obtained by data augmentation"

Shouldn't a stronger data augmentation technique provide more samples of the same class (S), making their volume (N) larger?

richardaecn commented 4 years ago

Hi @guantinglin, "Presumably, the stronger the data augmentation is, the smaller the N will be" is a hypothesis that has not been verified by experiments. The intuition is that the stronger the data augmentation, the more near-duplicate samples can be generated from each example, so a larger portion of the data space is covered by near-duplicates and the total number of unique samples N becomes smaller.
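To see how N feeds into the loss, here is a minimal sketch of the paper's effective-number formula, E_n = (1 - β^n) / (1 - β) with β = (N - 1) / N. The helper name `effective_number` and the example values are mine, not from the repo's code:

```python
import numpy as np

def effective_number(n, N):
    """Effective number of samples E_n = (1 - beta^n) / (1 - beta),
    where beta = (N - 1) / N as defined in the paper."""
    beta = (N - 1) / N
    return (1.0 - beta ** n) / (1.0 - beta)

# Same number of raw samples n, but different volumes N of unique samples.
# A smaller N (hypothetically, the result of stronger augmentation) caps the
# effective number at a lower value, since E_n approaches N as n grows.
n = 1000
for N in [10, 100, 1000, 100000]:
    print(f"N = {N:>6}: E_n = {effective_number(n, N):.1f}")
```

So under this hypothesis, stronger augmentation (smaller N) means the same n raw samples contribute a smaller effective number, which in turn gives that class a larger class-balanced weight (1 - β) / (1 - β^n).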