VDIGPKU / DADA

[ECCV 2020] DADA: Differentiable Automatic Data Augmentation
MIT License

Two Questions about some details of the paper #4

Open Akamight opened 3 years ago

Akamight commented 3 years ago

Hello! I have some minor questions about certain details in the training of the network itself.

  1. How are the results of the paper acquired?

The paper says: "Following [3, 10, 15], we search the DA policies on the reduced datasets and evaluate on the full datasets. Furthermore, we split half of the reduced datasets as training set, and the remaining half as validation set for the data augmentation search." So what is the workflow for training a neural network with DADA? Do we search on the dataset using train_search_paper, then transfer the policies and use them for training? If yes, where is the method that transfers the searched policies to training? If no, how is the validation data used? It seems that only half of the data is used to train the neural network (train_portion = 0.5).

  2. Is any other sub-policy depth or sub-policy count considered for the search?
  3. Why is ColorJitter used in conjunction with the DADA policy? The sub-policies seem to be able to include it anyway.

latstars commented 3 years ago

The workflow:

  1. Make a subset of the original training dataset.
  2. Split that subset into a training set and a validation set for the search stage.
  3. Search for the data augmentation policy using search_relax/train_search_paper.py.
  4. Save the DA policy found in the previous step to fast-autoaugment/FastAutoAugment/genotype.py.
  5. Train new models on the original training dataset with 'python FastAutoAugment/train.py'.
  6. Evaluate on the original validation set once training is over.
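
Steps 1 and 2 can be sketched roughly as follows. This is only an illustration of the index bookkeeping, not the repository's code: the function name `make_search_split`, the seed, and the sizes 50000 and 4000 are hypothetical; only `train_portion = 0.5` comes from the thread.

```python
import random


def make_search_split(num_samples, reduced_size, train_portion=0.5, seed=0):
    """Pick a reduced subset of indices, then split it for the search stage.

    Hypothetical helper: it only returns index lists; building the actual
    datasets and loaders is left out.
    """
    rng = random.Random(seed)
    # Step 1: sample a reduced subset of the original training set.
    reduced = rng.sample(range(num_samples), reduced_size)
    # Step 2: first part trains the network, the rest validates the policy.
    split = int(train_portion * reduced_size)
    return reduced[:split], reduced[split:]


# Illustrative sizes only (assumed, not taken from the paper or the repo).
train_idx, valid_idx = make_search_split(num_samples=50000, reduced_size=4000)
print(len(train_idx), len(valid_idx))
```

The split applies only to the search stage (steps 2 and 3). The final training in step 5 uses the whole original training set, so `train_portion` does not limit the data seen by the final model.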

Is any other sub policy depth/sub policy count considered for search: To keep a fair comparison with other methods, we only tried the setting where each sub-policy consists of two image operations. More image operations per sub-policy means more sub-policies. The dimension of the data augmentation parameters also grows, which requires more epochs to search.
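
To get a feel for how quickly the search space grows with sub-policy depth, here is a rough count. It rests on assumptions not stated in this thread: a pool of 15 candidate operations, sub-policies formed as unordered combinations of distinct operations, and one selection weight per sub-policy plus a probability and a magnitude per operation. The function name `search_space_size` is hypothetical.

```python
from math import comb


def search_space_size(num_ops, ops_per_subpolicy):
    """Return (number of sub-policies, number of searchable parameters).

    Assumed parameterisation: one weight per sub-policy, plus one
    probability and one magnitude for each operation inside it.
    """
    num_subpolicies = comb(num_ops, ops_per_subpolicy)
    params_per_subpolicy = 1 + 2 * ops_per_subpolicy
    return num_subpolicies, num_subpolicies * params_per_subpolicy


# num_ops=15 is an assumption used for illustration.
for depth in (2, 3):
    print(depth, search_space_size(15, depth))
```

Under these assumptions, going from two to three operations per sub-policy multiplies the number of sub-policies by more than four, which is in line with the remark that deeper sub-policies need more search epochs.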

Why is ColorJitter used in conjunction with the DADA Policy: ColorJitter is only used on the ImageNet dataset, where it is the standard setting when training ResNet-50. Furthermore, the DADA policy only contains the color operation, not ColorJitter. ColorJitter is a random color operation, while the color operation has a fixed magnitude and a certain probability of being selected and applied.
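
The difference between the two can be illustrated with a toy model that looks only at the color (saturation) factor each one would apply to an image. This is a sketch, not the repository's implementation: both function names are hypothetical, a factor of 1.0 stands for "image unchanged", and the jitter range is modeled as uniform on [max(0, 1 - strength), 1 + strength].

```python
import random


def jitter_factor(strength, rng):
    """ColorJitter-style: draw a fresh random factor on every call."""
    return rng.uniform(max(0.0, 1.0 - strength), 1.0 + strength)


def policy_color_factor(probability, magnitude, rng):
    """Policy-style color op: fixed magnitude, applied with a set probability."""
    return magnitude if rng.random() < probability else 1.0


rng = random.Random(0)
# Jitter: a different factor for each image.
print([round(jitter_factor(0.4, rng), 3) for _ in range(5)])
# Policy op: each image gets either the fixed magnitude or no change.
print([policy_color_factor(0.5, 1.6, rng) for _ in range(5)])
```

The jitter spreads its effect over a continuous range, while the policy operation only ever produces the searched magnitude or leaves the image alone, so one does not make the other redundant.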