VICO-UoE / DatasetCondensation

Dataset Condensation (ICLR21 and ICML21)

Are there additional data augmentations when training on synthetic data #5

Closed taoyang1122 closed 3 years ago

taoyang1122 commented 3 years ago

Dear authors, I have a quick question about training networks on the synthetic data. Did you apply data augmentations (such as flip, random crop, etc.) to the synthetic images, or did you train on the original synthetic images only (say, at 10 images per class, the same 100 images in every iteration)? Thanks for your help.

PatrickZH commented 3 years ago

Yes, we use data augmentation when training models on synthetic data. Please refer to our papers (including the supplementary material) and the code for details.
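For context, here is a minimal PyTorch sketch of what training on a fixed synthetic set with on-the-fly augmentation can look like. It is illustrative only: the transform choices, hyperparameters, and function name are assumptions, not the repository's actual evaluation code.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import torchvision.transforms as T

# Hypothetical sketch (not the repository's evaluate_synset): train a network
# on a fixed synthetic set, e.g. 10 images/class on CIFAR10 = 100 images total,
# with standard on-the-fly augmentation applied to each mini-batch.
augment = T.Compose([
    T.RandomCrop(32, padding=4),   # random crop with zero-padding
    T.RandomHorizontalFlip(),      # random left-right flip
])

def train_on_synthetic(net, image_syn, label_syn, epochs=300, device="cpu"):
    loader = DataLoader(TensorDataset(image_syn, label_syn),
                        batch_size=64, shuffle=True)
    opt = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    net.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            # fresh random views every epoch; in this simplified sketch one
            # random transform is sampled per mini-batch, not per image
            x, y = augment(x.to(device)), y.to(device)
            opt.zero_grad()
            loss_fn(net(x), y).backward()
            opt.step()
```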

taoyang1122 commented 3 years ago

Hi, I checked the code, but it seems that you didn't apply data augmentation when training networks on the synthetic data. I am running `python main.py --dataset CIFAR10 --model ConvNet --ipc 10`. In `evaluate_synset()`, `args.dc_aug_param['strategy']` is None, so it appears no data augmentation is applied.
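To make the observation concrete, the behavior I am describing is a gate like the following. This is a hypothetical illustration; `maybe_augment` and `apply_dc_augment` are made-up names, not the repository's real functions.

```python
import torch

def apply_dc_augment(images, p):
    """Illustrative stand-in for a noise-style augmentation (not the repo's code)."""
    if 'noise' in p.get('strategy', ''):
        images = images + p.get('noise', 0.001) * torch.randn_like(images)
    return images

def maybe_augment(images, dc_aug_param):
    # Mirrors what I observe above: when 'strategy' is None (or 'none'),
    # training sees the raw synthetic images with no augmentation.
    if not dc_aug_param or dc_aug_param.get('strategy') in (None, 'none'):
        return images
    return apply_dc_augment(images, dc_aug_param)
```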

PatrickZH commented 3 years ago

In Dataset Condensation with Gradient Matching (ICLR 2021), we use data augmentation on the MNIST dataset in all experiments when training models. For the other datasets, we don't use data augmentation. The only exception is that we also used data augmentation on CIFAR10 when comparing with Dataset Distillation (2018). Please refer to Appendix A (Implementation Details, Dataset Condensation, 3rd paragraph). In Dataset Condensation with Differentiable Siamese Augmentation (ICML 2021), we use data augmentation in all experiments.
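A note on the ICML 2021 method: the key idea of Differentiable Siamese Augmentation is to sample one random transformation per matching step and apply it identically to the real and synthetic batches, implemented differentiably so gradients flow back to the synthetic pixels. A minimal sketch of that shared-randomness idea, using only a horizontal flip (the actual method also covers crop, scale, rotation, color, and cutout):

```python
import torch

def siamese_flip(img_real, img_syn):
    """Minimal sketch of the Siamese augmentation idea: sample ONE random
    flip decision and apply it to both the real and the synthetic batch,
    so their gradients are matched under the same augmented view."""
    if torch.rand(1).item() < 0.5:
        # torch.flip is differentiable, so gradients reach the synthetic pixels
        return torch.flip(img_real, dims=[3]), torch.flip(img_syn, dims=[3])
    return img_real, img_syn
```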

taoyang1122 commented 3 years ago

Thanks very much for your clarification!