Closed JiyueWang closed 3 years ago
@JiyueWang
For the best performance, cutmix_prob=0.5
is used for the CIFAR experiments; otherwise we set cutmix_prob=1.0,
including for the ImageNet experiments.
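To make the role of this hyper-parameter concrete, here is a minimal NumPy sketch of CutMix gated by a per-batch `cutmix_prob`. This is illustrative only, not the repository's implementation; the function names (`rand_bbox`, `cutmix_batch`) and the NHWC layout are assumptions for the example.

```python
import numpy as np

def rand_bbox(h, w, lam, rng):
    """Sample a patch whose area is roughly (1 - lam) of the image, as in CutMix."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
    cy, cx = rng.integers(h), rng.integers(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    return y1, y2, x1, x2

def cutmix_batch(images, labels, alpha=1.0, cutmix_prob=0.5, rng=None):
    """Apply CutMix to a batch with probability `cutmix_prob`.

    images: (N, H, W, C) float array; labels: (N,) int array.
    Returns mixed images plus (labels_a, labels_b, lam) for the mixed loss.
    """
    rng = rng or np.random.default_rng()
    if rng.random() >= cutmix_prob:
        # Skip augmentation for this batch: the model sees original data,
        # which is exactly what cutmix_prob < 1.0 buys you.
        return images, labels, labels, 1.0
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(images))
    _, h, w, _ = images.shape
    y1, y2, x1, x2 = rand_bbox(h, w, lam, rng)
    mixed = images.copy()
    mixed[:, y1:y2, x1:x2, :] = images[perm, y1:y2, x1:x2, :]
    # Adjust lam to the exact pasted area, as in the CutMix paper.
    lam = 1.0 - ((y2 - y1) * (x2 - x1)) / (h * w)
    return mixed, labels, labels[perm], lam
```

With cutmix_prob=0.5, on average half the batches are trained on clean data, which is the mix of original and augmented data being discussed here.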
Thanks for your quick reply.
I notice that this hyper-parameter is also applicable to Mixup. As the paper 'WeMix: How to Better Utilize Data Augmentation' showed, mixing original data with augmented data can help training significantly. I think comparing your method, with this extra tuned parameter, against Mixup on CIFAR is not fair. What do you think?
@JiyueWang
We only searched the hyper-parameter alpha
for Mixup at the ICCV submission, but tuning mixup_prob
might also boost Mixup's performance on CIFAR.
(In our defense, CutMix still shows better performance than Mixup without searching prob.)
Indeed, we are preparing the extended journal version of CutMix, so we will rigorously search for the best hyper-parameters alpha
and prob
for both Mixup and CutMix on CIFAR for a fairer comparison.
Thank you :)
Have you ever tried the combination of Mixup and CutMix? It seems that they are somewhat orthogonal.
@JiyueWang
In my experience, a combination of Mixup and CutMix was okay, but I cannot remember the numbers.
Indeed, that kind of combination is called CutMixUp
in another paper (https://arxiv.org/abs/2004.00448), and this repository does a similar thing (https://github.com/rwightman/pytorch-image-models/pull/218).
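One simple way to combine the two, in the spirit of the linked pull request, is to pick one mixing op per batch at random. The following NumPy sketch is my own illustration under assumed names (`cutmix_or_mixup`, `switch_prob`), not the code from either linked source.

```python
import numpy as np

def cutmix_or_mixup(images, labels, alpha=1.0, switch_prob=0.5, rng=None):
    """Per batch, apply either CutMix or Mixup, chosen at random.

    images: (N, H, W, C) float array; labels: (N,) int array.
    `switch_prob` is the chance of taking the CutMix branch.
    Returns mixed images plus (labels_a, labels_b, lam) for the mixed loss.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(images))
    _, h, w, _ = images.shape
    if rng.random() < switch_prob:
        # CutMix branch: paste a rectangular patch from the permuted batch.
        cut = np.sqrt(1.0 - lam)
        ch, cw = int(h * cut), int(w * cut)
        cy, cx = rng.integers(h), rng.integers(w)
        y1, y2 = np.clip(cy - ch // 2, 0, h), np.clip(cy + ch // 2, 0, h)
        x1, x2 = np.clip(cx - cw // 2, 0, w), np.clip(cx + cw // 2, 0, w)
        mixed = images.copy()
        mixed[:, y1:y2, x1:x2, :] = images[perm, y1:y2, x1:x2, :]
        lam = 1.0 - ((y2 - y1) * (x2 - x1)) / (h * w)
    else:
        # Mixup branch: pixel-wise convex combination of the two batches.
        mixed = lam * images + (1.0 - lam) * images[perm]
    return mixed, labels, labels[perm], lam
```

Choosing one op per batch (rather than stacking both on the same images) keeps the label-mixing coefficient lam well-defined for the standard mixed cross-entropy loss.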
It seems that 'cutmix_prob' is an important hyper-parameter, yet you did not mention it in the paper. May I ask what value you adopted to get the results in the paper?