clovaai / CutMix-PyTorch

Official PyTorch implementation of CutMix regularizer
MIT License

cutmix as finetune technique #6

Closed qianyizhang closed 5 years ago

qianyizhang commented 5 years ago

First of all, very well-documented experiments! The learning curve in Fig. 2 shows that the model trained with CutMix outperforms the baseline only after two LR adjustments. I wonder if this holds across your other experiments.

It does make sense, since CutMix is a regularizer that prevents the model from memorizing/overfitting to the training data. But WHEN it takes effect is also an interesting question to ask.

If it's only effective in the very late stage of training, one could simply use it as a fine-tuning technique and save the time/resources of retraining from scratch.
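
To make the proposal concrete, here is a minimal sketch of gating CutMix on the epoch, so that it is applied only from a chosen point in training onward. It is an illustration under stated assumptions, not the repository's code: `cutmix_plan`, `start_epoch`, and the default values of `beta` and `prob` are hypothetical names and settings, and the sketch uses only the Python standard library so that the box sampling and label-weight logic can be checked without a GPU.

```python
import math
import random


def rand_bbox(width, height, lam, rng=random):
    """Sample a box that covers roughly (1 - lam) of the image area."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_w = int(width * cut_ratio)
    cut_h = int(height * cut_ratio)
    # Pick the box centre uniformly, then clip the box to the image.
    cx = rng.randrange(width)
    cy = rng.randrange(height)
    x1 = max(cx - cut_w // 2, 0)
    y1 = max(cy - cut_h // 2, 0)
    x2 = min(cx + cut_w // 2, width)
    y2 = min(cy + cut_h // 2, height)
    return x1, y1, x2, y2


def cutmix_plan(epoch, start_epoch, width, height,
                beta=1.0, prob=0.5, rng=random):
    """Return (box, lam) for one batch, or (None, 1.0) when CutMix is off.

    `start_epoch` is a hypothetical knob: before it, training runs
    without CutMix; from it onward, CutMix is applied with probability `prob`.
    """
    if epoch < start_epoch or rng.random() >= prob:
        return None, 1.0
    lam = rng.betavariate(beta, beta)
    x1, y1, x2, y2 = rand_bbox(width, height, lam, rng)
    # Recompute lam from the clipped box so it matches the true pixel ratio.
    lam = 1.0 - (x2 - x1) * (y2 - y1) / (width * height)
    return (x1, y1, x2, y2), lam
```

In a training loop, a non-`None` box would be filled with the same region from a shuffled copy of the batch, and the loss would be `lam * loss(output, target_a) + (1 - lam) * loss(output, target_b)`. Setting `start_epoch` to the epoch at which fine-tuning begins would be one way to run the experiment suggested here.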

hellbell commented 5 years ago

@qianyizhang Thank you for your interesting question! However, we don't think CutMix (or other regularization methods) is only effective in the very late stage of training. Our conjecture is that CutMix prevents the model from converging to bad local minima by making the problem harder, so applying CutMix only to a pretrained network would not work well. But your idea is worth trying. If you get a good result, please let me know :)