facebookresearch / unnas

Code for "Are Labels Necessary for Neural Architecture Search?"
MIT License

Why is the ImageNet result of DARTS in this paper much better than in the original paper? #3

Closed by PencilAndBike 3 years ago

PencilAndBike commented 4 years ago

The ImageNet top-1 accuracy of DARTS is 73.3% in the original paper, but 76.3% in this paper. Some papers report that DARTS can be hard to reproduce even from the original code, because it is prone to collapsing to skip-connections. So are there any improvements or bug fixes to DARTS here?

chenxi116 commented 4 years ago

There are two parts to the improvement.

One is improvement in the evaluation phase, i.e. when training a found architecture, we adopt the hyperparameters used by P-DARTS/PC-DARTS rather than those of the original DARTS paper, the main differences being a larger batch size and longer training. This alone improved the accuracy from 73.3% to 74.9%, as we reported in Table 1.
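For concreteness, the kind of evaluation-phase gap being described would look roughly like the sketch below. All values are illustrative placeholders rather than numbers taken from this repo or the cited papers; the repo's config files are the authoritative source.

```python
# Hedged sketch of the two evaluation-phase (retrain-the-found-architecture)
# recipes being compared. Every number below is an illustrative placeholder,
# NOT a value read from this repository or the cited papers.

original_darts_style = {
    "batch_size": 128,    # smaller batch
    "epochs": 250,        # shorter schedule
    "base_lr": 0.1,
}

pdarts_pcdarts_style = {
    "batch_size": 1024,   # larger batch, as mentioned above
    "epochs": 300,        # longer training
    "base_lr": 0.5,       # scaled up with the batch size
}
```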

The rest of the improvement should come from the search phase. I would say the main differences are: (a) changing the search dataset from CIFAR-10 to ImageNet, and (b) changing the search task from classification to some self-supervised objective. Note that we did quite minimal hyperparameter tuning given these two presumably big changes.
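To make (b) concrete, a self-supervised objective can be plugged into the search by replacing the supervised cross-entropy in the bi-level DARTS update with a pretext-task loss. The sketch below uses rotation prediction (one of the pretext tasks studied in the paper) and assumes a hypothetical `network` that outputs 4-way rotation logits, plus separate optimizers for its operation weights and architecture parameters, as in first-order DARTS; it illustrates the idea and is not the repo's actual training loop.

```python
import torch
import torch.nn.functional as F

def rotation_batch(images):
    """Build a 4-way rotation-prediction batch from unlabeled images.

    Each image is rotated by 0/90/180/270 degrees; the rotation index becomes
    the (self-supervised) label, so no human annotations are needed.
    """
    rotations = [torch.rot90(images, k, dims=(2, 3)) for k in range(4)]
    x = torch.cat(rotations, dim=0)
    y = torch.arange(4).repeat_interleave(images.size(0))
    return x, y

def search_step(network, w_optimizer, a_optimizer, train_images, val_images):
    """One hypothetical DARTS-style search step driven by a rotation pretext task.

    `network`, `w_optimizer` (operation weights), and `a_optimizer`
    (architecture parameters) are assumptions for this sketch, not names
    from the repo.
    """
    # Update architecture parameters on held-out unlabeled images.
    x_val, y_val = rotation_batch(val_images)
    a_optimizer.zero_grad()
    F.cross_entropy(network(x_val), y_val).backward()
    a_optimizer.step()

    # Update operation weights on training images with the same pretext loss.
    x_tr, y_tr = rotation_batch(train_images)
    w_optimizer.zero_grad()
    F.cross_entropy(network(x_tr), y_tr).backward()
    w_optimizer.step()
```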

PencilAndBike commented 4 years ago

Thanks for your reply. I think the evaluation phase should have less influence than the search phase. But difference (b) may be irrelevant to the 76.3% result, since that number comes from a classification search task rather than a self-supervised objective, right?

chenxi116 commented 4 years ago

Sure. I always remembered my results as being in the same ballpark, so I didn't instantly recognize that the specific 76.3% you are mentioning is a supervised classification result.