xiaomi-automl / FairNAS

FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search

Why do you add Dropout to your model? #3

Closed · PistonY closed this issue 5 years ago

PistonY commented 5 years ago

Your paper says, 'In order to be consistent with the previous works, we don’t employ any other tricks like dropout [21], cutout [6] or mixup [28], although they can further improve the scores on the test set.' But dropout appears in the model here: https://github.com/fairnas/FairNAS/blob/418b892c17016006f9edc33fea2c50f674d86ff0/models/FairNAS_A.py#L104 Am I misunderstanding something?
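For context, the referenced line applies dropout just before the final classifier. A minimal sketch of that pattern in PyTorch (the class name, layer names, feature width, and the 0.2 rate are my assumptions for illustration, not the repo's actual values):

```python
import torch
import torch.nn as nn

class ClassifierHead(nn.Module):
    """ImageNet-style head: pooled features -> dropout -> linear classifier."""
    def __init__(self, in_features: int = 1280, num_classes: int = 1000, drop_rate: float = 0.2):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.dropout = nn.Dropout(drop_rate)   # active in train(), identity in eval()
        self.classifier = nn.Linear(in_features, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(x).flatten(1)
        x = self.dropout(x)                    # the kind of line in question: dropout before the classifier
        return self.classifier(x)
```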

cxxgtxy commented 5 years ago

I read the paper carefully. It says the ranking experiment is performed without tricks: "we don’t employ any other tricks like dropout [21], cutout [6] or mixup [28]". For the comparison with ProxylessNAS, they say they used the same training tricks as Google's MnasNet (and Google used dropout). So these should be two different experiments.

PistonY commented 5 years ago

OK, so the final FairNAS-A result of 75.34% top-1 on ImageNet was obtained with dropout? And are there any additional tricks?

cxxgtxy commented 5 years ago

I think so. However, I need to reproduce it to answer your question.

fairnas commented 5 years ago

@PistonY As @cxxgtxy says, the sampled models are trained without dropout when ranking them. The published model, on the other hand, is trained with dropout so that the comparison with MnasNet and ProxylessNAS is on equal footing.
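In other words, the same architecture is instantiated under two regimes: dropout off when ranking sampled candidates, dropout on for the final published training run. A hedged sketch of that split (the `build_head` constructor and both rates are illustrative stand-ins, not the repo's API):

```python
import torch.nn as nn

def build_head(drop_rate: float) -> nn.Module:
    # Stand-in for whatever constructor the repo actually uses.
    return nn.Sequential(
        nn.Dropout(drop_rate),     # p=0.0 makes this a no-op
        nn.Linear(1280, 1000),
    )

ranking_model = build_head(drop_rate=0.0)  # ranking the sampled models: no dropout, per the paper quote
final_model = build_head(drop_rate=0.2)    # published FairNAS-A: trained with MnasNet-style tricks incl. dropout
```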

fairnas commented 5 years ago

@PistonY Please refer to Section 6.3: we use hyperparameters and tricks similar to those in MnasNet [24]. So FairNAS-A is indeed trained with dropout.

PistonY commented 5 years ago

@cxxgtxy @fairnas OK, thanks! I want to reproduce the paper's results myself, so I needed to make sure my understanding was right.