PistonY closed this issue 5 years ago
I read the paper carefully. It says the ranking experiment was run without extra tricks: "we don’t employ any other tricks like dropout [21], cutout [6] or mixup [28]". But when comparing with ProxylessNAS, the authors say they used the same training tricks as Google's MnasNet (and Google used dropout). So these must be two different experiments.
OK, so the final FairNAS-A result of 75.34 top-1 on ImageNet was obtained with dropout? And were there any additional tricks?
I think so. However, I need to reproduce it to answer your question.
@PistonY As @cxxgtxy says, in order to rank models, the sampled models are not trained with dropout. The published model, however, is trained with those tricks so that it is on a par with MnasNet and ProxylessNAS.
@PistonY Please refer to Section 6.3: we use hyperparameters and tricks similar to those in the MnasNet paper [24]. So FairNAS-A is indeed trained with dropout.
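For anyone reproducing this, the difference between the two experiments then comes down to whether dropout is enabled before the classifier. A minimal PyTorch sketch of that switch (the names, the 1280-dim feature size, and the 0.2 rate are my assumptions from MnasNet-style setups, not the repo's actual code):

```python
import torch.nn as nn

class ClassifierHead(nn.Module):
    """Hypothetical classifier head; dropout_rate=0.0 makes dropout a no-op."""
    def __init__(self, in_features=1280, num_classes=1000, dropout_rate=0.2):
        super().__init__()
        self.dropout = nn.Dropout(p=dropout_rate)  # p=0.0 disables dropout
        self.fc = nn.Linear(in_features, num_classes)

    def forward(self, x):
        return self.fc(self.dropout(x))

# Ranking experiment: no dropout, so sampled models are compared cleanly.
ranking_head = ClassifierHead(dropout_rate=0.0)

# Final FairNAS-A training: dropout on, matching MnasNet-style settings.
final_head = ClassifierHead(dropout_rate=0.2)
```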
@cxxgtxy @fairnas OK, thanks! I want to reproduce the paper's results myself, so I needed to make sure my understanding was right.
Your paper says: 'In order to be consistent with the previous works, we don’t employ any other tricks like dropout [21], cutout [6] or mixup [28], although they can further improve the scores on the test set.' Yet there is a dropout layer here: https://github.com/fairnas/FairNAS/blob/418b892c17016006f9edc33fea2c50f674d86ff0/models/FairNAS_A.py#L104 Is there any misunderstanding?