changlin31 / DNA

(CVPR 2020) Block-wisely Supervised Neural Architecture Search with Knowledge Distillation

What is the transfer setting for CIFAR? #3

Closed. serser closed this issue 4 years ago.

serser commented 4 years ago

In the paper, I notice there are CIFAR transfer results for EfficientNet/MixNet and also for DNA. Could you kindly share the details? E.g., what network modifications are needed for CIFAR? Did you upscale the images to 224×224? What are the hyperparameters (learning rate, optimizer, epochs, batch size, etc.) for transfer training? Any hint is deeply appreciated.

jiefengpeng commented 4 years ago

There is no need to modify the network. Yes, we upscale CIFAR images to 224×224 and perform transfer learning with the same settings as EfficientNet. More detail can be found in 'Do Better ImageNet Models Transfer Better?'. From our experience, you can try a large batch size such as 1024 and an SGD optimizer with an initial learning rate of 0.06 for 80 epochs.
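
For later readers, here is a minimal PyTorch sketch of that recipe. torchvision's ResNet-50 is used only as a stand-in for an ImageNet-pretrained DNA model (loading the actual DNA checkpoint would replace that line), and momentum, weight decay, and any learning-rate schedule are not stated in this thread, so the values below are assumptions:

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Upscale CIFAR-10 images to 224x224 and normalize with ImageNet statistics.
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
train_set = datasets.CIFAR10(root='./data', train=True, download=True,
                             transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=1024,
                                           shuffle=True, num_workers=8)

# Stand-in for an ImageNet-pretrained DNA model; swap in the real checkpoint.
model = models.resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 10)  # new head for 10 CIFAR classes

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

# Settings from the comment above: SGD, initial lr 0.06, 80 epochs.
# Momentum 0.9 is an assumption, not stated in the thread.
optimizer = torch.optim.SGD(model.parameters(), lr=0.06, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for epoch in range(80):
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Note that a batch size of 1024 usually needs multiple GPUs or gradient accumulation; warmup and learning-rate decay are omitted above because the thread does not specify them.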

serser commented 4 years ago

Thanks for the help! I've checked EfficientNet, which says it uses settings from 'Do Better ImageNet Models Transfer Better?' and GPipe; GPipe gives some info but leaves the rest in a seemingly undisclosed supplementary. :) Nevertheless, I'll try your generous suggestion.

jiefengpeng commented 4 years ago

You are welcome :)