mit-han-lab / once-for-all

[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment
https://ofa.mit.edu/
MIT License
1.89k stars 333 forks

Question about training for model `ofa_D4_E6_K7` #32

Closed gunjupark closed 4 years ago

gunjupark commented 4 years ago

Hello, I want to train an OFA model (ofa_mbv3) on CIFAR-100 or custom datasets.

So I would like some details about how the first supernet is trained.

When I inspected the model in the progressive-shrinking phase, I saw that the F.linear (kernel transform) layer's weights were also trained.

When training the first supernet (ofa_D4_E6_K7), should I also train the kernel transform matrix?
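For context, the elastic kernel in OFA derives each smaller kernel from the center of the larger one and then applies a learned square transformation matrix to the flattened crop; that matrix is what the question refers to. A pure-Python sketch of the idea (illustrative shapes only, not the repo's actual PyTorch implementation):

```python
# Sketch of OFA's elastic-kernel mechanism: a smaller kernel is the center
# crop of the larger one, flattened and multiplied by a learned square
# transformation matrix (here an identity, standing in for learned weights).
def center_crop(kernel, small):
    ks = len(kernel)
    s = (ks - small) // 2
    return [row[s:s + small] for row in kernel[s:s + small]]

def apply_transform(kernel_small, matrix):
    flat = [v for row in kernel_small for v in row]
    n = len(flat)
    out = [sum(matrix[i][j] * flat[j] for j in range(n)) for i in range(n)]
    k = int(n ** 0.5)
    return [out[i * k:(i + 1) * k] for i in range(k)]

k7 = [[float(i * 7 + j) for j in range(7)] for i in range(7)]   # full 7x7 kernel
k5 = center_crop(k7, 5)                                         # 5x5 center crop
identity = [[1.0 if i == j else 0.0 for j in range(25)] for i in range(25)]
k5_t = apply_transform(k5, identity)                            # transformed 5x5 kernel
```

Because the transform matrix is a trainable parameter shared across channels, it is updated during progressive shrinking along with the convolution weights.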

Also, if you have any information about training OFA nets on other datasets (like CIFAR-10 or CIFAR-100), I would like to hear it.

Thank you.

gunjupark commented 4 years ago

On CIFAR-10 or CIFAR-100, I could train OFA's initial supernet using the paper's setting: Init_LR = 2.6 / 32 = 0.08125 per GPU (the paper's learning rate split across 32 GPUs). The supernet trained quite well.
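The per-GPU learning rate above is just linear scaling: the paper's base rate for the 32-GPU setting divided evenly across GPUs. A minimal sketch:

```python
# Linear learning-rate scaling: the base LR from the paper's 32-GPU setting
# is divided evenly across the GPUs actually used.
def per_gpu_lr(base_lr: float, n_gpus: int) -> float:
    return base_lr / n_gpus

print(per_gpu_lr(2.6, 32))  # 0.08125
```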

darrenzhang1007 commented 4 years ago

> On CIFAR-10 or CIFAR-100, I could train OFA's initial supernet using the paper's setting: Init_LR = 2.6 / 32 = 0.08125 per GPU. The supernet trained quite well.

Hello, I want to ask you two questions:

  1. How did you modify the code to use it for classification on the CIFAR-100 dataset?
  2. After modifying the code, CIFAR-100 has 100 categories, but the official pretrained model has 1000 classes (for the ImageNet dataset), so loading the pretrained model fails because the data dimensions do not match. How did you solve it?

Thanks a lot, looking forward to your reply.

gunjupark commented 3 years ago

> On CIFAR-10 or CIFAR-100, I could train OFA's initial supernet using the paper's setting: Init_LR = 2.6 / 32 = 0.08125 per GPU. The supernet trained quite well.

> Hello, I want to ask you two questions:
>
> 1. How did you modify the code to use it for classification on the CIFAR-100 dataset?

I used the same class (OFAMobileNetV3) from OFA's code to train the supernet (e.g. progressive_shrinking.py).

Instead, I preprocessed the CIFAR images to fit ImageNet's image size (e.g. 40 -> 224 random resized crop).

> 2. After modifying the code, CIFAR-100 has 100 categories, but the official pretrained model has 1000 classes (for the ImageNet dataset), so loading the pretrained model fails because the data dimensions do not match. How did you solve it?
>
> Thanks a lot, looking forward to your reply.

I trained the supernet on CIFAR-10 from scratch (I didn't use the ImageNet pretrained model), so I modified some code in ofa/imagenet_codebase/data_provider and run_manager to set n_classes = 10 or 100.
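If you do want to reuse the ImageNet checkpoint instead of training from scratch, a common workaround for the shape mismatch is to drop the checkpoint entries whose shapes differ from the target model (typically just the 1000-way classifier head) before loading. A framework-agnostic sketch of the filtering step (`shape` stands in for a tensor's shape attribute):

```python
# Keep only pretrained entries that exist in the target model with a matching
# shape, so ImageNet's 1000-way classifier weights are skipped for CIFAR and
# the new head trains from scratch.
def filter_matching(pretrained: dict, target: dict) -> dict:
    return {
        name: value
        for name, value in pretrained.items()
        if name in target and value.shape == target[name].shape
    }
```

With PyTorch you would then call `model.load_state_dict(filtered, strict=False)`, letting the mismatched classifier initialize randomly.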

My steps are: 1. train the supernet -> 2. fine-tune all subnets (using progressive shrinking) -> 3. run the EA code from the tutorial.

Jon-drugstore commented 3 years ago

> On CIFAR-10 or CIFAR-100, I could train OFA's initial supernet using the paper's setting: Init_LR = 2.6 / 32 = 0.08125 per GPU. The supernet trained quite well.

How do you change the code for training the supernet? Using train_ofa_net.py? Thanks.