JiahuiYu / slimmable_networks

Slimmable Networks, AutoSlim, and Beyond, ICLR 2019, and ICCV 2019

How can I load us_model at a certain ratio? #30

Closed lucaskyle closed 4 years ago

lucaskyle commented 4 years ago

hey there:

7.9s train 1.0 112/500: loss: 0.116, top1_error: 0.042, top5_error: 0.000, lr: 0.035
7.9s train 0.35 112/500: loss: 0.379, top1_error: 0.135, top5_error: 0.003, lr: 0.035
0.7s val 1.0 112/500: loss: 0.819, top1_error: 0.162, top5_error: 0.008
0.7s val 0.35 112/500: loss: 0.590, top1_error: 0.202, top5_error: 0.011

I trained the model on CIFAR-10. The loss and error show that the big model gets better results than the small one, but the training time suggests the two widths have exactly the same model complexity.

So I checked the code in train.py:

width_mult is 0.35 or 1.0 by default; when the ratio changes, the code calls model.apply(lambda m: setattr(m, 'width_mult', width_mult)).

Then I checked the model, and the channel numbers didn't change, because in us_mobilenet_v2.py, width_mult = FLAGS.width_mult_range[-1], so width_mult inside the model is always 1.0.

I thought that if the model went through model.apply(...) with width_mult = 0.35, the channel numbers should become smaller than in the 1.0 model.
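
If I read the slimmable ops right, the layers are built once at the maximum width, and the forward pass just slices the weight tensor down to the active width. Here is a rough sketch of what I think a slimmable conv does (simplified and illustrative, not the repo's exact code; it assumes groups=1 and ignores batch norm):

```python
import torch.nn as nn
import torch.nn.functional as F

class USConv2dSketch(nn.Conv2d):
    """Simplified universally slimmable conv: parameters are allocated at
    the maximum width; the active width is chosen at forward time."""

    def __init__(self, max_in_channels, max_out_channels, kernel_size, **kwargs):
        super().__init__(max_in_channels, max_out_channels, kernel_size, **kwargs)
        self.width_mult = 1.0  # switched from outside via model.apply(...)

    def forward(self, x):
        out_ch = int(self.out_channels * self.width_mult)
        in_ch = x.size(1)  # follow however many channels the previous layer produced
        weight = self.weight[:out_ch, :in_ch]  # a slice of the shared 1.0x weights
        bias = self.bias[:out_ch] if self.bias is not None else None
        return F.conv2d(x, weight, bias, self.stride,
                        self.padding, self.dilation, self.groups)
```

That would explain what I saw: printing the model always shows the max channel counts, because the parameter tensors are never resized; only the slice used inside forward() changes.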

So how can I load a model at a certain ratio? If get_model gives me the big trained model (1.0), how do I then get a small trained model (like 0.35)?

lucaskyle commented 4 years ago

Sorry to bother you btw.

lucaskyle commented 4 years ago

I kind of understand how it works now... it's like a small model inside the big model, right?

So when we call get_model(), we always load the 1.0 big model.
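
Then the loading recipe would just be something like this (a sketch: the checkpoint path and state-dict key are placeholders, and I'm assuming get_model() returns the network as in train.py):

```python
import torch

# Build the universally slimmable network; parameters live at max width (1.0).
model = get_model()  # as used in this repo's train.py

ckpt = torch.load('us_mobilenet_v2.pt')  # placeholder path
model.load_state_dict(ckpt['model'])     # placeholder key
model.eval()

# Select the 0.35x sub-network: nothing extra to load, since the small
# model is a slice of the big one's weights.
model.apply(lambda m: setattr(m, 'width_mult', 0.35))
```

Is that right?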

JiahuiYu commented 4 years ago

@lucaskyle Yes, you are right. It is the implementation that may be causing the confusion.
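
If you want to convince yourself, note that switching width_mult does not touch any parameter tensor (an illustrative snippet, reusing the model loaded above):

```python
import torch

before = [p.detach().clone() for p in model.parameters()]
model.apply(lambda m: setattr(m, 'width_mult', 0.35))

# Same tensors, same values: only the forward-time slicing changes.
assert all(torch.equal(a, b) for a, b in zip(before, model.parameters()))
```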

lyn0102 commented 4 years ago

@lucaskyle Hi! I tried to train this code on CIFAR-10, but the error rate is high. Could you share your parameters? Thank you!

lucaskyle commented 4 years ago

> @lucaskyle Hi! I tried to train this code on CIFAR-10, but the error rate is high. Could you share your parameters? Thank you!

Please check my GitHub; I forked this project last year...