sail-sg / poolformer

PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
https://arxiv.org/abs/2111.11418
Apache License 2.0

I can't load only m48 somehow. #21

Closed ichiyasa0308 closed 2 years ago

ichiyasa0308 commented 2 years ago

Thank you for sharing this great code. I have two questions.

1. I downloaded poolformer_m48.pth.tar and poolformer_m36.pth.tar and tried to load both, but only m48 fails to load: the parameters being loaded differ from the saved ones. I created the model using the code in train.py, like this:

args.model = 'poolformer_m48'
model = create_model(
    args.model,
    pretrained=args.pretrained,
    num_classes=args.num_classes,
    drop_rate=args.drop,
    drop_connect_rate=args.drop_connect,  # DEPRECATED, use drop_path
    drop_path_rate=args.drop_path,
    drop_block_rate=args.drop_block,
    global_pool=args.gp,
    bn_tf=args.bn_tf,
    bn_momentum=args.bn_momentum,
    bn_eps=args.bn_eps,
    scriptable=args.torchscript,
    checkpoint_path=args.initial_checkpoint)

Is anything wrong? All args are left at their defaults except for checkpoint_path and pretrained.

2. Also, how can I set model-creation parameters other than drop path, such as dropout? You've described drop path in the training command, like this:

DROP_PATH=0.1 # drop path rates [0.1, 0.1, 0.2, 0.3, 0.4] corresponding to models [s12, s24, s36, m36, m48]
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ./distributed_train.sh 8 /path/to/imagenet \
  --model $MODEL -b 128 --lr 1e-3 --drop-path $DROP_PATH --apex-amp

yuweihao commented 2 years ago

Hi @ichiyasa0308, thanks for your interest.

  1. Can you run md5sum poolformer_m48.pth.tar and check the MD5 checksum (0701bcdf45f6443150537c9cf2982b1f)?
  2. The dropout rate can be set with --drop, e.g. --drop 0.1.
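
If md5sum is not available (for example on Windows), the same check can be sketched in Python with the standard hashlib module. This is only a minimal illustration, not part of the repository: the helper names md5_of_file and checkpoint_matches are hypothetical, and the expected value is simply the checksum quoted in point 1.

```python
import hashlib

# Checksum quoted in the reply above for poolformer_m48.pth.tar.
EXPECTED_M48_MD5 = "0701bcdf45f6443150537c9cf2982b1f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so a large checkpoint is never
    loaded into memory all at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checkpoint_matches(path, expected=EXPECTED_M48_MD5):
    """Hypothetical helper: True if the file's MD5 equals `expected`."""
    return md5_of_file(path) == expected.lower()
```

Calling checkpoint_matches("poolformer_m48.pth.tar") on the downloaded file would return False if the download is corrupted or incomplete, in which case downloading the checkpoint again is the first thing to try.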