facebookresearch / mae

PyTorch implementation of MAE: https://arxiv.org/abs/2111.06377

The training code fails with the latest timm #196

Closed kevin-Abbring closed 2 months ago

kevin-Abbring commented 2 months ago

following timm: set wd as 0 for bias and norm layers

param_groups = optim_factory.add_weight_decay(model_without_ddp, args.weight_decay)
optimizer = torch.optim.AdamW(param_groups, lr=args.lr, betas=(0.9, 0.95))
print(optimizer)
loss_scaler = NativeScaler()

The add_weight_decay function no longer exists in recent timm; replacing it with optim_factory.param_groups_weight_decay fixes the error.
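For reference, the grouping that both functions perform is simple to reproduce by hand if you are pinned to a timm version that has neither name. Below is a minimal sketch, not timm's exact implementation: the helper name `param_groups_weight_decay_sketch` and the rule "1-D parameters (biases, norm weights) get zero weight decay" are assumptions based on the comment in the MAE code above.

```python
import torch
import torch.nn as nn

def param_groups_weight_decay_sketch(model, weight_decay, no_weight_decay_list=()):
    # Hypothetical re-implementation of the grouping logic: parameters with
    # ndim <= 1 (biases, LayerNorm weights) or whose names appear in
    # no_weight_decay_list are placed in a group with weight_decay=0.
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        if param.ndim <= 1 or name in no_weight_decay_list:
            no_decay.append(param)
        else:
            decay.append(param)
    return [
        {"params": no_decay, "weight_decay": 0.0},
        {"params": decay, "weight_decay": weight_decay},
    ]

# Tiny model: Linear contributes a 2-D weight (decayed) and a 1-D bias
# (not decayed); LayerNorm contributes two 1-D params (not decayed).
model = nn.Sequential(nn.Linear(4, 4), nn.LayerNorm(4))
param_groups = param_groups_weight_decay_sketch(model, weight_decay=0.05)
optimizer = torch.optim.AdamW(param_groups, lr=1e-3, betas=(0.9, 0.95))
```

The resulting `param_groups` list plugs straight into `torch.optim.AdamW` exactly as in the snippet from main_pretrain.py above, so only the one call site needs to change.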