keyu-tian / SparK

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
https://arxiv.org/abs/2301.03580
MIT License

How do you adjust lr and wd? #43

Closed pengcheng93 closed 11 months ago

pengcheng93 commented 1 year ago

Hi,

Thanks so much for your great work! I am confused about how you adjust the lr and weight_decay. In pretrain/utils/lr_control.py, in lr_wd_annealing, it looks to me as if the lr and wd values are only computed and never actually changed.

Thank you!

keyu-tian commented 1 year ago

In that function we modify the 'lr' and 'weight_decay' entries in optimizer.param_groups, so it computes and applies the new lr/wd at the same time. PyTorch optimizers read these values from param_groups on every .step() call, so writing into the groups is what changes them.
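
Concretely, the mechanism looks like this simplified sketch (not the exact code in pretrain/utils/lr_control.py; the linear-warmup + cosine shape and the parameter names here are just illustrative):

```python
import math

def lr_wd_annealing(optimizer, peak_lr, wd, cur_it, warmup_it, total_it):
    # Illustrative schedule: linear warmup, then cosine decay to zero.
    if cur_it < warmup_it:
        lr = peak_lr * cur_it / max(1, warmup_it)
    else:
        t = (cur_it - warmup_it) / max(1, total_it - warmup_it)
        lr = peak_lr * 0.5 * (1.0 + math.cos(math.pi * t))

    # The key point: writing into optimizer.param_groups is what actually
    # changes the lr/wd the optimizer will use on its next .step() call.
    for group in optimizer.param_groups:
        group['lr'] = lr
        group['weight_decay'] = wd  # a wd schedule could be applied the same way
    return lr, wd
```

Called once per iteration before optimizer.step(), this both computes the scheduled values and applies them in place, which is why there is no separate "set lr" step elsewhere.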