keyu-tian / SparK

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
https://arxiv.org/abs/2301.03580
MIT License

cosine learning rate annealing #26

Closed (shuuchen closed this issue 1 year ago)

shuuchen commented 1 year ago

Hi,

Have you implemented the cosine learning rate annealing described in the paper?

Thank you!

keyu-tian commented 1 year ago

Yes, see https://github.com/keyu-tian/SparK/blob/main/pretrain/utils/lr_control.py#L11. That function is only called in https://github.com/keyu-tian/SparK/blob/main/pretrain/main.py#L146.
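
For readers who want a quick picture of what cosine learning rate annealing generally looks like, here is a minimal, generic sketch. It is not copied from the linked `lr_control.py`; the names `cosine_lr` and `set_lr`, the optional linear warmup, and the `min_lr` floor are all assumptions made for illustration, and the repository's actual function may differ in signature and details.

```python
import math


def cosine_lr(cur_it: int, max_it: int, peak_lr: float,
              warmup_it: int = 0, min_lr: float = 0.0) -> float:
    """Hypothetical helper: learning rate at iteration `cur_it` (0-based).

    Linear warmup for the first `warmup_it` iterations (an assumption),
    then a half-cosine decay from `peak_lr` down to `min_lr`.
    """
    if warmup_it > 0 and cur_it < warmup_it:
        # Linear ramp that reaches peak_lr on the last warmup iteration.
        return peak_lr * (cur_it + 1) / warmup_it

    # Fraction of the annealing phase completed, clamped to [0, 1].
    progress = (cur_it - warmup_it) / max(1, max_it - 1 - warmup_it)
    progress = min(1.0, max(0.0, progress))

    # cos goes from 1 to -1, so the factor goes from 1 to 0.
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def set_lr(param_groups, lr: float) -> None:
    """Hypothetical helper: write `lr` into every parameter group (list of dicts)."""
    for group in param_groups:
        group["lr"] = lr


if __name__ == "__main__":
    # Plain dicts stand in for an optimizer's parameter groups.
    groups = [{"lr": 0.0}, {"lr": 0.0}]
    for it in range(100):
        set_lr(groups, cosine_lr(it, max_it=100, peak_lr=1e-3, warmup_it=10))
    print(groups[0]["lr"])
```

With a PyTorch optimizer, `optimizer.param_groups` is a list of dicts with an `"lr"` key, so it could be passed to `set_lr` once per training iteration. Updating the rate per iteration rather than per epoch is a design choice in this sketch, not a statement about how the repository does it.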