facebookresearch / dino

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
Apache License 2.0

The averaged learning rate in log.txt #254

Open haohang96 opened 9 months ago

haohang96 commented 9 months ago

I find that in https://dl.fbaipublicfiles.com/dino/dino_deitsmall16_pretrain/dino_deitsmall16_pretrain_log.txt, the average learning rate logged for each epoch does not match the log from my reproduction.

According to the deit-small args (https://dl.fbaipublicfiles.com/dino/dino_deitsmall16_pretrain/args.txt) and the cosine schedule code (warmup_schedule = np.linspace(start_warmup_value, base_value, warmup_iters)), the average learning rate in epoch 0 should be np.linspace(0, 0.002, 12510)[:1251].mean() = 9.992805180270206e-05. But the average lr for epoch 0 in https://dl.fbaipublicfiles.com/dino/dino_deitsmall16_pretrain/dino_deitsmall16_pretrain_log.txt is 9.999999999999726e-06, which is roughly 10 times smaller.
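For reference, here is a minimal sketch of the calculation above. It assumes 1251 iterations per epoch and 10 warmup epochs (inferred from 12510 = 10 * 1251), a peak value of 0.002, and a warmup start value of 0. The variable names are only illustrative, and the script reproduces the arithmetic only, not the repository's logging.

```python
import numpy as np

# Assumptions taken from the numbers quoted in this issue, not read from the repo:
base_lr = 0.002          # peak learning rate used in the calculation above
iters_per_epoch = 1251   # inferred from the [:1251] slice
warmup_epochs = 10       # inferred from 12510 / 1251
start_warmup_value = 0.0

# Linear warmup, same form as the warmup_schedule line quoted above.
warmup_iters = warmup_epochs * iters_per_epoch
warmup_schedule = np.linspace(start_warmup_value, base_lr, warmup_iters)

# Mean learning rate over the first epoch of warmup.
expected_epoch0_mean = warmup_schedule[:iters_per_epoch].mean()

# Value reported for epoch 0 in the released log.
logged_epoch0_mean = 9.999999999999726e-06

ratio = expected_epoch0_mean / logged_epoch0_mean
print(f"expected epoch-0 mean lr: {expected_epoch0_mean:.6e}")
print(f"logged epoch-0 mean lr:   {logged_epoch0_mean:.6e}")
print(f"expected / logged:        {ratio:.3f}")
```

Under these assumptions the expected value is about 10 times the logged one, so the gap looks like a constant factor rather than a rounding effect.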

I wonder what causes such a difference.

Thanks