facebookresearch / dino

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
Apache License 2.0
6.25k stars 905 forks

How to train DINO with MSE loss? #240

Open Backdrop9019 opened 1 year ago

Backdrop9019 commented 1 year ago

Could you please share the specific settings you used when training DINO with an MSE loss, and how the results compare to those of BYOL? In BYOL, the MSE is computed on L2-normalized outputs. How was DINO trained in this setting? In my experience, training with MSE without L2 normalization leads to collapse.
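For reference, the BYOL-style loss described above can be sketched as follows. This is only an illustration of MSE on L2-normalized outputs, not the DINO authors' configuration: the function names `normalized_mse` and `plain_mse`, the tensor shapes, and the random inputs are all assumptions made for the example.

```python
import torch
import torch.nn.functional as F


def normalized_mse(p: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Squared L2 distance between unit-normalized vectors, averaged over the batch."""
    p = F.normalize(p, dim=-1)  # project each row onto the unit sphere
    z = F.normalize(z, dim=-1)
    return (p - z).pow(2).sum(dim=-1).mean()


def plain_mse(p: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Same distance without normalization, for comparison."""
    return (p - z).pow(2).sum(dim=-1).mean()


torch.manual_seed(0)
p = torch.randn(8, 16)  # hypothetical student/online outputs
z = torch.randn(8, 16)  # hypothetical teacher/target outputs

# The target branch is treated as a constant (stop-gradient), as in BYOL.
loss = normalized_mse(p, z.detach())

# For unit vectors, ||p - z||^2 = 2 - 2 * cos(p, z).
cos = F.cosine_similarity(p, z, dim=-1)
assert torch.allclose(loss, (2 - 2 * cos).mean(), atol=1e-5)

# The normalized loss ignores output scale; the plain loss shrinks
# quadratically when both outputs are scaled toward zero.
assert torch.allclose(normalized_mse(3 * p, 0.5 * z), loss, atol=1e-5)
assert torch.allclose(plain_mse(0.1 * p, 0.1 * z), 0.01 * plain_mse(p, z), atol=1e-6)
```

The last assertion shows one property that may be relevant to the collapse: without normalization, the loss can be reduced simply by shrinking both outputs, whereas the normalized version is invariant to output scale. Whether this fully explains the behavior observed in training is not established here.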