lightly-ai / lightly

A python library for self-supervised learning on images.
https://docs.lightly.ai/self-supervised-learning/
MIT License

nan loss in VICReg #1032

Closed rzamarefat closed 1 year ago

rzamarefat commented 1 year ago

Hi. Thank you for this awesome repo. I followed the example code (https://github.com/lightly-ai/lightly/blob/master/examples/pytorch/vicreg.py) and changed the backbone to ResNet152 and the input_size in ImageCollateFunction to 224 to suit ResNet152. The projection head dims are also set correctly to 2048. The problem is that after a few steps in the training loop the loss becomes so large that it returns inf, and nan afterwards. Any suggestions for solving this would be appreciated.

[Screenshot from 2023-01-10 04-06-08]
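To pin down the step at which the loss described above stops being finite, a small check like the following may help. It is only a sketch in plain Python: `first_non_finite` is a hypothetical helper, not part of lightly, and the loss values in `trace` are made up for illustration.

```python
import math


def first_non_finite(losses):
    """Return the index of the first inf/nan loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None


# Hypothetical loss trace (illustrative values only): the loss grows quickly,
# overflows to inf, and turns into nan on the following step.
trace = [21.3, 480.0, 9.1e6, float("inf"), float("nan")]
print(first_non_finite(trace))
```

In a PyTorch training loop, the same `math.isfinite` check could be applied to `loss.item()` before calling `backward()`, so that training stops at the first bad step and the batch and learning rate at that point can be inspected.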

guarin commented 1 year ago

Hi, thanks a lot for the issue report!

We noticed that the VicReg loss is quite sensitive to training parameters and the optimizer. To make training more stable you can do the following:

The paper uses these settings:

Screen Shot 2023-01-10 at 08 57 28

We also just added a VicRegCollateFunction which uses the same augmentation parameters as the paper. You can either get it from the master branch or after the next release.

Let us know if this helps!

rzamarefat commented 1 year ago

Thanks for your fast reply. I will test these tips and let you know the result.