Closed: afilt closed this issue 2 months ago
Hi @afilt - when I was training UNI, the overall loss curve was very smooth. Because DINOv2 is very close to iBOT in its SSL implementation, many of the suggestions for improving iBOT stability also carry over, e.g. https://github.com/bytedance/ibot/issues/19. Hyperparameter changes worth trying include lowering the temperature, increasing the number of iterations for freezing the network (only training the last layer) during initial training, adjusting clip_grad, etc. I would suggest performing the short run for ViT-L/16 first to see if this configuration works for you.
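To make two of these knobs concrete, here is a minimal pure-Python sketch of what lowering a softmax temperature and clipping gradients by their global norm do numerically. It is only an illustration under my own assumptions: the function names `softmax_with_temperature` and `clip_by_global_norm` are hypothetical and are not taken from the UNI, DINOv2, or iBOT code, and the numbers are toy values rather than recommended settings.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature (hypothetical helper, for illustration only)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm  # already within the limit, leave unchanged
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm

# Toy values: a lower temperature puts more probability mass on the largest logit.
p_default = softmax_with_temperature([1.0, 2.0], temperature=1.0)
p_sharp = softmax_with_temperature([1.0, 2.0], temperature=0.1)

# Toy values: a gradient with norm 5 is rescaled to norm 1.
clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
```

In a real PyTorch training loop, the clipping step is usually done with `torch.nn.utils.clip_grad_norm_` instead of a hand-written helper. Logging the pre-clip norm is a cheap way to spot the spikes that tend to precede a diverging loss.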
Hello, I was wondering if you could provide additional details on how the loss evolved during the pre-training of UNI. Instabilities and convergence issues have been reported to hinder this kind of pre-training. Is this something you have observed?
Congratulations on this groundbreaking work and on publicly releasing the weights.