VicReg implementation gives "nan" loss

ManafMukred commented 1 year ago

I was trying to explore other algorithms such as VICReg with the LAMB and LARS optimizers, but in both cases the loss is "nan":

`Epoch 1/200 175/175 - 49s - loss: nan - proj_std: nan - val_loss: nan - val_proj_std: nan - binary_accuracy: 0.1000 - 49s/epoch - 281ms/step

Epoch 2/200 175/175 - 36s - loss: nan - proj_std: nan - val_loss: nan - val_proj_std: nan - binary_accuracy: 0.1000 - 36s/epoch - 203ms/step

Epoch 3/200 175/175 - 36s - loss: nan - proj_std: nan - val_loss: nan - val_proj_std: nan - binary_accuracy: 0.1000 - 36s/epoch - 204ms/step `

Any suggestions?

owenvallis commented 1 year ago

Hi @ManafMukred, can you share some more details about the model architecture, hyperparameters, and dataset?

ManafMukred commented 1 year ago

@owenvallis I used the same notebook here, and I also tried the LARS optimizer with `lr = 0.2 * int(BATCH_SIZE / 256)`, as the paper says, but I get the same error.
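As a quick sanity check on the learning-rate expression quoted above, here is a minimal sketch of what it evaluates to for a few batch sizes. The helper name `scaled_lr` and the sample batch sizes are hypothetical; the batch size actually used in the notebook is not stated in this thread. One thing worth noting is that `int(...)` truncates, so any batch size below 256 yields a learning rate of 0. This only shows how to inspect the value and is not meant as an explanation of the NaN loss.

```python
def scaled_lr(batch_size: int, base_lr: float = 0.2, truncate: bool = True) -> float:
    """Linear learning-rate scaling: base_lr * (batch_size / 256).

    `scaled_lr` is a hypothetical helper for illustration only.
    With truncate=True it mirrors the expression quoted in the comment,
    0.2 * int(BATCH_SIZE / 256).
    """
    ratio = batch_size / 256
    if truncate:
        # int() drops the fractional part, so ratios below 1 become 0.
        ratio = int(ratio)
    return base_lr * ratio


# Sample batch sizes are assumptions, chosen to show the truncation effect.
for bs in (128, 256, 512, 640):
    print(f"batch={bs} truncated={scaled_lr(bs)} exact={scaled_lr(bs, truncate=False)}")
```

If the truncated and exact values differ for the batch size in use, dropping the `int(...)` cast keeps the scaling proportional.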

owenvallis commented 1 year ago

Thanks for the details, I'll take a look and see if I can repro the issue on my side.

tensorflow/similarity, issue #335