YilmazKadir: Do I need to take care of anything manually when using the NTXent loss in a SimCLR-like network (stop gradients, etc.) to avoid updating the weights multiple times? In a SimCLR-like network, the same network weights are connected to the NTXent loss via different samples. Can you briefly explain how you handle gradient updates for the NTXent loss?

Answer: I don't think SimCLR uses stop gradients, but SimSiam does. The SimSiam paper contains pseudocode you can follow: basically, you call .detach() on one of the inputs before passing it into the NTXent loss. For SimCLR itself there is nothing to handle manually: autograd accumulates the gradient contributions from both augmented views into the shared weights, and the optimizer then applies a single update.
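To make this concrete, here is a minimal PyTorch sketch. The `nt_xent` function is a simplified stand-in for a library NTXentLoss, and the `encoder`, `optimizer`, and tensor shapes are illustrative assumptions, not any particular library's API:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def nt_xent(z0: torch.Tensor, z1: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """Simplified NT-Xent loss; a stand-in for a library NTXentLoss."""
    n = z0.shape[0]
    z = F.normalize(torch.cat([z0, z1], dim=0), dim=1)   # (2N, D)
    sim = z @ z.t() / temperature                        # scaled cosine similarities
    # Mask self-similarity so a sample is never compared with itself.
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))
    # The positive for row i is the other augmented view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
optimizer = torch.optim.SGD(encoder.parameters(), lr=0.1)
x0, x1 = torch.randn(8, 32), torch.randn(8, 32)          # two augmented views

# SimCLR-style step: no stop gradient. Autograd accumulates the gradient
# contributions from both branches into the shared encoder weights, and
# optimizer.step() applies them once; no manual handling is needed.
loss = nt_xent(encoder(x0), encoder(x1))
optimizer.zero_grad()
loss.backward()
optimizer.step()

# SimSiam-style stop gradient: detach one branch so no gradient flows
# through it. SimSiam symmetrizes this over both branches:
z0, z1 = encoder(x0), encoder(x1)
loss = 0.5 * (nt_xent(z0, z1.detach()) + nt_xent(z1, z0.detach()))
```

(Note that SimSiam itself pairs the stop gradient with a prediction head and a negative cosine similarity rather than NT-Xent; the `.detach()` pattern is the same either way.)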