Open zhang1hongliang opened 2 years ago
I am not Dr. Chen, but I would suggest taking a look at the original SimSiam paper, specifically Section 3:
A prediction MLP head [15], denoted as $h$, transforms the output of one view and matches it to the other view. Denoting the two output vectors as $p_1 \triangleq h(f(x_1))$ and $z_2 \triangleq f(x_2)$, we minimize their negative cosine similarity:
The keyword here is "negative", so a negative loss is the expected behaviour. Cosine similarity lies in [-1, 1] and the loss is its negation, so the closer the loss gets to -1, the better.
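To make the sign convention concrete, here is a minimal sketch in plain Python, not the code from the repository. The helper names (`cosine_similarity`, `negative_cosine_loss`, `symmetrized_loss`) are made up for illustration, and the sketch leaves out the stop-gradient on `z` that the paper applies, since only the value range matters here.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; always lies in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def negative_cosine_loss(p, z):
    """Hypothetical helper: the negated similarity, so lower means better aligned."""
    return -cosine_similarity(p, z)

def symmetrized_loss(p1, z2, p2, z1):
    """Average of the two directions, following the symmetrized form in the paper."""
    return 0.5 * negative_cosine_loss(p1, z2) + 0.5 * negative_cosine_loss(p2, z1)

# Toy vectors standing in for the predictor output p and the encoder output z.
aligned = negative_cosine_loss([1.0, 2.0], [2.0, 4.0])      # same direction
orthogonal = negative_cosine_loss([1.0, 0.0], [0.0, 1.0])   # unrelated directions
opposite = negative_cosine_loss([1.0, 2.0], [-1.0, -2.0])   # opposite directions

print(aligned, orthogonal, opposite)
```

The aligned pair gives a loss near -1, the orthogonal pair near 0, and the opposite pair near +1. A loss that goes negative and drifts towards -1 during training therefore means the two views are agreeing more, not that something is broken.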
@endernewton Hi Dr. Chen, thank you for the high-quality work. I am seeing negative values for the loss computed with nn.CosineSimilarity. This happens when the backbone is ResNet-12 and the dataset is CIFAR-FS. Could you help me resolve this issue?