HobbitLong / SupContrast

PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)
BSD 2-Clause "Simplified" License

SimCLR Loss: False Output (Possibly with Correction) #106

Open vaydemir3 opened 2 years ago

vaydemir3 commented 2 years ago

Hi,

First of all, thanks a lot for sharing your code :) I ran into an issue with a particular input to the SimCLR loss. Here is the code that produces the incorrect output:

# a case where every feature vector is the same
import torch
import torch.nn as nn

from losses import SupConLoss

featTensor = torch.ones((8, 2, 10))  # batch size of 8 with 2 augmented views and feature dimension of 10
featTensor = nn.functional.normalize(featTensor, dim=-1)  # l2-normalize the feature vectors
criterion = SupConLoss(temperature=0.07)
loss = criterion(featTensor)

When every feature vector is the same, the loss should be log_e(2N-1): all pairwise similarities are equal, so for each anchor the softmax assigns its positive a probability of 1/(2N-1), and -log(1/(2N-1)) = log_e(2N-1).

The code above fails to output log_e(2N-1).
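
A quick sanity check of that expected value (a minimal sketch, assuming N = 8 as in the snippet above):

```python
import numpy as np

N = 8                     # batch size used in the snippet above
num_contrast = 2 * N - 1  # each anchor is contrasted against the other 2N-1 views
# all similarities are identical, so the positive gets softmax probability 1/(2N-1)
expected_loss = -np.log(1.0 / num_contrast)
print(expected_loss)      # log(15) ≈ 2.708
```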

Getting the correct value seems to depend on setting both the temperature and base_temperature arguments of the loss function (thanks to my friend @sstojanov):

import torch
import torch.nn as nn
import numpy as np

from losses import SupConLoss

the_answer = np.log(15)  # log(2N-1) with N = 8
print("the answer we want", the_answer)

featTensor = torch.ones((8, 2, 10))
featTensor = nn.functional.normalize(featTensor, dim=-1)

criterion = SupConLoss(
        temperature=1.0,
        base_temperature=1.0)

loss = criterion(featTensor)
print("the answer we get", loss)
tbenst commented 1 year ago

Thanks for this! I was trying to figure out the purpose of base_temperature. The code here does appear to differ from the loss in the original paper. Curious if anyone has a justification for the presence of base_temperature? I wasn't sure whether a subsequent contrastive-loss paper introduced the idea or whether the authors here simply found it an empirically useful hyperparameter.
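
For context, the relevant lines near the end of SupConLoss.forward in losses.py look roughly like this (paraphrased as a sketch, not an exact quote): the softmax itself is computed with temperature, and base_temperature only enters as a constant rescaling of the final per-anchor loss.

```python
# sketch of the final scaling in SupConLoss.forward (losses.py):
# mean_log_prob_pos is the mean log-probability of the positives per anchor
loss = -(self.temperature / self.base_temperature) * mean_log_prob_pos
loss = loss.view(anchor_count, batch_size).mean()
```

So base_temperature does not change the contrastive softmax; it only multiplies the loss (and its gradients) by temperature / base_temperature.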

adv010 commented 1 year ago

@tbenst I was also curious about the base_temperature parameter. Did you come across any insights or uses for it?