xiaofang007 / ViP

[MICCAI 2024 Early Accept, Oral] Aligning Medical Images with General Knowledge from Large Language Models

Different Loss Function #7

Open bryanwong17 opened 4 days ago

bryanwong17 commented 4 days ago

Hi, thank you for the excellent work! After reviewing the paper and the code, I think the loss function differs between the two. In the code, the loss appears to be computed as the sum of all description scores within the same class:

for i, (k, v) in enumerate(text_features.items()):
  logits[:, i:i+1] += logit_scale * score
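
For context, here is a minimal sketch of how I read that scoring step (only the two lines above are quoted from the repo; class_logits, image_features, and the shapes are my assumptions):

import torch

# Hypothetical helper, not the repo's code: image_features is (B, D) and
# text_features maps each class name to a (num_desc, D) tensor of
# description embeddings.
def class_logits(image_features, text_features, logit_scale):
    B = image_features.shape[0]
    logits = torch.zeros(B, len(text_features), device=image_features.device)
    for i, (k, v) in enumerate(text_features.items()):
        # similarity of every image with every description of class k,
        # summed over that class's descriptions: (B, num_desc) -> (B, 1)
        score = (image_features @ v.t()).sum(dim=1, keepdim=True)
        logits[:, i:i+1] += logit_scale * score
    return logits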

followed by cross-entropy with the ground-truth label:

output = self.model(image)
loss = F.cross_entropy(output, label)
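
Putting the two pieces together with dummy data (again purely illustrative; class_logits is the hypothetical helper sketched above):

import torch
import torch.nn.functional as F

# Illustrative sizes: 4 images, 3 classes with 5 descriptions each, D = 512
image_features = F.normalize(torch.randn(4, 512), dim=1)
text_features = {f"class_{c}": F.normalize(torch.randn(5, 512), dim=1) for c in range(3)}
logit_scale = torch.tensor(100.0)
label = torch.randint(0, 3, (4,))

logits = class_logits(image_features, text_features, logit_scale)
loss = F.cross_entropy(logits, label)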

However, in the paper, the loss function includes a learnable temperature parameter γ:

[screenshot of the paper's loss equation with the temperature γ]

Could you clarify whether these are indeed different? Thank you!

xiaofang007 commented 3 days ago

Hi, sorry for the confusion. In the paper, gamma is a "learned" temperature, which is not the same as "learnable": "learned" means the parameter has already been learned and is kept fixed during our training. In the implementation, gamma = 1 / self.logit_scale, where logit_scale was learned during CLIP pre-training.
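
To make the equivalence concrete: dividing the scores by gamma is the same as multiplying them by logit_scale, so the code's cross-entropy matches the paper's formulation. A quick numerical check (shapes and values are illustrative, not from the repo):

import torch
import torch.nn.functional as F

scores = torch.randn(4, 5, dtype=torch.float64)          # (batch, num_classes) summed description scores
logit_scale = torch.tensor(100.0, dtype=torch.float64)   # exp of CLIP's learned log-temperature
gamma = 1.0 / logit_scale                                 # the paper's "learned" temperature

# softmax(scores / gamma) == softmax(logit_scale * scores), so both
# formulations yield the same probabilities and the same cross-entropy loss
assert torch.allclose(F.softmax(scores / gamma, dim=1),
                      F.softmax(logit_scale * scores, dim=1))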