Closed · auniquesun closed this issue 1 year ago
@auniquesun, check out CLIP's paper (https://arxiv.org/pdf/2103.00020.pdf), page 5, top left corner. You can think of it this way: it's the classic InfoNCE loss. Within a batch of samples, you want the model to maximize the probability of the corresponding pairs; each corresponding pair is the positive sample, and all other samples in the batch are the negatives.
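To make the equivalence concrete, here is a minimal NumPy sketch (not the repo's actual code, which uses PyTorch) showing that InfoNCE over a batch of paired embeddings is exactly cross-entropy on the similarity matrix with labels `arange(N)` — the diagonal entries are the positive pairs:

```python
import numpy as np

def log_softmax(x, axis=-1):
    # numerically stable log-softmax
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

rng = np.random.default_rng(0)
N, d = 4, 8  # batch size, embedding dim (illustrative values)
a = rng.normal(size=(N, d))  # e.g. point cloud embeddings
b = rng.normal(size=(N, d))  # e.g. image embeddings, row i pairs with a[i]
a /= np.linalg.norm(a, axis=1, keepdims=True)
b /= np.linalg.norm(b, axis=1, keepdims=True)
logits = a @ b.T / 0.07  # temperature-scaled cosine similarities

# (1) InfoNCE written out: -log p(positive), positives on the diagonal
lsm = log_softmax(logits, axis=1)
info_nce = -np.mean(np.diag(lsm))

# (2) cross-entropy with labels = arange(N): picks the same diagonal entries
labels = np.arange(N)
cross_entropy = -np.mean(lsm[np.arange(N), labels])

assert np.allclose(info_nce, cross_entropy)
```

In PyTorch this is why `F.cross_entropy(logits, torch.arange(N))` implements the contrastive objective: the "class" for sample `i` is simply its paired sample `i` in the other modality.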
According to the paper, ULIP uses cross-modal contrastive loss to align point cloud features and image/text representations.
However, after reading the code, it seems you use cross-entropy loss to define the loss function, e.g., the class `ULIPWithImageLoss` in `models/losses.py`:
https://github.com/salesforce/ULIP/blob/e3f61ab758b9f485a6c9b0394ecced59773393e0/models/losses.py#L48-L50
I wonder why the cross-entropy loss defined here can be treated as a contrastive loss? According to my understanding, they have different equations and use cases (cross-entropy loss for supervised learning, contrastive loss for unsupervised learning).
It would be appreciated if you could provide more clarification on this!