salesforce / ULIP

BSD 3-Clause "New" or "Revised" License

Confusion about the loss function #14

Closed auniquesun closed 1 year ago

auniquesun commented 1 year ago

According to the paper, ULIP uses cross-modal contrastive loss to align point cloud features and image/text representations.

However, after reading the code, it seems you use cross-entropy loss to define the loss function, e.g., in the class ULIPWithImageLoss in models/losses.py.

https://github.com/salesforce/ULIP/blob/e3f61ab758b9f485a6c9b0394ecced59773393e0/models/losses.py#L48-L50

I wonder why the cross-entropy loss defined here can be treated as a contrastive one. As I understand it, the two have different equations and use cases (cross-entropy loss for supervised learning, contrastive loss for unsupervised learning).

It would be appreciated if you could provide more clarification on this!

Tycho-Xue commented 1 year ago

@auniquesun, check out the CLIP paper, https://arxiv.org/pdf/2103.00020.pdf, page 5, top-left corner. You can think of it this way: it is a classic InfoNCE loss. Within a batch of samples, you want the model to maximize the probability of the corresponding pairs. Each corresponding pair is the positive sample, and the other samples in the batch are the negatives. Cross-entropy applied to the batch's similarity logits, with the index of the matching sample as the label, is exactly that objective, so no class annotations are involved.
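To make the equivalence concrete, here is a minimal sketch in plain Python (not the ULIP code itself). It assumes toy 2-D features and a hypothetical temperature of 0.07; the names `pc_feats`, `img_feats`, `cross_entropy`, and `info_nce` are made up for illustration. It computes the loss twice, once as softmax cross-entropy with labels `0..N-1` and once from the InfoNCE formula written out directly, and the two values agree.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over rows of `logits`."""
    total = 0.0
    for row, label in zip(logits, labels):
        m = max(row)  # subtract the row max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[label]
    return total / len(logits)

def info_nce(logits):
    """InfoNCE written out: mean over i of -log(exp(s_ii) / sum_j exp(s_ij))."""
    n = len(logits)
    total = 0.0
    for i in range(n):
        positive = math.exp(logits[i][i])
        denominator = sum(math.exp(logits[i][j]) for j in range(n))
        total += -math.log(positive / denominator)
    return total / n

# Toy, already L2-normalized features (assumed values, not real data).
# Row i of pc_feats and row i of img_feats form a positive pair.
pc_feats = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
img_feats = [[0.8, 0.6], [0.0, 1.0], [0.6, 0.8]]
temperature = 0.07  # hypothetical value chosen for this sketch

# logits[i][j] = similarity between point cloud i and image j
logits = [[dot(p, q) / temperature for q in img_feats] for p in pc_feats]

# The "class" of sample i is simply its own index in the batch,
# so the labels come from the pairing, not from human annotation.
labels = list(range(len(logits)))

ce = cross_entropy(logits, labels)
nce = info_nce(logits)
print(ce, nce)
```

CLIP-style losses typically also compute the same quantity in the other direction (image to point cloud, using the transposed logits) and average the two terms.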