sthalles / SimCLR

PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
https://sthalles.github.io/simple-self-supervised-learning/
MIT License

NT_Xent Loss function: all negatives are not being used? #11

Closed DAVEISHAN closed 4 years ago

DAVEISHAN commented 4 years ago

Hi @sthalles , Thank you for sharing your code!

Please correct me if I am wrong: I see that in loss/nt_xent.py, line 57 (below), you are not computing the contrastive loss for all negative pairs, since you reshape the negatives into a 2D array, i.e. only a subset of the negative pairs is used for a single positive pair, right?

```python
negatives = similarity_matrix[self.mask_samples_from_same_repr].view(2 * self.batch_size, -1)
logits = torch.cat((positives, negatives), dim=1)
```

Hope to hear from you soon.

-Ishan

DAVEISHAN commented 4 years ago

@alessiamarcolini, please help me with this issue.

sthalles commented 4 years ago

Hi DAVEISHAN,

All negatives are taken into consideration. You can check that this is the case by looking at the similarity matrix dimensions. Note that the embeddings are duplicated before the matrix is computed, which ensures that all negatives are included. Moreover, you can count the number of negatives yourself: each positive should have 2×(N−1) negatives. Just put a breakpoint at `negatives = similarity_matrix ...` and check the number of columns of the matrix.
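To see this concretely, here is a minimal standalone sketch (not the repo's exact code) of the masking step described above. It builds the 2N×2N similarity matrix from two duplicated views, masks out the diagonal and the positive-pair entries, and confirms that each of the 2N anchors retains exactly 2×(N−1) negatives. The variable names mirror `nt_xent.py`, but `batch_size`, `dim`, and the mask construction here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

batch_size = 4  # N image pairs per batch (hypothetical value)
dim = 8         # embedding dimension (hypothetical value)

# Two augmented views of the same N images, concatenated -> 2N embeddings
z = F.normalize(torch.randn(2 * batch_size, dim), dim=1)

# Full (2N x 2N) cosine-similarity matrix
similarity_matrix = z @ z.t()

# Mask out self-similarities (diagonal) and the positive-pair entries
# at offsets +/- N, leaving only the negatives
diag = torch.eye(2 * batch_size, dtype=torch.bool)
pos = torch.zeros_like(diag)
idx = torch.arange(batch_size)
pos[idx, idx + batch_size] = True  # (i, i+N) are positives
pos[idx + batch_size, idx] = True  # (i+N, i) are positives
mask = ~(diag | pos)

# Each row keeps 2N - 2 = 2*(N-1) entries, so the view below is valid
negatives = similarity_matrix[mask].view(2 * batch_size, -1)
print(negatives.shape)  # -> torch.Size([8, 6]), i.e. 2*(N-1) = 6 negatives each
```

So the reshape does not discard any negatives; it only arranges the 2N·2(N−1) surviving similarities as one row of 2(N−1) negatives per anchor.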

Hope it helps.