Hi, this is because we used the contrastive loss formulation from the MoCo paper. They "simplify" the loss to a cross-entropy function over the similarity logits. In the end it is the same loss, but it simplifies the similarity computation between positives and negatives.
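For reference, a minimal sketch of how a MoCo-style InfoNCE loss reduces to cross-entropy (the names q, k_pos, and neg_bank and all sizes here are illustrative assumptions, not taken from the repo):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative shapes: q and k_pos are L2-normalized embeddings,
# neg_bank holds the accumulated negative features.
q = F.normalize(torch.randn(32, 128), dim=1)            # queries, batch of 32
k_pos = F.normalize(torch.randn(32, 128), dim=1)        # one positive key per query
neg_bank = F.normalize(torch.randn(4096, 128), dim=1)   # bank of negatives

temperature = 0.07

# Positive logit: one similarity per query; negative logits: similarity
# of each query to every entry in the bank.
l_pos = torch.einsum("nc,nc->n", q, k_pos).unsqueeze(1)  # (32, 1)
l_neg = torch.einsum("nc,kc->nk", q, neg_bank)           # (32, 4096)
logits = torch.cat([l_pos, l_neg], dim=1) / temperature  # (32, 1 + 4096)

# The positive is always at column 0, so the "class label" is 0 for
# every query; cross-entropy over these logits is exactly the
# InfoNCE contrastive loss.
labels = torch.zeros(logits.size(0), dtype=torch.long)
criterion = nn.CrossEntropyLoss()
loss = criterion(logits, labels)
```

This is why the training script can import a contrastive loss formulation but instantiate nn.CrossEntropyLoss: the contrastive objective is expressed as a classification problem where the positive pair is class 0.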
@nuneslu Could you share the pre-training loss curve?
Hi, yes, this is the loss curve:
[image: pre-training loss curve]
But note that in the beginning it is normal for the loss to increase a bit, since it is still accumulating segments in the feature bank:
[image: loss curve over the first training steps]
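A hedged sketch of what "accumulating segments in the feature bank" can look like, assuming a fixed-size FIFO queue as in MoCo (class name, sizes, and fields here are assumptions for illustration, not the repo's actual implementation):

```python
import torch
import torch.nn.functional as F

class FeatureBank:
    """Fixed-size FIFO bank of segment features (MoCo-style queue; illustrative only)."""

    def __init__(self, bank_size: int = 4096, dim: int = 128):
        self.bank = F.normalize(torch.randn(bank_size, dim), dim=1)
        self.ptr = 0
        self.filled = 0  # number of slots holding real segment features so far

    @torch.no_grad()
    def enqueue(self, feats: torch.Tensor) -> None:
        # Overwrite the oldest entries with the newest segment features.
        n = feats.size(0)
        idx = (self.ptr + torch.arange(n)) % self.bank.size(0)
        self.bank[idx] = F.normalize(feats, dim=1)
        self.ptr = int((self.ptr + n) % self.bank.size(0))
        self.filled = min(self.filled + n, self.bank.size(0))

# As real features fill the bank, the negative set grows larger and harder,
# which is consistent with the loss rising a bit at the start of training.
bank = FeatureBank()
bank.enqueue(torch.randn(32, 128))
print(bank.filled)  # 32 real features accumulated so far
```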
Also, in case you are interested, we have just released the code for our new pre-training method here: https://github.com/PRBonn/TARL
I will close this now. If you have any more questions, feel free to reopen it.
Hi, I was browsing through the code: in contrastive_train.py you imported ContrastiveLoss but set criterion = nn.CrossEntropyLoss().cuda(). Could you explain where exactly ContrastiveLoss is used?