CassNot opened this issue 11 months ago
Good catch! I think Eq. 2 in the paper has ignored the 1/(2N) factor.
Hi,
I've been reviewing the implementation and noticed the line `loss = loss.view(anchor_count, batch_size).mean()`. Given the preceding computations, the result seems equivalent to simply `loss.mean()`. Could you explain the rationale behind the reshaping here?
I assume it's just for readability?
Yeah, it's just there to make the shape explicit (which may help with understanding what's going on).
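For anyone else wondering, the equivalence is easy to verify with a small sketch (NumPy here rather than PyTorch, and the `anchor_count`/`batch_size` values below are made up):

```python
import numpy as np

# Hypothetical shapes mirroring the discussion: anchor_count views per
# sample, batch_size samples. The per-element losses form a flat vector
# of length anchor_count * batch_size.
anchor_count, batch_size = 2, 8
rng = np.random.default_rng(0)
loss = rng.random(anchor_count * batch_size)

# Reshaping before averaging does not change the result: the mean of an
# array is independent of its shape.
reshaped_mean = loss.reshape(anchor_count, batch_size).mean()
flat_mean = loss.mean()

assert np.isclose(reshaped_mean, flat_mean)
```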
Dear authors,
Thank you for your code!
We had a question concerning the loss implementation. We noticed that for each minibatch the mean is computed rather than the sum used in the paper (https://arxiv.org/pdf/2004.11362.pdf - equation 2): https://github.com/HobbitLong/SupContrast/blob/331aab5921c1def3395918c05b214320bef54815/losses.py#L96
We were wondering whether there was a reason for this choice.
Thank you
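Not one of the authors, but one way to see why the choice may matter little in practice: the mean and the sum differ only by the constant factor 1/(2N) (in the paper's notation, with N samples and two views each), so the objective is unchanged up to a uniform rescaling of the gradients. A small sketch with made-up values:

```python
import numpy as np

# Hypothetical per-anchor losses for a minibatch of N samples with two
# views each, as in Eq. 2 of the SupCon paper (values are made up).
N = 8
rng = np.random.default_rng(0)
per_anchor_loss = rng.random(2 * N)

# The sum (paper) and the mean (code) differ only by the constant
# factor 1/(2N); minimizing one is equivalent to minimizing the other,
# up to a rescaling of the effective learning rate.
assert np.isclose(per_anchor_loss.mean(), per_anchor_loss.sum() / (2 * N))
```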