Project-MONAI / research-contributions

Implementations of recent research prototypes/demonstrations using MONAI.
https://monai.io/
Apache License 2.0

Understanding the contrastive loss implementation #389

Open jwutsetro opened 4 months ago

jwutsetro commented 4 months ago

Dear,

I am trying to understand your custom contrastive loss class. As I understand it, it correctly computes the positives for the numerator by shifting the diagonal by +batch_size and -batch_size. But when the denominator is computed, the negative mask is defined as the inverse of torch.eye(). As I understand it, this means that only the self-similarity (which is always 1) is removed from the denominator, while the similarities between a patch and its augmented version are still included. Is that correct?
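
For concreteness, this is a minimal sketch of the behaviour I am describing, not the library's actual code; the function and variable names are my own:

    import torch
    import torch.nn.functional as F

    def ntxent_with_eye_mask(z_i, z_j, temperature=0.5):
        # z_i, z_j: [batch_size, dim] embeddings of the two augmented views
        batch_size = z_i.shape[0]
        z = F.normalize(torch.cat([z_i, z_j], dim=0), dim=1)   # [2B, dim]
        sim = torch.mm(z, z.t()) / temperature                 # [2B, 2B] scaled cosine similarities

        # positives: the diagonals shifted by +batch_size and -batch_size
        pos = torch.cat([torch.diag(sim, batch_size),
                         torch.diag(sim, -batch_size)])        # [2B]

        # denominator mask: only the self-similarity (main diagonal) is removed,
        # so the positive pair is still summed into the denominator
        neg_mask = ~torch.eye(2 * batch_size, dtype=torch.bool)

        denom = (torch.exp(sim) * neg_mask).sum(dim=1)         # [2B]
        return (-torch.log(torch.exp(pos) / denom)).mean()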

I would personally implement it like this:

    def create_negative_mask(self, batch_size):
        # Mask over the 2B x 2B similarity matrix that keeps only true negatives.
        N = 2 * batch_size
        mask = torch.ones((N, N), dtype=torch.bool)
        # Drop the self-similarities (main diagonal) ...
        mask.fill_diagonal_(0)
        # ... and the positive pairs (the diagonals shifted by +/- batch_size).
        for i in range(batch_size):
            mask[i, batch_size + i] = 0
            mask[batch_size + i, i] = 0
        return mask
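
To be explicit about what I mean, the denominator would then only run over true negatives (hypothetical wiring, reusing `sim` and `pos` from the sketch above):

    # hypothetical use of the mask above in the loss
    neg_mask = self.create_negative_mask(batch_size).to(sim.device)  # [2B, 2B]
    denom = (torch.exp(sim) * neg_mask).sum(dim=1)   # 2B - 2 true negatives per anchor
    loss = -torch.log(torch.exp(pos) / denom)        # positives no longer in the denominator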

Will this not result in unstable training? I am asking because I don't seem to be able to get the contrastive loss to decrease. Attached is my total loss for batch sizes of 12, 24, and 48. The rotational loss and reconstruction loss are 0.3 and 0.1 respectively for all models, so the total loss is dominated by the contrastive loss not going down.

[Screenshot: total training loss curves for batch sizes 12, 24, and 48]

Kindly, Joris

jwutsetro commented 4 months ago

After digging into it a bit more, it seems that implementations of SimCLR indeed follow a similar approach. But that still leaves me wondering why we would include the positives in the denominator. Additionally, if anyone has suggestions on how to further improve the contrastive loss optimisation, I would be very happy to hear them!
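
For reference, my reading of the NT-Xent loss in the SimCLR paper (Chen et al., 2020) is that the denominator runs over all $k \neq i$, so the positive pair is counted there as well:

$$
\ell_{i,j} = -\log \frac{\exp\left(\mathrm{sim}(z_i, z_j)/\tau\right)}{\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]} \exp\left(\mathrm{sim}(z_i, z_k)/\tau\right)}
$$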