Open hamrain opened 4 months ago
Glad to see your question. It's the default loss function in contrastive learning, so it is unnecessary to discard the normal negatives. To our knowledge, most (perhaps all) related works on hard negatives, such as SimCSE and MixCSE, still use the normal negatives.
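To make the point concrete, here is a minimal sketch of a generic InfoNCE-style contrastive loss in which the denominator sums over the positive, the normal negatives, and any hard negatives together. It is not the exact Lcl from the paper: the function names, the cosine similarity, and the temperature value of 0.05 are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, normal_negatives,
                     hard_negatives=(), temperature=0.05):
    """InfoNCE-style loss for one anchor (illustrative, not the paper's Lcl).

    The denominator contains the positive term plus one term per
    negative; hard negatives are added alongside the normal ones
    rather than replacing them.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    negatives = list(normal_negatives) + list(hard_negatives)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Hypothetical 2-d embeddings, just to exercise the function.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
normal_negatives = [[0.0, 1.0], [-1.0, 0.0]]
hard_negatives = [[0.8, 0.3]]

loss_normal_only = contrastive_loss(anchor, positive, normal_negatives)
loss_with_hard = contrastive_loss(anchor, positive, normal_negatives,
                                  hard_negatives)
```

In this sketch the hard negatives only add terms to the denominator, so the normal negatives keep contributing to the loss as before.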
I don't understand the meaning of the denominator in your formula Lcl. Or rather, why include Xj- in the denominator when there is already a clustering group with hard negative Xj+?