Hi, I noticed that in PaCo/GPaCo the logits in the numerator of the loss function are not masked, while the denominator is masked, since `exp_logits = torch.exp(logits) * logits_mask`. Shouldn't the logits in the numerator be masked as well? Also, is the learnable center used to predict the ground-truth label, so that the task becomes a supervised problem? Thanks 😢
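To make the part I am asking about concrete, here is a minimal toy reproduction of the masking step (my own sketch, not the actual PaCo code; the tensor names follow the SupCon-style implementation):

```python
import torch

# Toy setup: 4 samples with normalized features, pairwise similarities
# used as logits (a simplified stand-in for the real loss inputs).
torch.manual_seed(0)
batch = 4
features = torch.nn.functional.normalize(torch.randn(batch, 8), dim=1)
logits = features @ features.t()

# logits_mask zeros out each anchor's similarity with itself.
logits_mask = torch.ones(batch, batch) - torch.eye(batch)

# Denominator: the self-contrast terms are removed here ...
exp_logits = torch.exp(logits) * logits_mask

# ... but the numerator term `logits` below is used unmasked,
# which is what my question is about.
log_prob = logits - torch.log(exp_logits.sum(dim=1, keepdim=True))
```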
Sorry, one more question: in the paper's Remark 2, after applying parametric contrastive learning, why does the probability become α/(1+αK_y) and C become 1/(1+αK_y)? I don't know how to compute this. Thanks 😢
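To show where I am stuck, here is my attempt at the computation (assuming, as I read Remark 2, that the class center gets weight $1$ and each of the $K_y$ positive samples gets weight $\alpha$ in the loss; please correct me if this setup is wrong):

```latex
% Maximize the weighted log-likelihood of the positives under a
% probability constraint:
%   max   \log p_c + \alpha \sum_{i=1}^{K_y} \log p_i
%   s.t.  p_c + \sum_{i=1}^{K_y} p_i = 1.
% Setting the Lagrangian's derivatives to zero gives p proportional
% to its weight, with normalizer 1 + \alpha K_y:
\[
  p_c = \frac{1}{1 + \alpha K_y}, \qquad
  p_i = \frac{\alpha}{1 + \alpha K_y}.
\]
```

If this Lagrange-multiplier argument is the intended derivation, then the sample probability α/(1+αK_y) and the center term 1/(1+αK_y) would follow directly, but I am not sure this matches the paper's assumptions.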