@YeziKung Hi, thanks for your interest!
Regarding your questions: yes, the default setting alternative=False indicates the joint training method. With joint training, we only need two classification losses and an independence regularizer consisting of two HSIC losses. Thus, the loss formulas (4) and (5) are in practice computed by the following two HSIC losses:
```python
loss_hsic1 = -1.0 * self.hsic_loss(alpha_unbias, alpha_bias1)  # at line 200
loss_hsic2 = -1.0 * self.hsic_loss(alpha_unbias, alpha_bias2)  # at line 222
```
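For readers who want to see what the regularizer measures, here is a minimal sketch of a (biased) HSIC estimator with Gaussian kernels. The function names, kernel choice, and sigma are illustrative assumptions; the repo's actual hsic_loss (which also supports an unbiased estimator) may differ in detail.

```python
import torch

def gaussian_kernel(x, sigma=1.0):
    # Pairwise squared Euclidean distances -> RBF kernel matrix (n x n).
    dist2 = torch.cdist(x, x) ** 2
    return torch.exp(-dist2 / (2 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    # Biased HSIC estimator: trace(Kx H Ky H) / (n - 1)^2,
    # where H centers the two kernel matrices.
    n = x.size(0)
    kx, ky = gaussian_kernel(x, sigma), gaussian_kernel(y, sigma)
    h = torch.eye(n, device=x.device) - torch.ones(n, n, device=x.device) / n
    return torch.trace(kx @ h @ ky @ h) / (n - 1) ** 2
```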
For the choice of EDL loss: yes, you can definitely try the simplest form of evidence_loss, e.g., without the calibration terms. In that case, it essentially reduces to the vanilla NLLLoss. Alternatively, you may check the EDL paper (NeurIPS'18) and see whether the other two forms of the EDL loss work.
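For context, these are the three Bayes-risk loss forms proposed in the NeurIPS'18 EDL paper (Sensoy et al.), sketched here without the KL calibration term. alpha is the (batch, K) Dirichlet parameter and y a one-hot target of the same shape; function names and shapes are mine, not the repo's API.

```python
import torch

def edl_log_loss(alpha, y):
    # Type II maximum likelihood: negative log of the marginal likelihood.
    s = alpha.sum(dim=1, keepdim=True)
    return (y * (torch.log(s) - torch.log(alpha))).sum(dim=1).mean()

def edl_digamma_loss(alpha, y):
    # Bayes risk of the cross-entropy loss under the Dirichlet.
    s = alpha.sum(dim=1, keepdim=True)
    return (y * (torch.digamma(s) - torch.digamma(alpha))).sum(dim=1).mean()

def edl_mse_loss(alpha, y):
    # Bayes risk of the squared error: prediction error + Dirichlet variance.
    s = alpha.sum(dim=1, keepdim=True)
    p = alpha / s
    err = ((y - p) ** 2).sum(dim=1)
    var = (p * (1 - p) / (s + 1)).sum(dim=1)
    return (err + var).mean()
```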
Hope this can help you. Thank you!
@Cogito2012 Thank you for your timely reply, which seems to confirm that my understanding of this part of the paper and code is correct.

1. From your explanation and the open-sourced supplemental material, I think CED_Loss = loss_hsic1 + loss_hsic2. If the EUC module is not considered, the loss function of the overall model is then vanilla AR_Loss + Loss_factor * CED_Loss (Loss_factor is 0.1 in DEAR). Is that correct?
2. As for the "two classification losses" mentioned in your reply: from the code of debias_head.py, I think loss_cls1, loss_cls2, and loss_cls3 are not needed in joint learning. All we need is alpha_bias = self.exp_evidence(x) + 1 and alpha_unbias = self.exp_evidence(x) + 1. So are the "two classification losses" you mentioned used elsewhere, or do they just refer to the alpha_xx terms?
3. Since you have done many experiments to verify the model and its modules, the code is deeply nested. Unfortunately, I have not been able to find where the overall loss, i.e., formula (8), is computed. Could you please point it out if it is convenient?
Thank you, senior Bao!
@YeziKung For your questions:

1. The vanilla AR_Loss refers to the sum of the three vanilla EDL losses (on one debiased and two biased branches).
2. Why are loss_cls1 and loss_cls2 still needed in joint training? My intuition is that, without these two cls losses as strong supervision, the predicted alpha_bias1 and alpha_bias2 would mean nothing, so that during training/optimization the two biased branches could easily generate non-meaningful alpha_bias{1,2} that are trivially independent of alpha_unbias. In this case, we cannot say alpha_unbias is unbiased :) Note that, if we say some features are biased/unbiased, the premise is that they are capable of doing well on recognition, but grounded on spurious/intrinsic visual cues respectively.
Hello, I am very interested in your debiasing research, and thank you very much for generously open-sourcing the code. I noticed that the paper mentions "In practice, we also implemented a joint training strategy which aims to optimize the objective of (4) and (5) jointly and we empirically found it can achieve a better performance", and I found the setting alternative=False in the corresponding code. Is this the "joint training" you mentioned? Also, if alternative=False, then loss_hsic_f += self.hsic_factor * self.hsic_loss(feat_unbias, feat_bias1.detach(), unbiased=True) and loss_hsic_g += -self.hsic_factor * self.hsic_loss(feat_unbias.detach(), feat_bias1, unbiased=True), together with their corresponding formulas (4) and (5), never come into play. In addition, I would like to ask whether it is necessary to use the simplified evidence_loss from DebiasHead on the closed set if we do not do the open-set recognition task. I would like to use NLLLoss instead (you also mentioned in the oral presentation that the two are similar). (And I found that edl_loss doesn't take effect if alternative=False, haha.) Of course, I have not done experiments to verify this. Looking forward to your reply!
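For reference, one way to read the two alternating terms quoted above (this is an interpretation based on where .detach() sits, not the repo's exact code; it reuses the hsic() sketch from earlier in the thread, and hsic_factor is a placeholder value):

```python
import torch

feat_unbias = torch.randn(8, 64, requires_grad=True)  # debiased-branch features
feat_bias1 = torch.randn(8, 64, requires_grad=True)   # biased-branch features
hsic_factor = 0.5  # placeholder weight

# Update for the debiased branch (formula (4)): feat_bias1 is detached, so
# gradients flow only into feat_unbias, pushing it toward independence.
loss_hsic_f = hsic_factor * hsic(feat_unbias, feat_bias1.detach())

# Update for the biased branch (formula (5)): feat_unbias is detached, so
# gradients flow only into feat_bias1; the sign is flipped for this side
# of the alternating objective.
loss_hsic_g = -hsic_factor * hsic(feat_unbias.detach(), feat_bias1)
```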