In MV-softmax, the released code uses easy_margin, as follows:
final_gt = torch.where(gt > 0, cos_theta_m, gt)
but the official ArcFace implementation does not use easy_margin, as follows:
final_gt = cos(θ + m)
I tried MV-softmax on my own dataset without easy_margin, but training diverged (NaN). Switching to easy_margin fixed the divergence.
Compared with MV-softmax, CurricularFace differs in two ways. The first is the threshold: CurricularFace uses self.threshold = math.cos(math.pi - m), which is understandable. The second is final_target_logit = torch.where(target_logit > self.threshold, cos_theta_m, target_logit - self.mm). What is the intuitive understanding of target_logit - self.mm, where self.mm = math.sin(math.pi - m) * m?
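
To make the comparison concrete, here is a minimal sketch of the three target-logit rules quoted above, written with plain Python scalars instead of tensors. The function name target_logit_variants and the value m = 0.5 are my own choices for illustration and do not come from either repository.

```python
import math

def target_logit_variants(theta, m=0.5):
    """Return (easy, plain, fallback) target logits for an angle theta in [0, pi].

    Hypothetical helper for illustration only; it mirrors the quoted
    one-liners with scalars instead of tensors.
    """
    cos_theta = math.cos(theta)
    cos_theta_m = math.cos(theta + m)

    # easy_margin rule (quoted MV-softmax line): apply the margin only when cos(theta) > 0.
    easy = cos_theta_m if cos_theta > 0 else cos_theta

    # No guard: always cos(theta + m), even when theta + m exceeds pi.
    plain = cos_theta_m

    # Threshold rule (quoted CurricularFace line): below the threshold,
    # subtract the constant mm from cos(theta) instead of adding m to the angle.
    threshold = math.cos(math.pi - m)
    mm = math.sin(math.pi - m) * m
    fallback = cos_theta_m if cos_theta > threshold else cos_theta - mm

    return easy, plain, fallback

if __name__ == "__main__":
    for theta in (0.3, 1.5, 2.0, math.pi - 0.5, 3.0):
        easy, plain, fallback = target_logit_variants(theta)
        print(f"theta={theta:.3f} easy={easy:.4f} plain={plain:.4f} fallback={fallback:.4f}")
```

In this sketch, the unguarded cos(θ + m) starts rising again once θ + m passes π, so a larger angle no longer yields a smaller logit. The threshold rule stays decreasing in θ, because below the threshold it is just cos θ minus a constant. Note that sin(π - m) * m equals m * sin(m), and that cos(θ + m) ≈ cos θ - m sin θ to first order, so target_logit - self.mm looks like the linearized margin penalty evaluated at the threshold angle θ = π - m. That is my reading of the code, not something the authors state.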