final_target_logit = torch.where(target_logit > self.threshold, cos_theta_m, target_logit - self.mm)

In MV-softmax, the released code uses easy_margin, as follows:

final_gt = torch.where(gt > 0, cos_theta_m, gt)

But the official ArcFace implementation does not use easy_margin, as follows:

final_gt = cos(θ + m)
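
To make the comparison concrete, here is a minimal sketch of both variants. It assumes gt holds the target-class cosine cos(θ) and uses m = 0.5 as an example margin; the helper names cos_theta_plus_m, arcface_plain and arcface_easy_margin are made up for illustration and are not taken from either repository.

```python
import math
import torch

def cos_theta_plus_m(gt: torch.Tensor, m: float) -> torch.Tensor:
    # cos(θ + m) = cos θ · cos m − sin θ · sin m, with sin θ >= 0 for θ in [0, π].
    # The clamp guards against tiny negative values caused by rounding.
    sin_theta = torch.sqrt((1.0 - gt * gt).clamp(min=0.0))
    return gt * math.cos(m) - sin_theta * math.sin(m)

def arcface_plain(gt: torch.Tensor, m: float = 0.5) -> torch.Tensor:
    # No easy_margin: the margin is applied for every angle.
    return cos_theta_plus_m(gt, m)

def arcface_easy_margin(gt: torch.Tensor, m: float = 0.5) -> torch.Tensor:
    # easy_margin: the margin is applied only when cos θ > 0 (θ < π/2);
    # otherwise the target logit is left unchanged.
    return torch.where(gt > 0, cos_theta_plus_m(gt, m), gt)

# Example target cosines (assumed values, not real training data).
gt = torch.tensor([0.9, 0.1, -0.3, -0.95])
plain = arcface_plain(gt)
easy = arcface_easy_margin(gt)
```

The two variants agree wherever gt > 0 and differ only for negative cosines, where easy_margin returns gt untouched.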

I tried MV-softmax on my own dataset without easy_margin, but training diverged (NaN). Enabling easy_margin fixed the divergence.

Compared with MV-softmax, CurricularFace differs in two ways. The first is the threshold: CurricularFace uses self.threshold = math.cos(math.pi - m), which is understandable. The second is the target logit:

final_target_logit = torch.where(target_logit > self.threshold, cos_theta_m, target_logit - self.mm)

What is the intuition behind target_logit - self.mm, where self.mm = math.sin(math.pi - m) * m?
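
For reference, the piecewise rule can be evaluated numerically. The sketch below is only an illustration under assumptions: m = 0.5 is an assumed margin, and final_target_logit here is a scalar re-implementation of the line above, not the repository code. It reflects one possible reading of the fallback: once θ exceeds π - m, cos(θ + m) stops decreasing, so the code switches to the linear penalty cos θ - mm, which keeps decreasing in θ.

```python
import math

m = 0.5                              # assumed margin, for illustration only
threshold = math.cos(math.pi - m)    # cos θ at the angle where θ + m reaches π
mm = math.sin(math.pi - m) * m       # same value as m * sin(m)

def final_target_logit(cos_theta: float) -> float:
    # Scalar version of:
    # torch.where(target_logit > threshold, cos_theta_m, target_logit - mm)
    theta = math.acos(cos_theta)
    cos_theta_m = math.cos(theta + m)
    return cos_theta_m if cos_theta > threshold else cos_theta - mm

def is_non_increasing(values) -> bool:
    # True when no value is larger than its predecessor (up to rounding).
    return all(b <= a + 1e-12 for a, b in zip(values, values[1:]))

# Sweep θ over [0, π] and compare plain cos(θ + m) with the piecewise rule.
thetas = [i * math.pi / 200 for i in range(201)]
plain = [math.cos(t + m) for t in thetas]
piecewise = [final_target_logit(math.cos(t)) for t in thetas]

print("plain cos(θ + m) non-increasing:", is_non_increasing(plain))
print("piecewise rule non-increasing:  ", is_non_increasing(piecewise))
```

In this sweep, plain cos(θ + m) turns back up once θ passes π - m, while the piecewise version keeps falling all the way to θ = π. Note also that cos(θ + m) ≈ cos θ - m · sin θ to first order in m, and at θ = π - m the term m · sin θ equals mm, which may be where the constant comes from.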

HuangYG123 / CurricularFace

final_target_logit = torch.where(target_logit > self.threshold, cos_theta_m, target_logit - self.mm) #31