Open tangzhongliang opened 1 year ago
Hello, I found that this code ignores the gradient for g_cos_theta and the angular margin: https://github.com/ydwen/opensphere/blob/main/model/head/sphereface2.py#L62-L80
Won't this cause the network to oscillate during training?
@tangzhongliang I think it is Characteristic Gradient Detachment, which the paper "SphereFace Revived: Unifying Hyperspherical Face Recognition" proposes to stabilize training.
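For anyone else reading, here is a minimal sketch of what this kind of gradient detachment looks like in PyTorch. It is not the repository's exact code: the function name `detached_margin_logits`, the values of `t`, `margin` and `scale`, and the simplified margin rule are all assumptions made for illustration. The idea is that the forward pass still uses the margin-adjusted value, while the backward pass treats the adjustment as a constant, so the gradient flows through plain `cos_theta` only.

```python
import torch


def detached_margin_logits(cos_theta, labels, t=3.0, margin=0.2, scale=30.0):
    """Hypothetical helper: margin-adjusted logits with the adjustment detached."""
    # One-hot mask marking the target class of each sample.
    one_hot = torch.zeros_like(cos_theta)
    one_hot.scatter_(1, labels.view(-1, 1), 1.0)

    with torch.no_grad():
        # Nonlinear similarity adjustment (illustrative form, assumed here).
        g_cos_theta = 2.0 * ((cos_theta + 1.0) / 2.0).pow(t) - 1.0
        # Subtract a margin on the target class only (simplified assumption).
        g_cos_theta = g_cos_theta - margin * one_hot
        # Difference between adjusted and raw cosine, computed without autograd.
        d_theta = g_cos_theta - cos_theta

    # Forward value equals scale * g_cos_theta, but d_theta is a constant for
    # autograd, so d(logits)/d(cos_theta) is just `scale`.
    return scale * (cos_theta + d_theta)


cos_theta = torch.tensor([[0.8, 0.1], [0.3, 0.6]], requires_grad=True)
labels = torch.tensor([0, 1])

logits = detached_margin_logits(cos_theta, labels)
logits.sum().backward()

print(logits)          # forward pass includes the margin and the nonlinearity
print(cos_theta.grad)  # backward pass sees only the linear path through cos_theta
```

In this sketch the gradient with respect to `cos_theta` is the same constant for every entry, regardless of `t` or `margin`, because neither the nonlinearity nor the margin takes part in backpropagation.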