ydwen / opensphere

A hyperspherical face recognition library based on PyTorch
https://opensphere.world/
MIT License

Why use no_grad for computing d_theta #13

Open tangzhongliang opened 1 year ago

tangzhongliang commented 1 year ago

Hello, I found that this code ignores the gradient for g_cos_theta and the angular margin: https://github.com/ydwen/opensphere/blob/main/model/head/sphereface2.py#L62-L80

Won't this cause the network to oscillate during training?

lizhenstat commented 1 year ago

@tangzhongliang I think it is the Characteristic Gradient Detachment proposed to stabilize training in the paper "SphereFace Revived: Unifying Hyperspherical Face Recognition".
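
To illustrate the general pattern, here is a minimal sketch of computing a margin offset under `torch.no_grad()`. It is not the code from `sphereface2.py`: the function name `margin_logits_detached`, the default values of `m` and `s`, and the choice of an additive angular margin `cos(theta + m)` are all assumptions made for illustration. The point it shows is that the forward value still includes the margin, while the backward pass treats the offset `d_theta` as a constant, so the gradient with respect to `cos_theta` is simply the scale `s`.

```python
import math

import torch
import torch.nn.functional as F


def margin_logits_detached(cos_theta, labels, m=0.4, s=30.0):
    """Hypothetical helper: margin logits with the margin offset detached.

    cos_theta: (batch, classes) cosine similarities, values in [-1, 1].
    labels:    (batch,) integer class indices.
    m, s:      illustrative margin and scale, not values from the repository.
    """
    with torch.no_grad():
        # Everything in this block is excluded from the autograd graph.
        theta = torch.acos(cos_theta.clamp(-1 + 1e-5, 1 - 1e-5))
        one_hot = F.one_hot(labels, cos_theta.size(1)).to(cos_theta.dtype)
        # Add the margin to the target-class angle only (assumed margin form).
        theta_m = (theta + m * one_hot).clamp(1e-5, math.pi)
        # Offset between the margin-modified cosine and the plain cosine.
        d_theta = torch.cos(theta_m) - cos_theta
    # Forward value: s * cos(theta + m) for targets, s * cos(theta) otherwise.
    # Backward: d_theta is a constant, so d(logits)/d(cos_theta) = s.
    return s * (cos_theta + d_theta)


cos_theta = torch.tensor([[0.5, -0.2], [0.1, 0.8]], requires_grad=True)
labels = torch.tensor([0, 1])
logits = margin_logits_detached(cos_theta, labels)
logits.sum().backward()
print(logits)
print(cos_theta.grad)
```

Under these assumptions the margin changes the loss value that the optimizer sees but not the shape of the gradient through `cos_theta`, which seems to be the kind of stabilization the paper describes.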