I think the SphereFace implementation is wrong, because in the original SphereFace paper:
The hyperparameter 'm', which sets the angular margin, should be no less than 3 in a multi-class classification task, but I could not get a correct visualization result when m is set larger than 2.
cos(m*theta) is replaced by another function called 'pht(theta)'.
The feature vector x is not normalized in SphereFace, so SphereFace has no scale hyperparameter 's'.
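
To make the last two points concrete, here is a minimal NumPy sketch of how I read the paper. It is not code from this repository. The function names `psi` and `sphereface_target_logit` are hypothetical, and I am assuming that 'pht(theta)' above refers to the paper's piecewise function psi(theta) = (-1)^k * cos(m*theta) - 2k for theta in [k*pi/m, (k+1)*pi/m]. The weight vector is normalized and the bias is dropped, as in the paper, while the norm of x is kept, so no 's' appears.

```python
import numpy as np


def psi(theta, m):
    """Monotonically decreasing extension of cos(m*theta) over [0, pi].

    Hypothetical helper name; follows the piecewise definition
    (-1)^k * cos(m*theta) - 2k on [k*pi/m, (k+1)*pi/m], k = 0..m-1.
    """
    theta = np.asarray(theta, dtype=float)
    # k is the index of the interval that contains theta (capped at m-1 for theta == pi)
    k = np.minimum(np.floor(theta * m / np.pi), m - 1)
    return (-1.0) ** k * np.cos(m * theta) - 2.0 * k


def sphereface_target_logit(x, w, m):
    """Target-class logit ||x|| * psi(theta), assuming zero bias and ||w|| = 1.

    The feature norm ||x|| is kept as-is, so there is no scale factor 's'.
    """
    w_hat = w / np.linalg.norm(w)      # only the class weight is normalized
    x_norm = np.linalg.norm(x)         # the feature norm is left untouched
    cos_theta = np.clip(np.dot(w_hat, x) / x_norm, -1.0, 1.0)
    return x_norm * psi(np.arccos(cos_theta), m)
```

With m = 1 this reduces to the ordinary inner product with the normalized weight, and for any m >= 1 the value of `psi` falls from 1 at theta = 0 to -(2m - 1) at theta = pi without ever increasing, which is why the paper uses it instead of the non-monotonic cos(m*theta).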