Thanks for your nice repo. I'm trying out your code. My question: the paper describes an annealing optimization strategy for the A-Softmax loss, introduced via a parameter lambda. Here, your implementation is
```python
self.lamb = max(self.LambdaMin, self.LambdaMax/(1+0.1*self.it))
output = cos_theta * 1.0
output[index] -= cos_theta[index]*(1.0+0)/(1+self.lamb)
output[index] += phi_theta[index]*(1.0+0)/(1+self.lamb)
```
but I think the cos term should be scaled by a factor of lambda, such that
```python
output = cos_theta * self.lamb
output[index] -= cos_theta[index]*(self.lamb)/(1+self.lamb)
output[index] += phi_theta[index]*(1.0)/(1+self.lamb)
```
Please give me your thoughts. Thanks!
The code is right, please double-check. @taey16
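For anyone double-checking the algebra: expanding the repo's three-step update for the target entries gives `cos_theta - cos_theta/(1+lamb) + phi_theta/(1+lamb) = (lamb*cos_theta + phi_theta)/(1+lamb)`, which is exactly the paper's annealed target logit. A minimal numeric check (plain Python scalars standing in for the repo's tensor entries; the values here are arbitrary illustrations, not from the repo):

```python
# Verify that the repo's update equals the paper's annealed
# target logit (lamb*cos(theta) + psi(theta)) / (1 + lamb).
# cos_t and phi_t are hypothetical stand-ins for
# cos_theta[index] and phi_theta[index].
cos_t, phi_t, lamb = 0.7, -0.3, 5.0

# Repo's formulation: start from cos_theta, then subtract/add.
repo = cos_t - cos_t * (1.0 + 0) / (1 + lamb) + phi_t * (1.0 + 0) / (1 + lamb)

# Paper's annealed form.
paper = (lamb * cos_t + phi_t) / (1 + lamb)

assert abs(repo - paper) < 1e-12
print(repo, paper)  # both ~0.5333
```

The proposed alternative instead yields `lamb**2 * cos_theta/(1+lamb) + phi_theta/(1+lamb)` for the target class and scales every non-target logit by `lamb`, neither of which matches the paper.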
@vzvzx, yes, you are right; my mistake. Thanks for your reply.
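For readers landing here later, a minimal sketch of how the annealing schedule quoted above behaves. The `LambdaMin`/`LambdaMax` values below are assumptions for illustration; check the repo's `AngleLoss` for the actual defaults:

```python
# Sketch of the schedule: lamb = max(LambdaMin, LambdaMax / (1 + 0.1 * it)).
# These bounds are assumed for illustration, not taken from the repo.
LambdaMin, LambdaMax = 5.0, 1500.0

for it in (0, 10, 100, 1000, 10000):
    lamb = max(LambdaMin, LambdaMax / (1 + 0.1 * it))
    # Early on lamb is large, so the loss behaves like plain softmax
    # (output ~= cos_theta); as training proceeds lamb decays toward
    # LambdaMin and phi_theta's weight 1/(1+lamb) grows.
    print(it, lamb)
```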