ronghuaiyang / arcface-pytorch

What does eazy_margin in models.metrics.ArcMarginProduct mean? #48

Open tengjn opened 4 years ago

tengjn commented 4 years ago

I don't get the lines below:
    if self.easy_margin:
        phi = torch.where(cosine > 0, phi, cosine)
    else:
        phi = torch.where(cosine > self.th, phi, cosine - self.mm)

where self.mm = math.sin(math.pi - m) * m. What does mm mean?

BTW, I don't think the condition "cosine > 0" is equivalent to selecting the target position. The implementation seems to differ from the paper.
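For readers following along, the branch in question can be reproduced for a single angle in plain Python. This is only a minimal sketch: `margin_logit` is a hypothetical helper name, m = 0.5 is an illustrative value, and th = cos(pi - m) is assumed to match the threshold discussed later in this thread.

```python
import math

def margin_logit(cosine, m=0.5, easy_margin=False):
    """Scalar sketch of the target-class logit before scaling (hypothetical helper)."""
    # Clamp so rounding error never produces a negative value under the root.
    sine = math.sqrt(max(0.0, 1.0 - cosine * cosine))
    # Angle-addition formula: cos(theta + m).
    phi = cosine * math.cos(m) - sine * math.sin(m)
    if easy_margin:
        # Margin applied only when theta is below 90 degrees.
        return phi if cosine > 0 else cosine
    th = math.cos(math.pi - m)      # cosine at which theta + m reaches pi (assumed)
    mm = math.sin(math.pi - m) * m  # constant penalty used past that point
    return phi if cosine > th else cosine - mm

print(margin_logit(math.cos(1.0)))  # theta + m < pi: cos(theta + m) is used
print(margin_logit(math.cos(3.0)))  # theta + m >= pi: cosine - mm is used
```

The scalar form makes it easier to see that `th` only decides which of the two expressions is returned, while `mm` is a fixed offset that does not depend on theta.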

ChiSuWq commented 4 years ago

Hi. I can only explain the 'else' branch. When execution reaches this branch, we can't use cos(theta + m) directly because theta + m > pi, so self.mm acts as a first-order Taylor expansion of cos(theta + m) to approximate it.

tengjn commented 4 years ago

> Hi. I can only explain the 'else' branch. When execution reaches this branch, we can't use cos(theta + m) directly because theta + m > pi, so self.mm acts as a first-order Taylor expansion of cos(theta + m) to approximate it.

Hi, thanks for your reply, but I still don't understand. The first-order Taylor expansion of cos(theta + m) should be 'cos(m) - (theta - m) * sin(theta + m)', which is quite different from self.mm.

ChiSuWq commented 4 years ago

Hi, actually, we take the first-order Taylor expansion with theta as the starting point. The formula then becomes 'cos(θ) - m * sin(θ)', if I have written it correctly.

tengjn commented 4 years ago

[image] In your case, you treat m as the variable and theta as the constant. But I think it's the opposite: m should be the constant, which is 'a' in the figure.

ChiSuWq commented 4 years ago

Actually, I think you may be mistaken. The 'm' shifts θ to θ + m, and correspondingly θ is the starting point of the Taylor expansion. So in your picture, 'x' equals θ + m while 'a' is θ. In other words, the standard Taylor expansion is of f(x + Δx), and m is Δx here.
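For reference, the expansion under discussion can be written out with θ as the expansion point and m as the increment. This is the standard first-order Taylor formula, not something taken from the repository:

```latex
f(\theta + m) \approx f(\theta) + m\, f'(\theta),
\qquad
f = \cos \;\Longrightarrow\; \cos(\theta + m) \approx \cos\theta - m \sin\theta
```

Under this reading, the code's `cosine - self.mm` would correspond to replacing sin θ with the constant sin(π − m), which is an interpretation rather than something the repository documents.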

SJHNJU commented 4 years ago

In my opinion, when easy_margin is True, the margin is added only when θ < 90°, which means ArcFace only takes effect once the model is roughly trained.
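A small sketch can illustrate which angles actually receive the margin under easy_margin. The helper name `easy_margin_logit` and the value m = 0.5 are assumptions chosen for illustration only.

```python
import math

M = 0.5  # illustrative margin in radians (an assumption)

def easy_margin_logit(theta_deg):
    """Return the pre-scaling logit for an angle in degrees (hypothetical helper)."""
    theta = math.radians(theta_deg)
    cosine = math.cos(theta)
    # Mirrors torch.where(cosine > 0, phi, cosine) for one value.
    return math.cos(theta + M) if cosine > 0 else cosine

for deg in (30, 60, 89, 91, 120):
    plain = math.cos(math.radians(deg))
    print(deg, round(plain, 4), round(easy_margin_logit(deg), 4))
```

For angles below 90° the printed logit is lower than the plain cosine, while for angles above 90° the two columns coincide, so no margin is applied there.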

Ontheway361 commented 4 years ago

@ChiSuWq So, for the else branch, we need to handle (theta + m) > math.pi. If cos(theta) > cos(math.pi - m), then theta + m < math.pi, so phi is kept as is; otherwise theta + m >= math.pi, and we use a Taylor expansion to approximate cos(theta + m). In fact, cos(theta + m) ≈ cos(theta) - m sin(theta) >= cos(theta) - m sin(math.pi - m).
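The inequality at the end of this comment can be checked numerically over the else-branch range. This is a sketch with an illustrative m = 0.5, and `taylor_and_fallback` is a hypothetical helper name.

```python
import math

m = 0.5                          # illustrative margin (an assumption)
mm = m * math.sin(math.pi - m)   # constant subtracted in the else branch

def taylor_and_fallback(theta):
    """Return (cos(theta) - m*sin(theta), cos(theta) - mm) for one angle."""
    taylor = math.cos(theta) - m * math.sin(theta)
    fallback = math.cos(theta) - mm
    return taylor, fallback

# Sample angles from pi - m up to pi, i.e. where theta + m >= pi.
thetas = [math.pi - m + k * m / 10 for k in range(11)]
gaps = [taylor_and_fallback(t)[0] - taylor_and_fallback(t)[1] for t in thetas]

# On this range sin(theta) <= sin(pi - m), so every gap should be non-negative.
print(all(g >= -1e-12 for g in gaps))
```

The gap is zero at theta = pi - m and grows toward theta = pi, which agrees with sin(theta) decreasing on that interval.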

doitslow commented 4 years ago

Does anybody wonder why the original MXNet implementation just used mx.sym.cos(), without considering the case (theta + m) > pi?

tks1998 commented 3 years ago

> @ChiSuWq So, for the else branch, we need to handle (theta + m) > math.pi. If cos(theta) > cos(math.pi - m), then theta + m < math.pi, so phi is kept as is; otherwise theta + m >= math.pi, and we use a Taylor expansion to approximate cos(theta + m). In fact, cos(theta + m) ≈ cos(theta) - m sin(theta) >= cos(theta) - m sin(math.pi - m).

So what happens if I clamp m + theta to pi whenever m + theta > pi, that is, replace that branch in this code with cos(m + theta) = cos(pi)?

bilzard commented 2 years ago

It seems to me that the term m sin(pi - m) has no meaning other than to make the similarity function monotonically decreasing. Figure 1 illustrates the function f = cos(theta) - m sin(pi - m). Although no proof has been provided, it appears that this function is monotonically decreasing for theta=0.1, 0.01, 0.001.

A simpler way to achieve the same thing could be f = cos(theta) - (1 + cos(pi - m)) (Figure 2). In this case, the function is continuous at theta = pi - m.
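The continuity claim can be checked at the boundary theta = pi - m by comparing each fallback against cos(theta + m). The sketch below assumes an illustrative m = 0.5; the function names are hypothetical and do not come from the repository or the linked notebook.

```python
import math

m = 0.5                  # illustrative margin (an assumption)
boundary = math.pi - m   # angle where theta + m reaches pi

def original_fallback(theta):
    # Fallback used in the else branch: cos(theta) - m * sin(pi - m).
    return math.cos(theta) - m * math.sin(math.pi - m)

def shifted_fallback(theta):
    # Alternative proposed in this comment: cos(theta) - (1 + cos(pi - m)).
    return math.cos(theta) - (1.0 + math.cos(math.pi - m))

margin_value = math.cos(boundary + m)  # cos(pi), i.e. about -1

# Size of the jump when switching from cos(theta + m) to each fallback.
jump_original = margin_value - original_fallback(boundary)
jump_shifted = margin_value - shifted_fallback(boundary)
print(jump_original, jump_shifted)
```

With m = 0.5 the original fallback drops by roughly 0.117 at the boundary, while the shifted version meets cos(theta + m) there up to floating-point error. Both remain decreasing in theta, since each is cos(theta) minus a constant.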

The simulation code is here: https://www.kaggle.com/code/tatamikenn/arcface-visualize-easy-margin?scriptVersionId=96805831

Fig1. Screen-Shot-2022-05-28-at-15-11-30

Fig2. Screen-Shot-2022-05-28-at-15-16-26

Ontheway361 commented 2 years ago

Hello, I have received your email. Thank you.