princeton-vl / CornerNet

BSD 3-Clause "New" or "Revised" License

ae_loss question #51

Closed moothes closed 5 years ago

moothes commented 5 years ago

I checked your ae_loss function in kp_utils.py and found that on line 200 you have `dist = dist - 1 / (num + 1e-4)`. Does this mean you subtract 1/n from each distance? Why? I think this term has no effect on the gradients of the loss.

heilaw commented 5 years ago

That's correct. This term has no effect on the gradients. I added it just to make sure the AE loss matches the equation in the paper.

moothes commented 5 years ago

Thanks for your reply! By "the paper", do you mean the paper that proposed the AE loss, or the latest version of your CornerNet paper? In the CornerNet paper I have, the push loss is defined as L_push = (1/(N(N-1))) Σ_k Σ_{j≠k} max(0, Δ − |e_k − e_j|), and I can't find the 1/n term.

heilaw commented 5 years ago

I was referring to the CornerNet paper.

If we did not subtract 1/n on line 200, the result after the summation would be L_{push} + 1/(n-1). This is because the diagonal elements of `dist` are 1s instead of 0s. If we subtract 1/n from every entry of `dist`, the total removed over the n² entries is n² · (1/n) = n, which is exactly the sum of the n ones on the diagonal, so they cancel out in the summation.
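To make the cancellation concrete, here is a small numeric sketch (the embeddings `e` are made up and stand in for the per-object mean tags; this is not the repo's actual kp_utils code):

```python
import torch

n = 4                                    # number of objects (num in the code)
e = torch.tensor([0.3, 0.9, 1.5, 2.2])  # hypothetical per-object mean embeddings

# Pairwise term max(0, 1 - |e_j - e_k|); the diagonal is all 1s since |e_k - e_k| = 0.
dist = torch.relu(1 - torch.abs(e.unsqueeze(0) - e.unsqueeze(1)))

# Paper's push loss: sum over j != k only, normalized by n(n-1).
push_paper = (dist.sum() - dist.diagonal().sum()) / (n * (n - 1))

# Code's version: subtract 1/n from *every* entry, then sum over all n^2 entries.
# The n^2 * (1/n) = n removed equals the n ones on the diagonal.
push_code = (dist - 1 / n).sum() / (n * (n - 1))

assert torch.allclose(push_paper, push_code)
```

The subtracted constant also explains why the gradients are unchanged: a constant shift vanishes under differentiation, so only the summed value is affected.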

moothes commented 5 years ago

Well, I get it now. Thank you for your excellent work and the detailed explanation!

igo312 commented 3 years ago

@heilaw The subtraction seems to apply to every element. Shouldn't it use a diagonal matrix, like `dist = dist - (1 / (num + 1e-4)) * torch.eye(dist.shape[1])`?
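For what it's worth, a quick check with made-up numbers (not the repo's code) suggests the uniform subtraction already reproduces the paper's sum, because n² · (1/n) = n equals the total of the n diagonal ones. A diagonal-only correction would have to subtract the full 1 per diagonal entry, i.e. `torch.eye(n)`, not `(1/n) * torch.eye(n)`:

```python
import torch

n = 3
e = torch.tensor([0.1, 0.4, 0.8])  # hypothetical mean embeddings
dist = torch.relu(1 - torch.abs(e.unsqueeze(0) - e.unsqueeze(1)))

uniform   = (dist - 1 / n).sum()                    # what the repo does
diag_full = (dist - torch.eye(n)).sum()             # subtract 1 on the diagonal
diag_frac = (dist - (1 / n) * torch.eye(n)).sum()   # subtract 1/n on the diagonal only

assert torch.allclose(uniform, diag_full)           # these two sums match
assert not torch.allclose(uniform, diag_frac)       # this one does not
```

Note the per-element values do differ between the uniform and diagonal variants; only the summed loss (and hence the gradient, since the shift is constant) is the same.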