Closed. Pixie8888 closed this issue 2 years ago.
Hi, thank you for your interest!

(1) distmatH is N' by C, so distmatH[mask==0] is N' by (C-1). Therefore, 1 has to be added so that the indices in hard_index_batch match those of centers_normed, which is C by 1024; without adding 1, they cannot be used to index centers_normed. Lines 68-69 can be rewritten in a more straightforward and intuitive way as follows:
distmatH[mask==1] = -torch.inf  # torch.max then only considers the negative (non-matching) entries
# Or, distmatH[mask==1] -= 2 also works, since the entries are bounded (cosine similarities lie in [-1, 1])
distHicL, hard_index_batch = torch.max(distmatH, dim=1)
The above snippet can replace lines 68-69 and produce the same results.
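For example, here is a toy sketch (the numbers, shapes, and label values are made up for illustration) showing that the -inf masking keeps all C columns in place, so the resulting indices are already valid for centers_normed:

```python
import torch

# Toy example: 3 samples, 4 centers. mask marks each sample's true class.
distmatH = torch.tensor([[0.9, 0.2, 0.1, 0.3],
                         [0.1, 0.8, 0.4, 0.2],
                         [0.3, 0.1, 0.7, 0.6]])
label = torch.tensor([0, 1, 2])
mask = torch.zeros_like(distmatH, dtype=torch.long)
mask[torch.arange(3), label] = 1

# Setting the true-class entries to -inf keeps the column count at C,
# so the argmax indexes centers_normed directly -- no +1 correction.
distmatH[mask == 1] = -torch.inf
distHicL, hard_index_batch = torch.max(distmatH, dim=1)
# hard_index_batch -> tensor([3, 2, 3])
```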
(2) Each of the three gradients has to be multiplied by xcH, xcL, and cc, respectively, because the gradients are computed with respect to the normalized vectors (featureH_normed, featureL_normed, centers_normed), not the original ones; the chain rule through the normalization introduces these factors. For further details, you can refer to Equation (10) in the supplementary material: link. The supplementary material is also available on the ECVA page.
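To make the chain-rule point concrete, here is a small sketch (x and c are hypothetical stand-ins for the features and centers, and the toy loss is just the sum of cosine similarities) comparing autograd's gradient with a manual one that includes the normalization Jacobian:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 8, requires_grad=True)   # stand-in for features
c = torch.randn(3, 8)                       # stand-in for centers

x_normed = F.normalize(x, dim=1)
c_normed = F.normalize(c, dim=1)
loss = (x_normed @ c_normed.t()).sum()      # sum of cosine similarities
loss.backward()

# Chain rule through v = x / ||x||:  dL/dx = (g - (g . v) v) / ||x||,
# where g = dL/dv. Using g directly as dL/dx (i.e., dropping the
# normalization Jacobian) would give the wrong gradient.
with torch.no_grad():
    norms = x.norm(dim=1, keepdim=True)
    v = x / norms
    g = c_normed.sum(dim=0).expand_as(x)    # dL/dv for this toy loss
    manual = (g - (g * v).sum(dim=1, keepdim=True) * v) / norms

assert torch.allclose(x.grad, manual, atol=1e-5)
```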
FYI: Lines 63-64 can also be replaced:
mask = torch.zeros_like(distmatH, dtype=torch.long)  # N' by C
mask[torch.arange(num_pair), label.long()] = 1       # 1 at each sample's true-class column
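As a quick sanity check (with toy sizes and hypothetical label values), the mask built this way is just a one-hot encoding of the labels:

```python
import torch

num_pair, C = 5, 4
label = torch.tensor([0, 2, 1, 3, 2])
distmatH = torch.randn(num_pair, C)

mask = torch.zeros_like(distmatH, dtype=torch.long)
mask[torch.arange(num_pair), label.long()] = 1

# Equivalent to a one-hot encoding of label over C classes.
assert torch.equal(mask, torch.nn.functional.one_hot(label, num_classes=C))
```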
Thanks!
Hi, I have some questions about the implementation of ACLPT_func in losses.py. (1) In line 69, why is 1 added to the index? (2) Why does the gradient need to be multiplied by xcH?