MichiganCOG / A2CL-PT

Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization (ECCV 2020)
MIT License

some questions about ACLPT_func #9

Closed Pixie8888 closed 2 years ago

Pixie8888 commented 2 years ago

Hi, I have some questions about the implementation of ACLPT_func in losses.py. (1) In line 69, why is 1 added to the index? (2) Why does the gradient need to be multiplied by xcH?

kylemin commented 2 years ago

Hi, thank you for your interest! (1) distmatH is N' by C, so distmatH[mask==0] is N' by (C-1). Therefore, 1 must be added so that the indices in hard_index_batch match those of centers_normed, which is C by 1024; without adding 1, they cannot be used to index centers_normed. Lines 68-69 can be rewritten in a more straightforward and intuitive way as follows:

```python
distmatH[mask==1] = -float('inf')  # so torch.max operates only on the negative entries
# Or, distmatH[mask==1] -= 2 is also fine
distHicL, hard_index_batch = torch.max(distmatH, dim=1)
```

The above snippet can replace lines 68-69 and produce the same results.
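As a quick sanity check (a toy example of mine, not code from the repo), the -inf masking makes torch.max skip each row's ground-truth column and return the index of the hardest negative class directly, with no index shift needed:

```python
import torch

# Toy distance matrix: num_pair rows, C class columns (values are arbitrary).
num_pair, C = 4, 3
distmatH = torch.tensor([[0.9, 0.2, 0.5],
                         [0.1, 0.8, 0.3],
                         [0.4, 0.6, 0.7],
                         [0.2, 0.1, 0.9]])
label = torch.tensor([0, 1, 2, 2])  # ground-truth class per pair

# One-hot mask marking each row's ground-truth column.
mask = torch.zeros_like(distmatH, dtype=torch.long)
mask[torch.arange(num_pair), label] = 1

# Exclude the positive class, then take the per-row max over what remains.
masked = distmatH.clone()
masked[mask == 1] = -float('inf')
distHicL, hard_index_batch = torch.max(masked, dim=1)
print(hard_index_batch)  # tensor([2, 2, 1, 0])
```

Because the columns are never sliced away, hard_index_batch already indexes the full set of C centers.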

(2) Each of the three gradients is multiplied by xcH, xcL, and cc, respectively, because the gradients are computed with respect to the normalized vectors (featureH_normed, featureL_normed, centers_normed), not the original ones. For further details, see Equation (10) in the supplementary material: link. The supplementary material is also available on the ECVA page.
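To illustrate the point with a small autograd check (my own sketch, not the repo's code): for an L2-normalized vector x̂ = x/||x||, the chain rule gives dL/dx = (g − (x̂·g)x̂)/||x||, where g = dL/dx̂. The extra factor involving the original vector's norm is why a gradient computed on the normalized vector must be rescaled:

```python
import torch

torch.manual_seed(0)
x = torch.randn(5, requires_grad=True)
c = torch.randn(5)            # a fixed "center" vector

x_hat = x / x.norm()          # L2 normalization
loss = (x_hat * c).sum()      # simple loss defined on the normalized vector
loss.backward()

# Manual gradient via the chain rule through the normalization:
# dL/dx = (g - (x_hat . g) x_hat) / ||x||, with g = dL/dx_hat = c here.
g = c
xh = x_hat.detach()
manual = (g - (xh @ g) * xh) / x.detach().norm()
assert torch.allclose(x.grad, manual, atol=1e-5)
```

The same reasoning applies to each of the three normalized quantities in the loss, which is what the multiplications by xcH, xcL, and cc account for.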


FYI: Lines 63-64 can also be replaced:

```python
mask = torch.zeros_like(distmatH, dtype=torch.long)
mask[torch.arange(num_pair), label.long()] = 1
```
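As a quick equivalence check (mine, not from the repo): the arange-based indexing above builds a one-hot row mask, and `scatter_` produces the identical result, so either form works:

```python
import torch

num_pair, C = 3, 4
label = torch.tensor([1, 3, 0])

# One-hot mask via advanced indexing (as in the snippet above).
mask_a = torch.zeros(num_pair, C, dtype=torch.long)
mask_a[torch.arange(num_pair), label] = 1

# Same mask via scatter_ along dim=1.
mask_b = torch.zeros(num_pair, C, dtype=torch.long).scatter_(1, label.unsqueeze(1), 1)

assert torch.equal(mask_a, mask_b)
```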
Pixie8888 commented 2 years ago

Thanks!