bighuang624 / AGAM

Code for the AAAI 2021 paper "Attributes-Guided and Pure-Visual Attention Alignment for Few-Shot Recognition".
https://kyonhuang.top/publication/attributes-guided-attention-module

Why does the soft margin loss work? #1

Closed: valencebond closed this issue 3 years ago

valencebond commented 3 years ago

Thanks for your detailed code. However, the attention alignment loss confuses me. According to Eqs. 8 and 9 in the paper, $M^{sg}$ and $M^{ag}$ are distributed in $[0, 1]$ after the sigmoid function. Maximizing the element-wise product of $M_c^{ag}$ and $M_c^{sg}$ minimizes $\mathcal{L}_i^{cas}$, so both $M_c^{ag}$ and $M_c^{sg}$ are pushed toward 1. This does not align $M_c^{sg}$ with $M_c^{ag}$, even if $M_c^{ag}$ is fixed as the target.
In my opinion, the alignment loss only makes the spatial (channel) attention of the self-guided branch attend to all spatial positions (channel entries), by pushing $M_s^{sg}$ and $M_c^{sg}$ toward 1, as the sketch below illustrates.
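For concreteness, a minimal PyTorch sketch of what I mean. The tensor shapes and the use of `nn.SoftMarginLoss` on the element-wise product are my assumptions about how the plain loss is applied, not a quote of this repository's code:

```python
import torch
import torch.nn as nn

# Hypothetical channel attention maps from the two branches; every entry
# lies in (0, 1) after the sigmoid of Eqs. 8-9. Shapes are illustrative.
M_ag = torch.rand(4, 64)  # attributes-guided branch (treated as target)
M_sg = torch.rand(4, 64)  # self-guided branch

# Plain soft margin loss on the element-wise product with target +1:
# mean(log(1 + exp(-(M_ag * M_sg)))). It is minimized by pushing every
# entry of the product, and hence of both maps, toward 1.
criterion = nn.SoftMarginLoss()
loss = criterion(M_ag * M_sg, torch.ones_like(M_ag))

# All-ones maps achieve a strictly smaller loss than any other
# configuration, regardless of whether the two maps are aligned.
ones = torch.ones(4, 64)
assert criterion(ones * ones, torch.ones_like(ones)) < loss
```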

bighuang624 commented 3 years ago

@valencebond We are sorry that we uploaded an older version of the code, which included the plain soft margin loss. In the latest version, we apply a normed soft margin loss, which can be viewed as a variant of the cosine distance and is more in line with our purpose; a sketch of the idea is given below. We have updated the code. We would also like to share an observation from our experiments: after accounting for the influence of randomly sampled tasks and training variance, the two versions of the loss do not significantly affect the performance of the model.
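A minimal sketch of a normed variant, assuming each attention map is L2-normalized before the element-wise product so that the summed product behaves like a cosine similarity. The function name, shapes, and the `detach` on the target are illustrative; the actual implementation is in the repository:

```python
import torch
import torch.nn.functional as F

def normed_soft_margin_align(M_sg: torch.Tensor,
                             M_ag: torch.Tensor) -> torch.Tensor:
    """Hypothetical normed soft margin alignment loss.

    Each attention map is L2-normalized along its attention dimension,
    so the summed element-wise product equals the cosine similarity
    between the two maps; the soft margin loss then rewards alignment
    of directions rather than large magnitudes.
    """
    M_sg = F.normalize(M_sg, p=2, dim=-1)
    M_ag = F.normalize(M_ag, p=2, dim=-1).detach()  # fixed as the target
    cos = (M_sg * M_ag).sum(dim=-1)  # cosine similarity per sample
    return F.soft_margin_loss(cos, torch.ones_like(cos))

# All-ones maps no longer minimize the loss: after normalization their
# cosine with a non-uniform target map is strictly below 1.
M_ag = torch.rand(4, 64)
print(normed_soft_margin_align(torch.ones(4, 64), M_ag))  # larger loss
print(normed_soft_margin_align(M_ag.clone(), M_ag))       # aligned, smaller loss
```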

Many thanks for your attention. If you have any additional questions, feel free to reopen the issue.