hszhao / SAN

Exploring Self-attention for Image Recognition, CVPR 2020.

Hi, what is patch-attention? I am still not clear after reading your paper. Thanks! #5

Closed Xingrun-Xing closed 4 years ago

bluesky314 commented 4 years ago

I think patch-attention means having a window and doing self-attention within it. Please correct me if I am wrong; maybe the author can elaborate more.

hszhao commented 4 years ago

Hi, patchwise attention means that the attention weights are learned from a patch representation (e.g., the features inside a local footprint, Eq. 5). It is different from pairwise attention, where the attention weights are learned from pairs of features (e.g., F_i and F_j).
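To make the distinction concrete, here is a minimal PyTorch sketch for a single pixel (illustrative only, not the repository's implementation; subtraction as the pairwise relation, small MLPs for the weight-producing mapping, softmax normalization, and a 3x3 footprint are all assumptions made for this example):

```python
import torch
import torch.nn as nn

C, K = 16, 9                      # channels; footprint size (3x3 -> K = 9 positions)
gamma_pair  = nn.Sequential(nn.Linear(C, C), nn.ReLU(), nn.Linear(C, C))
gamma_patch = nn.Sequential(nn.Linear(K * C, C), nn.ReLU(), nn.Linear(C, K * C))
beta = nn.Linear(C, C)            # value transform

x_i = torch.randn(C)              # feature vector at the center pixel i
x_R = torch.randn(K, C)           # features of the footprint R(i), including i

# Pairwise: each weight vector is computed from one pair (x_i, x_j) only.
w_pair = torch.stack([gamma_pair(x_i - x_j) for x_j in x_R])   # (K, C)
y_pair = (w_pair.softmax(dim=0) * beta(x_R)).sum(dim=0)        # (C,)

# Patchwise: all K weight vectors are produced jointly from the whole
# patch x_R(i), so each weight can depend on every feature in the footprint.
w_patch = gamma_patch(x_R.flatten()).view(K, C)                # (K, C)
y_patch = (w_patch.softmax(dim=0) * beta(x_R)).sum(dim=0)      # (C,)
```

The point of contrast: in the pairwise form a weight vector sees only (x_i, x_j), while in the patchwise form the weights are a function of the entire footprint at once.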

KnightOfTheMoonlight commented 4 years ago

Hi @hszhao, I am trying to understand the specific criterion for choosing the paired features. In the paper you say it is based on a set of indices that specifies which feature vectors are aggregated to construct the new feature. But how is this set of indices, the footprint, generated? I didn't find any illustration of it in the paper. My first guess is that it is based on pixel location, a simple version being the 8 neighbors of each pixel (see the sketch below). Also, are there any differences between training and testing?
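To illustrate my guess, a location-based footprint can be gathered with im2col/unfold; this is just my assumption of how the indices might be realized, not code taken from the repo:

```python
import torch
import torch.nn.functional as F

B, C, H, W = 1, 16, 8, 8
k = 3                                         # 3x3 footprint: center + its 8 neighbors
x = torch.randn(B, C, H, W)

# Zero-pad so border pixels also get a full footprint, then gather patches.
cols = F.unfold(x, kernel_size=k, padding=k // 2)   # (B, C*k*k, H*W)
cols = cols.view(B, C, k * k, H * W)                # (B, C, K, H*W)
# cols[..., i] holds x_R(i): the K = k*k footprint features around pixel i.
# Here the indices depend only on spatial location, so this sketch behaves
# identically at train and test time.
```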