VincLee8188 / GMAN-PyTorch

Implementation of Graph Multi-Attention Network with PyTorch

The group attention mechanism in the paper vs. the group attention in the code #10

Open SuperYFan opened 1 year ago

SuperYFan commented 1 year ago

Hi author, you have done a great job and I am very interested in this work. I have some doubts about the attention mechanism in the paper. The group attention designed in the paper divides the query, key, and value into multiple groups and computes attention within each group in parallel; each group is then max-pooled, and attention is computed again between the groups. However, the code only seems to compute attention within groups; the max-pooling and the inter-group attention do not appear to be implemented, which confuses me a bit.

[screenshot: group attention code]

The code seems to compute each group's attention and then directly concatenate the results. Looking forward to your answers and replies!
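For reference, here is a minimal PyTorch sketch of the two-stage scheme described above (intra-group attention, per-group max-pooling, inter-group attention). The function name `group_spatial_attention`, the pooling axis, and the way the two stages are combined are assumptions made for illustration only, not the repository's code.

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    # Scaled dot-product attention over the second-to-last dimension.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return F.softmax(scores, dim=-1) @ v

def group_spatial_attention(x, num_groups):
    """Two-stage group attention as described in the issue:
    intra-group attention, max-pooling per group, inter-group attention.
    x: (batch, num_vertices, d_model); num_vertices divisible by num_groups."""
    B, N, D = x.shape
    g = N // num_groups                               # vertices per group

    # Partition the vertices into groups: (B, G, g, D).
    groups = x.view(B, num_groups, g, D)

    # 1) Attention computed in parallel within each group.
    intra = attention(groups, groups, groups)         # (B, G, g, D)

    # 2) Max-pool the vertices of each group into one representation.
    pooled = intra.max(dim=2).values                   # (B, G, D)

    # 3) Attention between the pooled group representations.
    inter = attention(pooled, pooled, pooled)           # (B, G, D)

    # Broadcast each group's inter-group context back to its vertices
    # (one simple way to combine the two stages; the paper may differ).
    out = intra + inter.unsqueeze(2)                    # (B, G, g, D)
    return out.reshape(B, N, D)

# Example usage
x = torch.randn(4, 32, 64)   # 4 samples, 32 vertices, 64 features
y = group_spatial_attention(x, num_groups=8)
print(y.shape)                # torch.Size([4, 32, 64])
```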

serenaand commented 7 months ago

I have the same question. And I find it doesn't even divide the groups. What you called "computing attention within groups" is just standard multi-head attention.
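For contrast, a minimal sketch of the plain multi-head pattern this comment is describing: split the features into heads, attend per head, and concatenate, with no grouping of vertices, no pooling, and no second attention stage. This is an illustration of standard multi-head attention, not the repository's code.

```python
import torch
import torch.nn.functional as F

def multi_head_like_attention(x, num_heads):
    """Standard multi-head pattern: split features into heads, attend per head,
    concatenate. No vertex grouping, no pooling, no inter-group attention."""
    B, N, D = x.shape
    d_h = D // num_heads
    q = k = v = x.view(B, N, num_heads, d_h).transpose(1, 2)   # (B, H, N, d_h)
    scores = q @ k.transpose(-2, -1) / (d_h ** 0.5)
    heads = F.softmax(scores, dim=-1) @ v                        # (B, H, N, d_h)
    return heads.transpose(1, 2).reshape(B, N, D)                # concat heads
```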