PetarV- / GAT

Graph Attention Networks (https://arxiv.org/abs/1710.10903)
https://petar-v.com/GAT/
MIT License

broadcasting issue #20

Open FrankCAN opened 5 years ago

FrankCAN commented 5 years ago

Hi Petar, many thanks for your code.

For the f_1 + transpose(f_2) operation, can I change f_2 to f_2_neighbor (B x N x K x 1), i.e., gather all K neighbors of each node first, before broadcasting, and then compute f_1 + f_2_neighbor? The result then has shape B x N x K x 1 directly. Do you think this is the same as sparse GAT? Many thanks for your help.

Best regards, Frank
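
For anyone puzzling over the shapes, here is a minimal sketch of the two formulations being compared. This is not the repo's exact code: it is written in TF 2 style, `nbr_idx` is a hypothetical B x N x K tensor of neighbor indices, and `tf.keras.layers.Conv1D` stands in for the repo's `tf.layers.conv1d`.

```python
import tensorflow as tf

B, N, K, F = 2, 5, 3, 8                       # batch, nodes, neighbors per node, features
seq_fts = tf.random.normal([B, N, F])         # transformed node features (W h)
nbr_idx = tf.random.uniform([B, N, K], maxval=N, dtype=tf.int32)  # hypothetical neighbor lists

f_1 = tf.keras.layers.Conv1D(1, 1)(seq_fts)   # B x N x 1, plays the role of a_1^T W h_i
f_2 = tf.keras.layers.Conv1D(1, 1)(seq_fts)   # B x N x 1, plays the role of a_2^T W h_j

# Dense GAT (as in attn_head): broadcast over all N x N node pairs,
# then rely on bias_mat to mask out non-edges before the softmax.
logits_dense = f_1 + tf.transpose(f_2, [0, 2, 1])      # B x N x N

# Frank's variant: gather only the K neighbors of each node first.
f_2_neighbor = tf.gather(f_2, nbr_idx, batch_dims=1)   # B x N x K x 1
logits_sparse = tf.expand_dims(f_1, 2) + f_2_neighbor  # B x N x K x 1
```

Row i of logits_sparse holds exactly the entries of row i of logits_dense at the neighbor columns, so after the masked softmax both formulations yield the same attention coefficients.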

light8lee commented 5 years ago

Hello, did you figure this part out? I'm also confused about the attn_head() function.

xdql commented 5 years ago

Hello, did you figure this part out? I'm also confused about the attn_head() function.

I'm also confused about the attn_head() function. Could we discuss it together to figure it out?

FrankCAN commented 5 years ago

Hi, sorry for my late response. Yes, it is the same as sparse GAT.

FrankCAN commented 5 years ago

Please check the paper "Neural Machine Translation by Jointly Learning to Align and Translate" (Bahdanau et al., 2015).
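
The connection, as I understand it: GAT uses Bahdanau-style additive attention, and splitting the attention vector a into two halves shows why two separate 1x1 convolutions plus a broadcast suffice:

$$
e_{ij} = \mathrm{LeakyReLU}\!\left(\mathbf{a}^{\top}\,[\mathbf{W}\vec{h}_i \,\Vert\, \mathbf{W}\vec{h}_j]\right)
       = \mathrm{LeakyReLU}\!\left(\mathbf{a}_1^{\top}\mathbf{W}\vec{h}_i + \mathbf{a}_2^{\top}\mathbf{W}\vec{h}_j\right),
\quad \mathbf{a} = [\mathbf{a}_1 ; \mathbf{a}_2].
$$

So f_1 computes $\mathbf{a}_1^{\top}\mathbf{W}\vec{h}_i$ for every node i, f_2 computes $\mathbf{a}_2^{\top}\mathbf{W}\vec{h}_j$ for every node j, and f_1 + transpose(f_2) broadcasts the sum over all (i, j) pairs; the bias_mat mask and softmax then restrict attention to actual neighbors.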