Currently, our NodeAttention follows the original GAT formulation: it applies the linear layer W to the node features, concatenates each neighbor pair [Wh_i ‖ Wh_j], applies the attention vector a, and passes the result through LeakyReLU to obtain the attention coefficients.
GATv2 reverses this order: it first concatenates the raw node features [h_i ‖ h_j], applies the linear layer W, then LeakyReLU, and finally the attention vector a. Per the paper, this makes the attention dynamic rather than static.
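In the paper's notation (h_i, h_j are the feature vectors of nodes i and j, ‖ is concatenation), the two scoring functions are:

```latex
% GAT (current NodeAttention): a is applied before the nonlinearity.
e_{\mathrm{GAT}}(h_i, h_j) = \mathrm{LeakyReLU}\left(a^{\top} \left[\, W h_i \,\|\, W h_j \,\right]\right)

% GATv2: W and LeakyReLU come first, a last.
e_{\mathrm{GATv2}}(h_i, h_j) = a^{\top} \, \mathrm{LeakyReLU}\left(W \left[\, h_i \,\|\, h_j \,\right]\right)
```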
https://arxiv.org/pdf/2105.14491.pdf
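A minimal PyTorch sketch of the GATv2 scoring order, assuming dense all-pairs attention; the names `GATv2Scores`, `W`, and `a` are illustrative, not our actual NodeAttention API:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GATv2Scores(nn.Module):
    """Compute GATv2 attention scores e_ij over all node pairs.

    Illustrative sketch only; names follow the paper's notation,
    not our NodeAttention implementation.
    """

    def __init__(self, in_dim: int, hidden_dim: int):
        super().__init__()
        # W acts on the concatenated pair [h_i || h_j], hence 2 * in_dim.
        self.W = nn.Linear(2 * in_dim, hidden_dim, bias=False)
        # a is applied last, after the nonlinearity (the GATv2 change).
        self.a = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        n = h.size(0)
        # Build all pairs [h_i || h_j]: shape (n, n, 2 * in_dim).
        pairs = torch.cat(
            [h.unsqueeze(1).expand(n, n, -1), h.unsqueeze(0).expand(n, n, -1)],
            dim=-1,
        )
        # GATv2 order: linear layer W, then LeakyReLU, then attention a.
        return self.a(F.leaky_relu(self.W(pairs))).squeeze(-1)  # (n, n)


if __name__ == "__main__":
    h = torch.randn(4, 8)                       # 4 nodes, 8 features each
    scores = GATv2Scores(in_dim=8, hidden_dim=16)(h)
    attn = torch.softmax(scores, dim=-1)        # row-normalize over neighbors
    print(attn.shape)                           # torch.Size([4, 4])
```

Note that efficient implementations (e.g. PyTorch Geometric's `GATv2Conv`) decompose W[h_i ‖ h_j] into W_l·h_i + W_r·h_j, which avoids materializing the dense pair tensor; the version above just mirrors the equation directly.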