Closed jxyecn closed 5 years ago
While reading your code, I found that the adjacency matrix A is reinitialized every batch. But from your paper, I understood that A should be learned end to end. Is something wrong?
This code contains only Mean Aggregation operation. Since the performance gain of Attention Aggregation is small, we do not use it in practice. You can check the paper Graph attention networks and their code for the implementation of Attention Aggregation.
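For anyone else reading: a minimal sketch of what Mean Aggregation typically looks like (each node averages its own and its neighbors' features, i.e. normalizing `A` with self-loops by node degree). The function name and details are illustrative, not the repo's actual code:

```python
import numpy as np

def mean_aggregate(A, X):
    """Average each node's features with its neighbors' (illustrative sketch)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)  # per-node degree for normalization
    return (A_hat @ X) / deg                # row-normalized neighborhood mean

# Two connected nodes: each new feature is the mean of both nodes' features.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
X = np.array([[1.0, 0.0], [3.0, 2.0]])
print(mean_aggregate(A, X))  # → [[2. 1.], [2. 1.]]
```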
OK, thanks for your reply!
Hello, I saw that you came up with a novel feature aggregation method, "Attention Aggregation" (Section 3.3), where you say the elements of G are generated by a 2-layer MLP. How is this matrix trained? Can you provide more detailed information? Thank you. (I didn't find the relevant source code; did I miss something?)
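Since the repo omits this part, here is a hedged sketch of how a GAT-style Attention Aggregation with a 2-layer MLP scorer could work: for each node, score every neighbor with the MLP, softmax the scores into a row of G, and take the weighted sum. `W1`, `W2`, the tanh nonlinearity, and the pair-concatenation scheme are my assumptions, not the authors' implementation; in a real model `W1`/`W2` would be trained end to end by backprop along with the rest of the network:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical 2-layer MLP parameters (would be learned end to end in practice).
rng = np.random.default_rng(0)
d = 4
W1 = rng.normal(size=(2 * d, 8))  # layer 1: concatenated pair -> hidden
W2 = rng.normal(size=(8, 1))      # layer 2: hidden -> scalar score

def attention_aggregate(A, X):
    """GAT-style aggregation: G[i, j] = softmax_j(MLP([x_i || x_j]))."""
    n = A.shape[0]
    H = np.zeros_like(X)
    for i in range(n):
        nbrs = np.where(A[i] > 0)[0]
        # Concatenate node i's features with each neighbor's features.
        pair = np.concatenate(
            [np.repeat(X[i:i + 1], len(nbrs), axis=0), X[nbrs]], axis=1)
        scores = np.tanh(pair @ W1) @ W2  # 2-layer MLP -> one score per neighbor
        G_i = softmax(scores.ravel())     # row i of the attention matrix G
        H[i] = G_i @ X[nbrs]              # attention-weighted neighbor mean
    return H
```

Because each row of G sums to 1, the output for a node is a convex combination of its neighbors' features; with constant input features the output is unchanged, which is a quick sanity check.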