PetarV- / GAT

Graph Attention Networks (https://arxiv.org/abs/1710.10903)
https://petar-v.com/GAT/
MIT License

some question about GAT #46

Closed junkangwu closed 4 years ago

junkangwu commented 4 years ago

Hello, recently while re-reading the GAT paper I ran into a question that confused me, and I hope you can help. The coefficient $\alpha_{ij}$ is determined by the features of node $i$ and node $j$, learned under supervision from the training-set nodes that have labels. But what happens when the two endpoints of an edge belong to the training set and the test set respectively? Theoretically, a node without a label cannot be used for gradient-descent learning. In that case, how does GAT work? Thanks a lot!
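
For reference, the coefficient I mean is the one from Eq. (3) of the paper, computed from the transformed features of the two endpoints:

$$\alpha_{ij} = \frac{\exp\!\left(\mathrm{LeakyReLU}\!\left(\vec{a}^{\top}\left[\mathbf{W}\vec{h}_i \,\|\, \mathbf{W}\vec{h}_j\right]\right)\right)}{\sum_{k \in \mathcal{N}_i} \exp\!\left(\mathrm{LeakyReLU}\!\left(\vec{a}^{\top}\left[\mathbf{W}\vec{h}_i \,\|\, \mathbf{W}\vec{h}_k\right]\right)\right)}$$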

PetarV- commented 4 years ago

Hello,

Thank you for your issue.

The answer to your question depends on whether you're doing transductive or inductive learning.

If transductive, the training algorithm sees all nodes (including test nodes), and the entire graph is used during learning.

If inductive, the test nodes are masked out along with their edges during training and, as far as the training algorithm is concerned, treated as non-existent. At test time we add them back.
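
As a rough sketch of what that masking looks like (illustrative names using numpy/scipy, not the exact code in this repository):

```python
import numpy as np
import scipy.sparse as sp

def inductive_split(adj, features, train_mask):
    """Keep only the training nodes and the edges between them.

    adj: scipy sparse (N, N) adjacency matrix
    features: (N, F) node feature matrix
    train_mask: (N,) boolean array, True for training nodes
    """
    train_idx = np.where(train_mask)[0]
    # Subgraph induced by the training nodes: any edge touching a
    # masked-out (test) node is dropped along with the node itself.
    adj_train = adj[train_idx][:, train_idx]
    feats_train = features[train_idx]
    return adj_train, feats_train

# At test time the full `adj` and `features` are used again, so the
# previously unseen nodes (and their edges) reappear in the graph.
```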

Hope that helps!

Thanks, Petar

junkangwu commented 4 years ago

In that case, if transductive, the whole graph is fed into training and only the labels of the test set are masked out?

PetarV- commented 4 years ago

You are correct -- test and validation nodes won't be used for cross-entropy loss.
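
Concretely, something along these lines (a minimal numpy sketch of the masked loss; the names are illustrative, not this repository's exact utilities):

```python
import numpy as np

def masked_softmax_cross_entropy(logits, labels_onehot, train_mask):
    """Cross-entropy averaged over training nodes only.

    logits: (N, C) model outputs for every node in the graph
    labels_onehot: (N, C) one-hot labels
    train_mask: (N,) boolean array, False for test/validation nodes
    """
    # Numerically stabilised log-softmax over the class dimension.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    per_node = -(labels_onehot * log_probs).sum(axis=1)
    # Only training nodes contribute to the loss that is backpropagated.
    return per_node[train_mask].mean()
```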

junkangwu commented 4 years ago

Thanks a lot!!