pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

What is the difference between using PairTensor and a normal Tensor in EdgeConv and other layers? #1542

Closed taeholeee closed 4 years ago

taeholeee commented 4 years ago

❓ Questions & Help

Hi! I'm trying to implement my own model with your great PyG 😄 Before that, there is one thing I want to make clear: the PairTensor in a MessagePassing layer. For example, in EdgeConv, as well as in CGConv (the layer I am interested in), a PairTensor can be the input of the layer and of its propagate method. In contrast, the EdgeConv in the official guide (https://pytorch-geometric.readthedocs.io/en/latest/notes/create_gnn.html) does not use PairTensor. I have looked for usage examples that explain how to pass a PairTensor as the input of a layer, but I could not find any. So I would like to ask when I should use PairTensor as the input of a layer, and how it behaves while propagate, message, and update run. I would also gently ask you to update the guide with this information. If there is an official or unofficial document that explains this, you can give me a link instead of a direct explanation here.

Thank you in advance 😃

rusty1s commented 4 years ago

Hi and thanks for this issue. I think you're absolutely right. This is something which is currently not documented at all, but it is very intuitive once understood:

PyTorch Geometric 1.6.0 has introduced support for more and more GNN operators working on bipartite graphs. Bipartite graphs are a more general class of graphs, containing two node sets (sending and receiving nodes). Even a simple graph can be seen as a bipartite graph in which both node sets are the same.

This is where PairTensor comes into play. A PairTensor = Tuple[Tensor, Tensor] is a tuple holding two node feature matrices. The first matrix in this tuple sends its information (by creating messages) to the receiving node set. The second matrix is used to combine the aggregated messages from the sending nodes with the current node features of the receiving set. In practice, this looks like the following:

conv((x_src, x_dst), edge_index)

In the MessagePassing interface, x_j will gather information from x_src, while x_i will gather information from x_dst.
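To make the gather semantics concrete, here is a minimal sketch in plain Python (an illustrative stand-in, not the actual PyG internals): edge_index[0] indexes into x_src to build x_j, and edge_index[1] indexes into x_dst to build x_i.

```python
# Plain-Python sketch of how MessagePassing gathers features on a
# bipartite graph. Not torch_geometric code; lists stand in for tensors.

x_src = [[1.0], [2.0], [3.0]]  # 3 sending nodes
x_dst = [[10.0], [20.0]]       # 2 receiving nodes

# Two edges: src 0 -> dst 1, src 2 -> dst 0.
edge_index = [[0, 2],   # indices into x_src
              [1, 0]]   # indices into x_dst

# x_j gathers from the sending node set, x_i from the receiving one.
x_j = [x_src[s] for s in edge_index[0]]
x_i = [x_dst[d] for d in edge_index[1]]

print(x_j)  # [[1.0], [3.0]]
print(x_i)  # [[20.0], [10.0]]
```

Per-edge, message(x_j, x_i) then sees one feature row from each node set, exactly as it would for an ordinary graph.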

You can simply ignore this feature if you do not operate on bipartite graphs (conv((x, x), edge_index) behaves the same as conv(x, edge_index)), but bipartite graph support is something which is cool to have, especially since it allows you to easily use NeighborSampler for scaling your GNN to large graphs, or for learning in heterogeneous graphs.
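The equivalence of the tuple and plain-tensor forms can be sketched with a toy sum-aggregation layer in plain Python (a hypothetical stand-in for illustration, not the PyG implementation): a plain input is simply promoted to the pair (x, x).

```python
# Toy message-passing layer with sum aggregation, illustrating that
# conv((x, x), edge_index) equals conv(x, edge_index) on an ordinary
# graph. Illustrative only; not torch_geometric code.

def conv(x, edge_index):
    pair = x if isinstance(x, tuple) else (x, x)  # promote Tensor -> PairTensor
    x_src, x_dst = pair
    out = [row[:] for row in x_dst]               # start from receiving features
    for s, d in zip(*edge_index):                 # sum messages from senders
        out[d] = [a + b for a, b in zip(out[d], x_src[s])]
    return out

x = [[1.0], [2.0], [3.0]]
edge_index = [[0, 1], [1, 2]]  # edges 0 -> 1 and 1 -> 2

print(conv(x, edge_index))         # [[1.0], [3.0], [5.0]]
print(conv((x, x), edge_index))    # [[1.0], [3.0], [5.0]]
```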

I hope this clarifies your questions. I will work on better documentation regarding this feature.

taeholeee commented 4 years ago


Thank you for your kind and fast answer 👍 As you said, it is really intuitive, so I could easily understand it. I also appreciate that you will document this feature. Have a good day!