pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Creating dataset of graphs with own edge index #932

Open Maddy12 opened 4 years ago

Maddy12 commented 4 years ago

❓ Questions & Help

Hello, I am trying to understand how to properly use the dataset class in torch-geometric. If you are classifying images that have been converted into graph representations, each with its own edge index, how are these graphs represented batchwise? My concern is that each image's edge index only indexes the nodes of that image, so simply concatenating the edge indices will not work.

I am not sure I clearly understand how batches are handled here, especially when using the layers.

I would be grateful for any examples and/or explanations you can provide.

Thank you.

rusty1s commented 4 years ago

You can read about mini-batch handling here. Overall, edge indices are stacked diagonally, so that no indices overlap.
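To illustrate what "stacked diagonally" means, here is a minimal sketch in plain Python, not PyG's actual implementation. It assumes two hypothetical toy graphs given as dense adjacency matrices; the helper name `block_diagonal` is made up for this example. The adjacency matrices are placed along the diagonal of one larger matrix, so no edge ever connects nodes from different graphs.

```python
def block_diagonal(adjs):
    """Stack square adjacency matrices (lists of lists) along the diagonal."""
    total = sum(len(a) for a in adjs)
    out = [[0] * total for _ in range(total)]
    offset = 0  # number of nodes contributed by the graphs placed so far
    for a in adjs:
        n = len(a)
        for i in range(n):
            for j in range(n):
                out[offset + i][offset + j] = a[i][j]
        offset += n
    return out


# Hypothetical toy graphs: a 2-node graph with edge 0 -> 1,
# and a 3-node path 0 -> 1 -> 2.
a1 = [[0, 1],
      [0, 0]]
a2 = [[0, 1, 0],
      [0, 0, 1],
      [0, 0, 0]]

big = block_diagonal([a1, a2])
for row in big:
    print(row)
```

In the combined 5x5 matrix, the second graph's nodes occupy rows and columns 2 to 4, and the off-diagonal blocks stay zero.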

Maddy12 commented 4 years ago

So in summary, by using the pytorch_geometric DataLoader, this is all handled internally?

rusty1s commented 4 years ago

Yes :)

thorzhong commented 3 years ago

Hi, I'm wondering: if the DataLoader is not used, and the edge index of the first graph is [[0,1], [0,2]] and that of the second graph is [[0,1], [0,2], [0,3]], and they are concatenated as [[0,1], [0,2], [0,1], [0,2], [0,3]] for batch training, will this be regarded as two graphs in the message passing stage? (This is a graph classification task.)

Thank you.

rusty1s commented 3 years ago

They will be concatenated as [[0, 1], [0, 2], [3, 4], [3, 5], [3, 6]] (the node indices of the second graph are shifted by 3, the number of nodes in the first graph), and as such will be regarded as two graphs (or a single supergraph holding multiple disconnected subgraphs).
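The offsetting can be sketched by hand in plain Python. This is only an illustration under assumptions, not the library's code: the helper `batch_edge_lists` is hypothetical, and edges are kept as [source, target] pairs as written in this thread, whereas PyG's `edge_index` tensor has shape [2, num_edges], i.e. the transpose of this layout. The sketch also builds a per-node graph assignment list, similar in spirit to the `batch` vector PyG attaches to a mini-batch.

```python
def batch_edge_lists(graphs):
    """Merge graphs given as (num_nodes, edges) into one disconnected graph.

    Each edge is a [source, target] pair using indices local to its graph.
    Returns the merged edge list and a per-node graph assignment list.
    """
    edges, batch = [], []
    offset = 0  # total number of nodes in the graphs merged so far
    for graph_id, (num_nodes, local_edges) in enumerate(graphs):
        edges.extend([src + offset, dst + offset] for src, dst in local_edges)
        batch.extend([graph_id] * num_nodes)
        offset += num_nodes
    return edges, batch


g1 = (3, [[0, 1], [0, 2]])          # first graph: 3 nodes
g2 = (4, [[0, 1], [0, 2], [0, 3]])  # second graph: 4 nodes

edges, batch = batch_edge_lists([g1, g2])
print(edges)  # [[0, 1], [0, 2], [3, 4], [3, 5], [3, 6]]
print(batch)  # [0, 0, 0, 1, 1, 1, 1]
```

The assignment list is what lets a graph-level readout pool node features per graph for classification.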

thorzhong commented 3 years ago

Thanks for your reply. So if I use [[0,1], [0,2], [0,1], [0,2], [0,3]] as edge_index, GNN models (like GCNConv) will not regard this as two graphs, right?

rusty1s commented 3 years ago

Yes, it will regard it as a single graph.
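One way to see this is to count connected components. The following is a rough sketch in plain Python with a hypothetical union-find helper `count_components`, treating edges as undirected [source, target] pairs. It makes no claim about how GCNConv works internally; it only shows which nodes the edge list connects.

```python
def count_components(num_nodes, edges):
    """Count connected components, treating edges as undirected."""
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, dst in edges:
        parent[find(src)] = find(dst)
    return len({find(v) for v in range(num_nodes)})


naive = [[0, 1], [0, 2], [0, 1], [0, 2], [0, 3]]    # concatenated without offsets
shifted = [[0, 1], [0, 2], [3, 4], [3, 5], [3, 6]]  # second graph shifted by 3

naive_components = count_components(4, naive)      # only nodes 0..3 are referenced
shifted_components = count_components(7, shifted)  # all 7 nodes are referenced
print(naive_components)    # 1
print(shifted_components)  # 2
```

Without offsets, every edge points back to nodes 0 to 3, so the two graphs collapse into a single star around node 0. If the node features of both graphs were stacked into 7 rows, rows 4 to 6 would receive no messages at all.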