pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Autoencoder in inference mode #1909

Closed · nassarofficial closed this issue 3 years ago

nassarofficial commented 3 years ago

❓ Questions & Help

Hello, I am trying to use the variational autoencoder here in test or inference mode. However, as you can see on this line, the encoder requires `train_pos_edge_index` as input. In an inference setting, what should this input be, given that I wouldn't know which edge indices of my graph are positive or negative?

rusty1s commented 3 years ago

The use case of the autoencoder example is to reconstruct some missing edges. Therefore, it needs some basic graph to begin with (one where some positive edges are missing). If you want to generate graphs completely from scratch, you might want to use its variational variant (by sampling an embedding from a normal distribution). There is also a lot of research on graph generation with GANs, but PyG does not yet provide an example for this line of research.
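For concreteness, here is a minimal sketch (not part of the official example) of both paths described above. It assumes `model` is a trained `torch_geometric.nn.VGAE` whose user-defined encoder takes `(x, edge_index)`, `data` is the graph observed at inference time, and the candidate edges are hypothetical:

```python
import torch

model.eval()
with torch.no_grad():
    # Link prediction: encode using whatever edges are observed at inference time.
    z = model.encode(data.x, data.edge_index)

    # Score arbitrary candidate node pairs with the inner-product decoder.
    candidate_edge_index = torch.tensor([[0, 0, 1],
                                         [2, 3, 3]])  # hypothetical pairs
    probs = model.decoder(z, candidate_edge_index, sigmoid=True)

    # Generating "from scratch" instead: sample latents from the prior and
    # decode the full probabilistic adjacency matrix.
    z_prior = torch.randn(data.num_nodes, z.size(1))
    adj_probs = model.decoder.forward_all(z_prior, sigmoid=True)
```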

nassarofficial commented 3 years ago

Thank you very much for the quick response. So am I right to understand that "generating graphs" means passing a graph in which all edge indices need link prediction? I am specifically looking for a GNN approach that performs link prediction in such a case; is that what a variational GAE would do?

rusty1s commented 3 years ago

I'm not sure what you mean. Can you clarify? AFAIK, these are two separate use cases. You either want to create a graph completely from scratch, or you want to do link prediction on an incomplete graph. The autoencoder example does the latter, which always requires an (incomplete) graph as input.
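As a side note, the incomplete graph in the autoencoder example is produced by hiding a fraction of edges up front. A rough sketch using the utility available at the time (`train_test_split_edges` has since been superseded by the `RandomLinkSplit` transform):

```python
from torch_geometric.datasets import Planetoid
from torch_geometric.utils import train_test_split_edges

# Load a citation graph and hold out a fraction of its edges for val/test.
data = Planetoid('/tmp/Cora', name='Cora')[0]
data.train_mask = data.val_mask = data.test_mask = None
data = train_test_split_edges(data, val_ratio=0.05, test_ratio=0.1)

# The encoder only ever sees the incomplete graph (data.train_pos_edge_index),
# while data.test_pos_edge_index / data.test_neg_edge_index are held out for
# evaluating the reconstructed (missing) links.
```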

nassarofficial commented 3 years ago

I see, I had the wrong understanding that the autoencoders would do link prediction on a complete graph. To clarify: suppose I want to train a GNN that takes in a full graph with ground-truth labels [0, 1] indicating which edge indices are links, and then classifies these edges to assign a similarity between nodes on unseen graphs at test or inference time. Is that not considered a "link prediction" task, and if not, what is the term or method for it?

rusty1s commented 3 years ago

This seems to be related to Neural Relational Inference, where you want to find meaningful connections in a "full" graph.

nassarofficial commented 3 years ago

That's very interesting, thanks a lot for answering my many questions. But is that related only to autoencoders, or do you think the same can be approached with an EdgeConv plus an MLP?

rusty1s commented 3 years ago

Well, this is related to autoencoders, but the Encoder-GNN operates on the full graph instead of an incomplete graph. Since you are operating on the full graph, you need a more expressive GNN, i.e., one that makes use of both source and target node information or that creates meaningful edge features. EdgeConv looks like a good GNN to use in this scenario (and is also quite similar to the one Thomas uses in his work). In contrast, something like GCNConv cannot learn any meaningful pattern in this scenario.

Note that since you want to operate on the full graph, I suggest implementing the GNN-Encoder module yourself. PyG's operators work on sparse input graphs and are therefore not that well suited for operating on full graphs.
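For reference, a rough sketch of what such a self-implemented full-graph encoder could look like: an EdgeConv-style MLP over concatenated source/target features that scores every node pair. The module and its names are hypothetical, not a PyG API.

```python
import torch
from torch.nn import Linear, ReLU, Sequential

class FullGraphEdgeScorer(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels):
        super().__init__()
        # MLP over [x_i, x_j], similar in spirit to EdgeConv's h_theta.
        self.mlp = Sequential(
            Linear(2 * in_channels, hidden_channels), ReLU(),
            Linear(hidden_channels, 1),
        )

    def forward(self, x):
        n = x.size(0)
        # Enumerate all (i, j) pairs of the fully connected graph.
        row = torch.arange(n).repeat_interleave(n)
        col = torch.arange(n).repeat(n)
        pair_feats = torch.cat([x[row], x[col]], dim=-1)
        # One logit per candidate edge, reshaped to an n x n score matrix;
        # apply a sigmoid at inference time to obtain link probabilities.
        return self.mlp(pair_feats).view(n, n)

# Usage: logits = FullGraphEdgeScorer(16, 32)(x); probs = logits.sigmoid()
```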

nassarofficial commented 3 years ago

Thanks again for the detailed explanation, I will aim to implement this for PyG.