greenelab / deep-review

A collaboratively written review paper on deep learning, genomics, and precision medicine
https://greenelab.github.io/deep-review/
Other
1.24k stars 270 forks source link

Inductive Representation Learning on Large Graphs #543

Open agitter opened 7 years ago

agitter commented 7 years ago

https://arxiv.org/abs/1706.02216

Low-dimensional embeddings of nodes in large graphs have proved extremely useful in a variety of prediction tasks, from content recommendation to identifying protein functions. However, most existing approaches require that all nodes in the graph are present during training of the embeddings; these previous approaches are inherently transductive and do not naturally generalize to unseen nodes. Here we present GraphSAGE, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data. Instead of training individual embeddings for each node, we learn a function that generates embeddings by sampling and aggregating features from a node's local neighborhood. Our algorithm outperforms strong baselines on three inductive node-classification benchmarks: we classify the category of unseen nodes in evolving information graphs based on citation and Reddit post data, and we show that our algorithm generalizes to completely unseen graphs using a multi-graph dataset of protein-protein interactions.

One application is predicting protein function from tissue-specific networks. I didn't check their reference to see where those networks come from.

mrwns commented 7 years ago

PPI networks are one of the standard benchmarks in learning embeddings for large graphs. it could probably make up another mini section, the topic is very exciting :)