snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License

Detail about ogbl-citation2 subgraph sampling #160

Closed: oisc closed this issue 3 years ago

oisc commented 3 years ago

What is the proper way to sample a subgraph from a large graph like the MAG citation graph?

rusty1s commented 3 years ago

There is no clear consensus on that yet for heterogeneous graphs. What we do in our script is convert the MAG graph into a homogeneous one with distinct edge types and then apply standard sampling techniques to it. If you want your edge types to be uniformly distributed after sampling, you may want to look into the Heterogeneous Graph Transformer model.
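
As a rough illustration of the conversion described here, the following is a minimal pure-Python sketch and not the code from the OGB example script. It assumes each node type has its own local ids starting at 0, gives every type a contiguous block of global ids, and keeps one integer edge type per relation. The function name `to_homogeneous`, the toy node counts, and the toy edges are all made up for the example.

```python
from typing import Dict, List, Tuple

# (source node type, relation name, destination node type)
EdgeKey = Tuple[str, str, str]


def to_homogeneous(
    num_nodes: Dict[str, int],
    edges: Dict[EdgeKey, List[Tuple[int, int]]],
):
    """Merge typed nodes and edges into one graph with global node ids."""
    # Give every node type a contiguous block of global ids.
    offsets: Dict[str, int] = {}
    node_type: List[int] = []
    total = 0
    for type_id, (name, count) in enumerate(num_nodes.items()):
        offsets[name] = total
        node_type.extend([type_id] * count)
        total += count

    # Shift per-type local ids to global ids and record each edge's relation.
    edge_index: List[Tuple[int, int]] = []
    edge_type: List[int] = []
    for rel_id, ((src_t, _rel, dst_t), pairs) in enumerate(edges.items()):
        for src, dst in pairs:
            edge_index.append((offsets[src_t] + src, offsets[dst_t] + dst))
            edge_type.append(rel_id)
    return edge_index, edge_type, node_type, offsets


# Toy MAG-like graph with made-up sizes and edges.
num_nodes = {"paper": 3, "author": 2}
edges = {
    ("author", "writes", "paper"): [(0, 0), (1, 2)],
    ("paper", "cites", "paper"): [(0, 1), (2, 1)],
}
edge_index, edge_type, node_type, offsets = to_homogeneous(num_nodes, edges)
```

The resulting `edge_index` can be handed to any sampler written for homogeneous graphs, while `edge_type` and `node_type` let a relation-aware model recover the original types after sampling.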

oisc commented 3 years ago

Does "standard sampling techniques" mean snowball or BFS sampling?

weihua916 commented 3 years ago

It means GraphSAGE/ClusterGCN/GraphSAINT sampling, i.e., techniques developed for homogeneous graphs.
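
To make the contrast with plain BFS concrete, here is a simplified pure-Python sketch of GraphSAGE-style neighbor sampling. It is a sketch under stated assumptions, not the implementation used in the OGB scripts or in any library. Unlike BFS, each hop keeps at most a fixed number (`fanout`) of randomly chosen in-neighbors per node. As a simplification, only newly discovered nodes are expanded in the next hop. The function name `sample_neighborhood` and the toy edge list are hypothetical.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Set, Tuple


def sample_neighborhood(
    edge_index: Sequence[Tuple[int, int]],
    seeds: Sequence[int],
    fanouts: Sequence[int],
    rng: random.Random,
) -> Tuple[Set[int], List[Tuple[int, int]]]:
    """Sample a multi-hop subgraph around `seeds`, one fanout per hop."""
    # Index incoming edges by destination so messages flow toward the seeds.
    in_nbrs = defaultdict(list)
    for src, dst in edge_index:
        in_nbrs[dst].append(src)

    nodes: Set[int] = set(seeds)
    frontier = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    sampled_edges: List[Tuple[int, int]] = []
    for fanout in fanouts:
        next_frontier = []
        for dst in frontier:
            nbrs = in_nbrs.get(dst, [])
            # Keep all neighbors if there are few, else a random subset.
            picked = nbrs if len(nbrs) <= fanout else rng.sample(nbrs, fanout)
            for src in picked:
                sampled_edges.append((src, dst))
                if src not in nodes:
                    nodes.add(src)
                    next_frontier.append(src)
        frontier = next_frontier
    return nodes, sampled_edges


# Toy homogeneous edge list (made up): (source, destination) pairs.
edge_index = [(3, 0), (4, 2), (0, 1), (2, 1)]
nodes, sampled_edges = sample_neighborhood(
    edge_index, seeds=[1], fanouts=[2, 2], rng=random.Random(0)
)
```

ClusterGCN and GraphSAINT take a different route: rather than expanding from seed nodes hop by hop, they build each mini-batch from a subgraph (a graph partition, or a sampled set of nodes, edges, or random walks) and train on the edges induced inside it.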