snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu

How to encode the domain index? #268

Closed by JiuhaiChen 2 years ago

JiuhaiChen commented 2 years ago

Hi OGB Team, for ogbn-proteins and ogbl-ppa, I was wondering how you encode the species index into the node features. Do you append the raw species index to each node feature, or append its one-hot encoding? Since each graph belongs to only one species domain, do you encode the same species index into all node features within that graph? And for ogbg-ppa, ogbg-molhiv, and ogbg-molpcba, do you encode the species index into the node or edge features? Thanks!

weihua916 commented 2 years ago

For ogbg-ppa (graph classification), we did not use the species index, because it is annotated at the level of the entire graph. For ogbn-proteins (node classification) and ogbl-ppa (link prediction), you can use the species ID as a node feature.

The ogbg-mol* datasets are all molecule datasets, so they have no notion of species.
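
For the node-level case (ogbn-proteins, ogbl-ppa), one way to turn a per-node species ID into a node feature is a one-hot encoding, which was one of the options raised in the question. The sketch below is only an illustration, not the OGB team's method: the helper `one_hot_species` is hypothetical, the ID values are toy data, and it assumes the species IDs are available as a flat sequence of integers with one entry per node (if your loader returns them with shape `(N, 1)`, flatten them first). Raw IDs may be large and non-contiguous, so they are remapped to column indices before encoding.

```python
def one_hot_species(species_ids):
    """Map raw species IDs to one-hot rows, one row per node.

    Hypothetical helper for illustration; not part of the ogb package.
    """
    # Remap each distinct raw ID to a column index, in sorted order,
    # so the encoding width equals the number of distinct species.
    vocab = sorted(set(species_ids))
    column = {sid: i for i, sid in enumerate(vocab)}

    rows = []
    for sid in species_ids:
        row = [0.0] * len(vocab)
        row[column[sid]] = 1.0
        rows.append(row)
    return rows, vocab


# Toy stand-in for a per-node species array (values are illustrative).
node_species = [3702, 9606, 3702, 4932]
features, vocab = one_hot_species(node_species)

for sid, row in zip(node_species, features):
    print(sid, row)
```

Nodes from the same species receive identical rows. Appending the raw integer ID instead would impose an arbitrary ordering on species, which is the usual reason to prefer a one-hot or learned embedding for a categorical ID.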