snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License

Standard edge-label prediction datasets #422

Closed by LucaCappelletti94 1 year ago

LucaCappelletti94 commented 1 year ago

Hi all,

I am looking for standard edge-label prediction datasets to evaluate a new model.

If I am not mistaken, this task is not yet covered by OGB.

The BBOP team has developed several KGs and ontologies which may be used for such a task, as most of them have edge labels.

Currently, 800 different KG versions are available from KGOBO and KGHUB, covering 211 unique graphs. I hope that among these we can find a few that meet your data quality standards.

A knowledge graph I believe may be up to this task is KGCOVID19, which you can readily retrieve by running (after installing grape with pip install grape):

from grape.datasets.kghub import KGCOVID19

kg = KGCOVID19()

This will download the edge list from KGHUB. I suggest using the aforementioned grape library, since it gives you a comprehensive graph report that makes it easier to evaluate whether this (or another) available KG is suitable as a standard edge-label prediction benchmark dataset.
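As a rough illustration of one check that matters when judging a KG for edge-label prediction, the sketch below counts how often each edge label occurs and derives the accuracy of a majority-class baseline. It deliberately avoids the grape API and uses only the Python standard library. The toy triples and the helper name `edge_label_summary` are hypothetical and are not taken from KGCOVID19.

```python
from collections import Counter

# Toy (subject, edge_label, object) triples.
# Purely illustrative; this is NOT data from KGCOVID19.
edges = [
    ("gene:A", "interacts_with", "gene:B"),
    ("gene:B", "interacts_with", "gene:C"),
    ("drug:X", "treats", "disease:D"),
    ("drug:Y", "treats", "disease:D"),
    ("gene:A", "associated_with", "disease:D"),
    ("gene:C", "interacts_with", "gene:A"),
]


def edge_label_summary(triples):
    """Return per-label edge counts and the majority-class baseline accuracy.

    Hypothetical helper: the baseline is the accuracy obtained by always
    predicting the most frequent edge label.
    """
    counts = Counter(label for _, label, _ in triples)
    majority_count = counts.most_common(1)[0][1]
    return counts, majority_count / len(triples)


counts, baseline = edge_label_summary(edges)
print(counts)
print(f"majority-class baseline accuracy: {baseline:.2f}")
```

A heavily skewed label distribution pushes this baseline close to 1, which would make a graph a poor benchmark unless the evaluation metric accounts for class imbalance.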

I hope it will be possible to readily agree on some standard dataset for edge-label prediction.

I haven't found a Discord server for OGB, but if you'd like to talk about this quickly, I am available on GRAPE's Discord server.

Best, Luca

weihua916 commented 1 year ago

Hi Luca,

Thanks for reaching out! You are right that we do not have edge-label prediction datasets in OGB, so we welcome dataset contributions here! Here are the instructions on how to contribute datasets. Let us know if you have any further questions.

Best, Weihua