snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License

Raw text features for OGBN-MAG dataset #469

Open VeritasYin opened 5 months ago

VeritasYin commented 5 months ago

Hi, I was wondering if there is a way to access the raw text features of the paper nodes in the ogbn-mag dataset. I tried to fetch their metadata from the papers100M dataset, but only 600 of 73.6K papers matched on the entity_id provided in paper_entidx2name.csv. Meanwhile, the MAG service has been retired, so MAG can no longer be accessed from its website. Is there any chance the team still has the metadata, or is there an alternative way to obtain it? Also, do the paper entity IDs provided under the mapping folder correspond to OAG v1 or v2? Are they consistent with the entity names in the papers100M dataset? Thanks!
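For reference, the matching attempt described above can be reproduced roughly as follows. This is a minimal sketch on toy data, not the exact script that was run: the column names (`ent idx`, `ent name`, `mag_id`), the inline CSV contents, and the helper names `read_entity_ids` and `overlap` are all assumptions made for illustration, and the real mapping files would need to be decompressed and loaded from disk first.

```python
import csv
import io


def read_entity_ids(csv_text, id_column):
    """Return the set of entity IDs found in one column of a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Compare IDs as stripped strings so "123" and " 123" count as the same ID.
    return {row[id_column].strip() for row in reader}


def overlap(ids_a, ids_b):
    """Return the IDs present in both sets and the fraction of ids_a matched."""
    matched = ids_a & ids_b
    rate = len(matched) / len(ids_a) if ids_a else 0.0
    return matched, rate


# Toy stand-ins for the two mapping files (hypothetical contents and headers).
mag_csv = "ent idx,ent name\n0,9657784\n1,23228530\n2,55391902\n"
papers100m_csv = "idx,mag_id\n0,23228530\n1,99999999\n"

mag_ids = read_entity_ids(mag_csv, "ent name")
other_ids = read_entity_ids(papers100m_csv, "mag_id")

matched, rate = overlap(mag_ids, other_ids)
print(f"matched {len(matched)} of {len(mag_ids)} IDs ({rate:.1%})")
```

A very low match rate from a check like this would be consistent with the two datasets using different ID spaces or versions, which is what the OAG v1 vs. v2 question is trying to pin down.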