snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License

Mapping between ogbn-arxiv nodes and titles in the TSV files #222

Closed (paulgay closed this issue 3 years ago)

paulgay commented 3 years ago

Hi,

I am trying to figure out how the ogbn-arxiv nodes relate to the titles in the TSV file.

The titleabs.tsv file has 179720 lines, while the graph has 169343 nodes (graph.x.shape == [169343, 128]).

I understand that the fields of the TSV file are MAG ID, title, and abstract, but I could not find the MAG ID in the dataset object.

I loaded the dataset with:

from ogb.nodeproppred import PygNodePropPredDataset

d_name = "ogbn-arxiv"
dataset = PygNodePropPredDataset(name=d_name)
graph = dataset[0]  # PyG graph object

Am I missing something there?

paulgay commented 3 years ago

OK, I just found the mapping in the downloaded dataset folder, which contains the needed information. Closing the issue.
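
A minimal sketch of how the mapping can be joined to the titles, using only the standard library. It rests on assumptions not stated in this thread: that the mapping is a gzipped CSV with one header row followed by (node index, MAG paper ID) pairs, and that titleabs.tsv is a headerless, tab-separated file of (MAG paper ID, title, abstract). The function names are illustrative and not part of the ogb package; check the actual file name and columns in your download.

```python
import csv
import gzip


def load_node_to_mag_id(mapping_path):
    """Return {node index: MAG paper ID} from a gzipped CSV with a header row.

    Assumption: columns are (node index, MAG paper ID), in that order.
    """
    node_to_mag = {}
    with gzip.open(mapping_path, "rt", newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for node_idx, mag_id in reader:
            node_to_mag[int(node_idx)] = int(mag_id)
    return node_to_mag


def load_titles(tsv_path):
    """Return {MAG paper ID: title} from a TSV of (MAG ID, title, abstract).

    Assumption: the file has no header; rows whose first field is not
    numeric (blank, malformed, or header-like lines) are skipped.
    """
    titles = {}
    with open(tsv_path, encoding="utf-8", newline="") as f:
        for row in csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE):
            if len(row) < 2 or not row[0].isdigit():
                continue
            titles[int(row[0])] = row[1]
    return titles


def titles_by_node(mapping_path, tsv_path):
    """Return {node index: title or None} for every node in the mapping."""
    node_to_mag = load_node_to_mag_id(mapping_path)
    titles = load_titles(tsv_path)
    return {node: titles.get(mag_id) for node, mag_id in node_to_mag.items()}
```

Calling titles_by_node with the path to the mapping file (in my download it appears to be mapping/nodeidx2paperid.csv.gz inside the dataset folder, but verify this locally) and the path to titleabs.tsv gives a title for each row index of graph.x. Rows of the TSV whose MAG ID is not in the mapping are never looked up, which would account for the TSV having more lines than the graph has nodes.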