snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License

Efficiently creating undirected graphs in an ogb compatible format #241

Closed jqmcginnis closed 3 years ago

jqmcginnis commented 3 years ago

Hello ogb team!

I have created my own graphs in PyTorch Geometric and would like to convert these to PygLinkPropPredDataset in order to use existing algorithms (based on OGB Datasets) for link prediction tasks.

Creating graphs according to the manual has been fairly straightforward, but I am unsure how to handle the (un)directed attribute. If I load the graph as a directed graph (from PyTorch Geometric) and set dataset_dict[name]['add_inverse_edge'] = True in make_master_file.py, I get "RuntimeError: add_inverse_edge is depreciated in read_binary" when loading the dataset.

Alternatively, I could load the PyTorch Geometric dataset as an undirected graph and store all edges in both directions, setting dataset_dict[name]['add_inverse_edge'] = False. However, this would increase the dataset size, and I am unsure whether PygLinkPropPredDataset will understand that the graph is undirected.

What option(s) can you recommend?

Thank you very much for your help!

weihua916 commented 3 years ago

Hi!

We suggest one of the following: (1) Directly save the undirected graph (with bidirectional edges). (2) Save the directed graph and manually apply to_undirected yourself.

For both options, always set dataset_dict[name]['add_inverse_edge'] = False.

Both options are pretty efficient and close to the best you can do. (1) is a bit faster; (2) is more disk-efficient.
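
To make the trade-off concrete, here is a minimal pure-Python sketch of what option (2) amounts to: store each edge once, then rebuild both directions after loading. The helper name `symmetrize` and the toy edge list are hypothetical and only for illustration. In PyTorch Geometric, `torch_geometric.utils.to_undirected` performs the equivalent step on an `edge_index` tensor.

```python
def symmetrize(edges):
    """Return every edge in both directions, deduplicated and sorted.

    Hypothetical helper for illustration; not part of OGB or PyG.
    """
    both = set()
    for u, v in edges:
        both.add((u, v))
        both.add((v, u))
    return sorted(both)


# Option (2): only one direction per edge is stored on disk.
stored_edges = [(0, 1), (1, 2), (2, 3)]

# After loading, rebuild the bidirectional edge list that option (1)
# would have stored directly.
undirected_edges = symmetrize(stored_edges)

print(len(stored_edges), len(undirected_edges))  # prints "3 6"
```

The stored list is half the size of the bidirectional one, which is where the disk saving of option (2) comes from; option (1) skips the symmetrization step at load time instead.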