dmlc / dgl

Python package built to ease deep learning on graphs, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0

[RFC] DGL + cuGraph #2861

Open jermainewang opened 3 years ago

jermainewang commented 3 years ago

🚀 Feature

RAPIDS cuGraph is a high-quality library of GPU-accelerated graph algorithms (see the full list of supported algorithms). As many of these algorithms are the backbone of graph ML, we see more and more GNNs using them as building blocks. Currently, DGL provides CPU implementations for only a limited number of them, which forces users to either copy graphs between CPU and GPU frequently or write these algorithms from scratch. This proposal aims to initiate a discussion on how to combine the merits of both packages.

Some of the ideas come from the short discussion between @jermainewang and @jeaton32 in Mar. 2021.

Motivation

The RFC is motivated by real use cases and community requests. Here, I list some of the notable ones.

Accelerate GNNs with graph traversal

Several GNN models compute message passing along a graph traversal order; DGL's current API for this is sketched below.

The community also has been asking for more efficient implementations (see issue). Currently, DGL provides only a very limited set of graph traversal routines, and they have CPU implementations only. Moreover, all of these traversals need batch support (i.e., performing multiple traversals simultaneously), which, as confirmed by Joe, aligns with cuGraph's roadmap.
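
For reference, here is a minimal sketch of what DGL's traversal API offers today (CPU only); the toy graph is made up for illustration:

```python
import torch

import dgl
from dgl import traversal

# Toy directed graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
g = dgl.graph((torch.tensor([0, 0, 1, 2]), torch.tensor([1, 2, 3, 3])))

# bfs_nodes_generator yields one tensor of node IDs per BFS frontier.
for frontier in traversal.bfs_nodes_generator(g, 0):
    print(frontier)  # tensor([0]), then tensor([1, 2]), then tensor([3])
```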

Accelerate GNNs with mini-batch generation

Mini-batch training is an important topic in GNN research. The current training pipeline consists of two steps: (1) perform sampling on the input graph to generate mini-batches in the form of (smaller) subgraphs, and (2) compute message passing on the samples. There is plenty of evidence that sampling is the major bottleneck in this pipeline. Recently in DGL, we have made some initial efforts to move costly sampling steps to the GPU (see PR #2716). It would be interesting to see whether we could utilize the rich subgraph extraction APIs from cuGraph to speed it up further. One example is the IGMC model, which needs a different type of mini-batch that is essentially an EgoNet extraction.
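
To make the two steps concrete, here is a sketch of the current pipeline using DGL's existing APIs (names as of DGL 0.6; the toy graph and seed nodes are placeholders):

```python
import torch

import dgl

g = dgl.rand_graph(1000, 5000)   # toy stand-in for a large input graph
train_nids = torch.arange(100)   # seed nodes to generate mini-batches for

# Step (1): sample a fixed fanout of neighbors for each of two GNN layers.
sampler = dgl.dataloading.MultiLayerNeighborSampler([10, 10])
dataloader = dgl.dataloading.NodeDataLoader(
    g, train_nids, sampler, batch_size=32, shuffle=True, drop_last=False)

# Step (2): each iteration yields message-flow graphs ("blocks") on which
# a GNN would compute message passing.
for input_nodes, output_nodes, blocks in dataloader:
    pass  # forward/backward pass goes here
```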

Accelerate GNNs with random walk

Random walks are commonly used in network embedding models such as DeepWalk, node2vec, and metapath2vec. As random walk support is already on cuGraph's roadmap, it is interesting to see how it can further improve these models.
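
For context, DGL already exposes a (currently CPU-based) random walk routine; a minimal sketch with a made-up toy graph:

```python
import torch

import dgl

g = dgl.rand_graph(100, 2000)
seeds = torch.tensor([0, 1, 2, 3])

# Each row of `traces` is one walk: the seed node followed by `length` steps.
# A step is -1 if the walk got stuck at a node without out-edges.
traces, _ = dgl.sampling.random_walk(g, seeds, length=4)
print(traces.shape)  # torch.Size([4, 5])
```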

Accelerate low-level operators

These operators are not user-facing but are widely used as building blocks throughout the system. One example is renumbering, which is commonly applied after subgraph extraction to compact the node/edge ID space.
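
The operation itself is simple to state; here is a minimal sketch of renumbering in plain PyTorch (the edge list is hypothetical; DGL's own counterpart is `dgl.compact_graphs`):

```python
import torch

# Edges extracted from a large graph; node IDs are sparse global IDs.
src = torch.tensor([1000, 42, 1000, 7])
dst = torch.tensor([42, 7, 7, 1000])

# Renumbering: map the IDs that actually appear onto a compact 0..N-1 range.
unique_ids, compact = torch.unique(torch.cat([src, dst]), return_inverse=True)
new_src, new_dst = compact[:len(src)], compact[len(src):]
# `unique_ids[i]` recovers the original global ID of compact node i.
```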

Pitch

Additional context

In terms of technical feasibility, the two projects already happen to align in many respects. First, both frameworks support the DLPack protocol, which makes it possible to exchange array-like memory without copies. Second, cuGraph aims to provide APIs similar to NetworkX's, which was also one of DGL's initial design considerations.
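
As an illustration of the DLPack point, here is a minimal zero-copy exchange in PyTorch (requires a CUDA device; CuPy and cuDF expose analogous to/from-DLPack hooks):

```python
import torch
from torch.utils.dlpack import from_dlpack, to_dlpack

t = torch.arange(10, device="cuda")

capsule = to_dlpack(t)       # export the tensor as a DLPack capsule
t2 = from_dlpack(capsule)    # re-import it; no copy is made

# Both tensors alias the same device memory.
assert t2.data_ptr() == t.data_ptr()
```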

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you