dmlc / dgl

Python package built to ease deep learning on graphs, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0

[Graphbolt][Performance] Reduce the memory usage of `preprocess_ondisk_dataset` #7086

Open · czkkkkkk opened 9 months ago

czkkkkkk commented 9 months ago

🚀 Feature

Motivation

Currently, `preprocess_ondisk_dataset` consumes far more memory during preprocessing than the graph topology itself requires. Loading a graph with 2B nodes and 8B edges cannot finish on a machine with 380 GB of memory. After rough profiling, I found that peak memory usage is reached when converting a DGL graph to a fused sampling graph: https://github.com/dmlc/dgl/blob/4ee0a8bddbd93963b5f078c475381f4ab521d2e1/python/dgl/graphbolt/impl/ondisk_dataset.py#L212 Two factors likely contribute to the peak memory usage:

  1. The input DGL graph passed to the function consumes about 160 GB of memory on its own.
  2. `from_dglgraph` creates a temporary homogeneous graph and also its CSC format, so both representations coexist at the peak (one possible way around this intermediate is sketched after this list).
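
As a rough illustration of the second point, here is a minimal sketch of building the CSC arrays directly from COO edge tensors with plain `torch` ops, so that no intermediate homogeneous DGLGraph has to be materialized. This is not the project's actual code path; `coo_to_csc` is a hypothetical helper:

```python
import torch


def coo_to_csc(src: torch.Tensor, dst: torch.Tensor, num_nodes: int):
    """Build CSC arrays (indptr, indices) straight from COO edge tensors.

    `indices[indptr[v]:indptr[v + 1]]` holds the source nodes of all
    edges entering node `v`.
    """
    # Sort edges by destination so each node's in-edges are contiguous.
    perm = torch.argsort(dst)
    indices = src[perm]
    # indptr is the exclusive prefix sum of per-node in-degrees.
    in_degree = torch.bincount(dst, minlength=num_nodes)
    indptr = torch.zeros(num_nodes + 1, dtype=torch.int64)
    indptr[1:] = torch.cumsum(in_degree, dim=0)
    return indptr, indices


# Example: a tiny 3-node graph with edges 0->1, 2->1, 1->2.
src = torch.tensor([0, 2, 1])
dst = torch.tensor([1, 1, 2])
indptr, indices = coo_to_csc(src, dst, num_nodes=3)
# indptr == [0, 0, 2, 3]; indices == [0, 2, 1]
# (the order of a node's in-edges may vary, since the sort is unstable)
```

Building the arrays this way also lets the caller `del` the COO tensors (or the original DGLGraph) as soon as they are consumed, so the old and new representations would not have to coexist at the memory peak.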

Rhett-Ying commented 9 months ago

@Skeleton003 could you look into this and try it with the new implementation: https://github.com/dmlc/dgl/pull/6986

github-actions[bot] commented 8 months ago

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you.