ayasar70 opened this issue 2 years ago
Yes, these two operations are the bottleneck. I think one may be able to scale this onto GPUs by working on slices, i.e. processing a subset of feature columns at a time. This should scale SIGN to both single-GPU and multi-GPU setups without the need for a multi-GPU SpMM. What do you think?
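A minimal sketch of that column-sliced approach (the function name, chunk size, and chunking strategy here are mine, not from the SIGN transform), assuming the normalized sparse adjacency fits on one GPU while the dense feature matrix does not:

```python
import torch

def sign_spmm_column_sliced(adj, x, K, chunk_size=1024, device='cuda'):
    """Compute [A @ x, A^2 @ x, ..., A^K @ x] on the GPU, one column slice at a time.

    adj: normalized adjacency as a CPU torch.sparse_coo_tensor of shape (N, N)
    x:   dense CPU feature matrix of shape (N, F)
    Only the adjacency and one (N, chunk_size) feature slice live on the GPU at once.
    """
    adj = adj.coalesce().to(device)
    outs = [torch.empty_like(x) for _ in range(K)]
    for start in range(0, x.size(1), chunk_size):
        cols = slice(start, min(start + chunk_size, x.size(1)))
        h = x[:, cols].to(device)               # move only this feature slice
        for k in range(K):
            h = torch.sparse.mm(adj, h)         # SpMM on the GPU
            outs[k][:, cols] = h.cpu()          # stream the k-hop result back
    return outs
```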
Agreed. Column-wise slicing can be expensive, though. A multi-GPU SpMM can also be implemented using the same approach. In either case, CPU-GPU bandwidth and the slicing itself are going to be the bottleneck. I am investigating this problem with both Python/PyTorch-based and C++-based solutions and will update you if I observe a good speedup.
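For the multi-GPU case, the same slicing idea could look roughly like this (a sketch that assumes each GPU can hold a full copy of the sparse adjacency; actually overlapping the host-device transfers would need pinned memory and separate streams, which is exactly where the CPU-GPU bandwidth concern bites):

```python
import torch

def multi_gpu_spmm(adj, x, devices=('cuda:0', 'cuda:1')):
    # One SpMM hop, split column-wise across GPUs: every device gets a full
    # copy of the sparse adjacency and a disjoint slice of feature columns.
    adjs = [adj.coalesce().to(d) for d in devices]
    chunks = torch.chunk(x, len(devices), dim=1)
    outs = [torch.sparse.mm(a, c.to(d)) for a, c, d in zip(adjs, chunks, devices)]
    return torch.cat([o.cpu() for o in outs], dim=1)
```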
🚀 The feature, motivation and pitch
Hello, I have been working with the SIGN model. In a GPU-based setting, it seems the preprocessing has two bottlenecks: SparseTensor creation and the SpMM (https://github.com/pyg-team/pytorch_geometric/blob/master/torch_geometric/transforms/sign.py#L50). Both operations run on the CPU, and large graphs of course cannot fit into a single GPU's memory.
Do you think moving this computation to multiple GPUs would be helpful? If so, I can work on a C++ extension that takes the rows, columns, and feature matrix and outputs the K layers' SpMM results.
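Roughly, the contract I have in mind is the following pure-PyTorch reference (all names illustrative); the C++/multi-GPU extension would be a faster implementation of the same thing:

```python
import torch

def k_hop_spmm_reference(rows, cols, values, x, K):
    # Reference for the proposed op: build the (normalized) adjacency from COO
    # inputs and return [A @ x, A @ (A @ x), ...] for K hops.
    n = x.size(0)
    adj = torch.sparse_coo_tensor(torch.stack([rows, cols]), values, (n, n)).coalesce()
    outs, h = [], x
    for _ in range(K):
        h = torch.sparse.mm(adj, h)
        outs.append(h)
    return outs
```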
Best
Alternatives
No response
Additional context
No response