pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

Upgrading SpGEMM algorithm to resolve Cusparse SpGEMM insufficient resources problem #103820

Open qikunxun opened 1 year ago

qikunxun commented 1 year ago

🚀 The feature, motivation and pitch

The SpGEMM algorithm in CUDA 11.x requires a large amount of memory for sparse computation. In CUDA 12, two new SpGEMM algorithms have been introduced to resolve the problem. I really hope the new algorithms can be integrated into PyTorch (even just providing a way to use the new algorithms would be exciting :) ). Thanks. Please see https://github.com/NVIDIA/CUDALibrarySamples/issues/38.

Alternatives

No response

Additional context

No response

cc @alexsamardzic @nikitaved @pearu @cpuhrsch @amjames @bhosmer @ptrblck

amjames commented 1 year ago

Unfortunately this is not as simple as changing which CUSPARSE_SPGEMM_ALG* flag is used.

Here are a few notes if anyone wants to pick this up:

It would also be nice to have a performance comparison. I would assume that if the algorithm is more memory efficient, it must sacrifice performance in some way, which would make the heuristic for when to activate it more complicated.
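As a rough illustration of what such a heuristic might look at, the sketch below counts the scalar products `a_ik * b_kj` needed for `C = A @ B` directly from CSR index arrays (this count is also an upper bound on the number of nonzeros in `C`), then compares a memory estimate against a budget. Everything here is an assumption for illustration only: the function names, the `bytes_per_product` constant, and the two strategy labels are hypothetical and do not correspond to anything in PyTorch or cuSPARSE. The index arrays are plain Python lists laid out like the values of `crow_indices()` and `col_indices()` on a CSR tensor.

```python
def count_intermediate_products(a_crow, a_col, b_crow):
    """Count the scalar products a_ik * b_kj needed for C = A @ B (both CSR).

    For every nonzero a_ik in A, row k of B contributes one product per
    nonzero it holds, so we sum the length of B's row k over A's nonzeros.
    """
    total = 0
    for row in range(len(a_crow) - 1):
        for idx in range(a_crow[row], a_crow[row + 1]):
            k = a_col[idx]
            total += b_crow[k + 1] - b_crow[k]
    return total


def pick_spgemm_strategy(num_products, free_bytes, bytes_per_product=16):
    """Hypothetical heuristic: fall back to a memory-efficient path when the
    estimated workspace would not fit in the available memory.

    bytes_per_product is a made-up constant, not a measured cuSPARSE figure.
    """
    estimated_bytes = num_products * bytes_per_product
    return "default" if estimated_bytes <= free_bytes else "memory_efficient"


# A is 3x3 with nonzeros at (0,0), (0,2), (1,1), (2,0).
a_crow = [0, 2, 3, 4]
a_col = [0, 2, 1, 0]
# B is 3x3 with 1, 2 and 1 nonzeros in rows 0, 1 and 2.
b_crow = [0, 1, 3, 4]

num_products = count_intermediate_products(a_crow, a_col, b_crow)
strategy = pick_spgemm_strategy(num_products, free_bytes=64)
```

A real heuristic would need measured workspace sizes and timings for each algorithm rather than a fixed per-product constant, which is why the performance comparison matters.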