joamatab opened 6 months ago
Is there a CUDA-compatible sparse matrix solver we can use instead of KLU here?
Yes, @bdice used CuPy for that.
Where can we find some benchmark code for it?
It would be great to compare CPU to GPU performance.
Hi @joamatab and @flaport -- first, thank you for your time. I met with @joamatab at PyCon as part of the Accelerated Python sprint. We discussed using a CUDA-based backend for this library. CuPy seemed like the easiest choice.
Here's a brief rundown of what this PR contains:
The new "cuda"
backend requires cupy
but it is not set as the default, because it is not compatible with JAX JIT and thus cannot be used for optimization. I don't know how crucial JAX JIT / differentiable backends are for the problems you typically use here.
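For context, a minimal sketch of how one might opt into the new backend. The backend= argument mirrors how the existing klu backend is selected; that call signature, and the placeholder my_netlist / my_models names, are assumptions for illustration rather than something this PR pins down.

```python
import sax

# Assumption: the "cuda" backend is opted into explicitly; the default backend
# stays unchanged, so JAX-JIT / differentiable workflows keep working as before.
# `my_netlist` and `my_models` are placeholders for your own circuit definition.
circuit, info = sax.circuit(          # recent sax releases return (circuit, info)
    netlist=my_netlist,
    models=my_models,
    backend="cuda",                   # select the CuPy/cuSolver-based backend
)
```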
The "cuda"
backend uses CuPy, which calls into cuSolver. The cupyx.scipy.sparse.linalg.spsolve
function does not support batched sparse solves. It seems like this is a common use case in sax
-- but the only solution I have for now is to use a raw for
loop, which may not be ideal for performance. There may be future CUDA libraries that serve this use case with a fully-batched solver, which would be able to provide further acceleration on batches of smaller sparse matrices.
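For concreteness, the batched workaround described above amounts to something like the sketch below. This is not the PR's actual code; solve_batch and its inputs are illustrative.

```python
import cupy as cp
import cupyx.scipy.sparse as cpsp
from cupyx.scipy.sparse.linalg import spsolve

def solve_batch(As, bs):
    """Solve A_i x_i = b_i for each system in a batch, one at a time.

    cupyx.scipy.sparse.linalg.spsolve handles a single sparse system per call
    (via cuSolver), so the batch dimension falls back to a plain Python loop.
    `As` is a sequence of CSR matrices on the GPU, `bs` a sequence of vectors.
    """
    return [spsolve(A, b) for A, b in zip(As, bs)]

# Moving host-side SciPy/NumPy data to the GPU before solving:
# A_gpu = cpsp.csr_matrix(A_scipy)   # A_scipy: scipy.sparse.csr_matrix
# b_gpu = cp.asarray(b_numpy)
# x_gpu = spsolve(A_gpu, b_gpu)
# x = cp.asnumpy(x_gpu)              # copy the solution back to the host
```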
The main use case I would see for this backend is to enable sparse solves when you have a small number of very large sparse matrices. For a performance evaluation, I would try this with a very large sparse matrix. I wasn't able to find an example for benchmarking this in the repository, so I haven't pursued that any further.
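In lieu of a proper benchmark, here is a rough sketch of the kind of CPU-vs-GPU comparison I have in mind. The matrix size and density are arbitrary, the CPU side uses SciPy's SuperLU-based spsolve as a stand-in for the repository's KLU backend, and actual timings will depend heavily on the sparsity structure and the GPU.

```python
import time
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla
import cupy as cp
import cupyx.scipy.sparse as cpsp
from cupyx.scipy.sparse.linalg import spsolve as cu_spsolve

# Arbitrary size/density for illustration; a shifted diagonal keeps the
# random system nonsingular. Tune these to resemble your real circuits.
n = 50_000
A = sp.random(n, n, density=5e-5, format="csr") + 10 * sp.eye(n, format="csr")
b = np.random.rand(n)

t0 = time.perf_counter()
x_cpu = spla.spsolve(A.tocsc(), b)            # CPU: SuperLU via SciPy
cpu_time = time.perf_counter() - t0

A_gpu, b_gpu = cpsp.csr_matrix(A), cp.asarray(b)
cp.cuda.Device().synchronize()
t0 = time.perf_counter()
x_gpu = cu_spsolve(A_gpu, b_gpu)              # GPU: cuSolver via CuPy
cp.cuda.Device().synchronize()
gpu_time = time.perf_counter() - t0

print(f"CPU: {cpu_time:.3f}s  GPU: {gpu_time:.3f}s")
print("max abs diff:", np.max(np.abs(cp.asnumpy(x_gpu) - x_cpu)))
```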
I also ported some of the test/example notebooks into proper tests. Running the quick start notebook caught some errors in the CUDA backend, which were easily fixable but were not covered by the existing tests. I also expanded the tests to compare the CUDA and KLU backends for the sample data provided in a test notebook.
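The backend comparison in those tests boils down to something like the following. The build_sdict helper and the backend= argument are hypothetical stand-ins for however the tests actually evaluate a circuit, not the test code itself.

```python
import numpy as np

def compare_backends(build_sdict):
    """Check that the CUDA and KLU backends agree on the same circuit.

    `build_sdict` is a hypothetical callable that evaluates a circuit with the
    given backend and returns S-parameters as a dict keyed by (in_port, out_port).
    """
    s_klu = build_sdict(backend="klu")
    s_cuda = build_sdict(backend="cuda")
    assert set(s_klu) == set(s_cuda)
    for ports, value in s_klu.items():
        np.testing.assert_allclose(value, s_cuda[ports], rtol=1e-5, atol=1e-8)
```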
It was great to meet @joamatab and I hope this is helpful -- I won't be able to commit significantly more time here, except to address PR reviews. Please feel free to give it a try; if you see good (or bad) performance, let me know, as I'm interested in how it performs on large sparse matrices. If you find it's not worth adding for any reason, I won't be offended if you close the PR. It was fun to learn about this solver and the problems you're using it for!
Best wishes to you, and thanks for maintaining this as an open-source project!
Hi @bdice, thank you so much for your contribution. Adding a CUDA backend is something I've wanted for a long time! I'm currently in the middle of a big move, so I won't have much time to review this week, but rest assured this is one of the first things on my todo list next week. I'll also add a benchmarking suite for future reference. I'm interested to see where it lands :)
Inspired by @jan-david-fischbach @Vivswan @flaport @bdice