OHDSI / bayes-bridge

Bayesian sparse regression with regularized shrinkage and conjugate gradient acceleration
https://bayes-bridge.readthedocs.io/en/latest/

[Draft] Add cupy support for sparse matrix cg sampling #15

Closed · chinandrew closed this 2 years ago

chinandrew commented 2 years ago

Add cupy support to the `gibbs()` function for sparse matrices using the cg sampling method.

Currently the cupy operations are confined to the `reg_coef_sampler` and `cg_sampler` modules, since that is where the heavy linear algebra occurs. They could theoretically be spread more widely, but I think that would add complexity for marginal performance gains.
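For context, a minimal sketch of the pattern described above, assuming cupyx's drop-in sparse CG (this is not the PR's actual code; `A_csr` and `b` are placeholders for the sparse symmetric positive-definite system the cg sampler builds from the design matrix):

```python
import cupy as cp
import cupyx.scipy.sparse
from cupyx.scipy.sparse.linalg import cg


def solve_spd_on_gpu(A_csr, b):
    """Solve A x = b on the GPU for a sparse, symmetric positive-definite A.

    A_csr is a scipy.sparse.csr_matrix and b a numpy array, both on the host.
    """
    A_gpu = cupyx.scipy.sparse.csr_matrix(A_csr)  # one-time host-to-device copy
    b_gpu = cp.asarray(b)
    # cupyx mirrors scipy.sparse.linalg.cg, so the sampler's call site
    # barely changes between the CPU and GPU paths.
    x_gpu, info = cg(A_gpu, b_gpu)
    return cp.asnumpy(x_gpu), info  # copy the solution back for the CPU chain
```

Keeping the host-to-device copies at this boundary is what lets the rest of the chain stay on the CPU untouched.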

Probably needs a couple of tests to verify that errors are properly raised and messaged for failed imports or not-implemented paths, and then I'll remove the draft status.
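Roughly the shape those tests might take (a sketch only: the `dense_bridge`/`sparse_bridge` fixtures and the `gibbs()` keyword names other than `use_cupy` are assumptions, not the PR's actual test code):

```python
import sys

import pytest


def test_use_cupy_with_dense_design_raises(dense_bridge):
    # A dense design should fail loudly rather than silently fall back to CPU.
    with pytest.raises(NotImplementedError):
        dense_bridge.gibbs(n_burnin=0, n_post_burnin=10, use_cupy=True)


def test_use_cupy_without_cupy_installed(sparse_bridge, monkeypatch):
    # Setting sys.modules["cupy"] to None makes any subsequent `import cupy`
    # raise ImportError, simulating a missing install (this assumes the
    # package imports cupy lazily rather than at module load).
    monkeypatch.setitem(sys.modules, "cupy", None)
    with pytest.raises(ImportError):
        sparse_bridge.gibbs(n_burnin=0, n_post_burnin=10, use_cupy=True)
```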

chinandrew commented 2 years ago

> Did you forget `()`? Also, how about we make `self.model.design._allocate_cupy_matrix()` throw a `NotImplementedError` when the design matrix is dense? It is more explicit. That also avoids having to import `SparseDesignMatrix` just for a type check, which would break the modular design. (Though I probably break that elsewhere.)

Yup, I forgot it :facepalm:. I like your suggestion too; I've updated the code accordingly. The cupy option should also be working as of this latest commit. It would be nice to clean up some of the numerous `if self.use_cupy` branches, but it is fully functional as is with a simple `gibbs(..., use_cupy=True)`. The chain initialization is all done on the CPU, and the GPU is used for the cg sampling.
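Concretely, the agreed-on dispatch looks roughly like this (a sketch; the `_allocate_cupy_matrix` name comes from the thread, but the class structure around it is assumed):

```python
class AbstractDesignMatrix:
    def _allocate_cupy_matrix(self):
        # Dense designs get no GPU path; failing here is more explicit than
        # an isinstance check inside the sampler.
        raise NotImplementedError(
            "use_cupy is only supported for sparse design matrices."
        )


class SparseDesignMatrix(AbstractDesignMatrix):
    def __init__(self, X_csr):
        self.X = X_csr  # scipy.sparse.csr_matrix kept on the host

    def _allocate_cupy_matrix(self):
        # Lazy import so the package still works without cupy installed;
        # a missing cupy surfaces as an ImportError only when requested.
        import cupyx.scipy.sparse
        self.X_gpu = cupyx.scipy.sparse.csr_matrix(self.X)  # one-time copy
```

The sampler then just calls `self.model.design._allocate_cupy_matrix()` and lets either error propagate, so `reg_coef_sampler` never needs to know the concrete design-matrix class.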

On gi_bleed, loading the data takes 22 seconds, the chain initialization takes 102 seconds, and the gibbs iterations take 32 seconds (for 30 iterations). A handful of odds-and-ends operations account for another <10 seconds, for a total run time of 2 min 44 s. For reference, two iterations on CPU took 209 seconds, i.e. ~104.5 s per iteration versus 32/30 ≈ 1.1 s on GPU, so almost exactly 100x faster per gibbs iteration.

chinandrew commented 2 years ago

Closing in favor of #16