viennacl / viennacl-dev

Developer repository for ViennaCL. Visit http://viennacl.sourceforge.net/ for the latest releases.
Other
281 stars 89 forks source link

Full Cholesky Decomposition? #195

Open cdeterman opened 8 years ago

cdeterman commented 8 years ago

I know there is the Incomplete Cholesky Factorization, which I understand is a 'sparse' approximation of cholesky, but is there currently an existing or planned implementation of a full Cholesky Factorization of dense matrices?

karlrupp commented 8 years ago

Yes, currently there is only an LU factorization, no Cholesky factorization. It should be added, yes, but I don't have any particular plans. Contributions welcome :-)

rok-cesnovar commented 7 years ago

Hi, I have a blocked Cholesky implementation in OpenCL that outperforms your existing LU factorization with a speedup of 2-5, depending on the GPU. Would you be interested in my contribution?

Kind regards, Rok

karlrupp commented 7 years ago

Hi, sure, all contributions are welcome! :-) Doall yourblocks need to be the same size? Does it work well across many different block sizes?

rok-cesnovar commented 7 years ago

Maybe the term block cholesky is not the best. My implementation is inspired by this ( https://www.thinkmind.org/download.php?articleid=iccgi_2013_11_30_10133 ) and some other papers. It basically recursively solves the decomposition as described in these 6 steps: screenshot from 2017-05-08 20 34 16

It may not look promising at first glance, but it turns out that is. The speedup does depend on the block size (it does require trying some tuning - manually trying some options). But generally it performs well for almost any block size.

karlrupp commented 7 years ago

Ok, I see. This 'blocked' algorithm is actually similar to how our LU factorization is implemented (and similar to how most other libraries implement it).

rok-cesnovar commented 7 years ago

Ok. If you think this would be of any help to you, I would gladly contribute my work. Do you have any guidelines for adding features or should I just try to replicate the style in which your other features are implemented?

karlrupp commented 7 years ago

Here's a guideline on coding style: http://viennastar.iue.tuwien.ac.at/wiki/doku.php?id=codingstyle I recommend creating a pull request once your contribution is ready for review. If you could also provide a CUDA- and OpenMP-version of your factorization, it would definitely help - but it's not mandatory.

rok-cesnovar commented 7 years ago

Great, thanks!