GEOS-DEV / GEOS

GEOS Simulation Framework
GNU Lesser General Public License v2.1
203 stars 80 forks source link

Extend BLAS interface #3173

Closed dkachuma closed 5 days ago

dkachuma commented 2 weeks ago

Extends the BLAS/LAPACK interface to include linear solves where the solution and rhs are matrices. The addition is a function to solve $$AX=B$$ where $A$ is an $n{\times}n$ matrix and $X$ and $B$ are $n{\times}m$ matrices.

This also adds an "in-place" version of solveLinearSystem. This takes the matrix $A$ and the rhs $B$ only. In using this, the caller is allowing LAPACK to destroy both the $A$ and $B$. On exit $A$ is replaced by an LU factorisation of $A$ and $B$ is replaced by the solution (in true LAPACK style). This has the advantage of not requiring extra allocations of memory within the call.

For the row-major version, I couldn't figure out how to call BLAS/LAPACK. So I was forced to allocate space anyway for the transposition of the solution matrix $X$. Two exceptions to this are 1) $m=1$ the vector case in which case row-major vs col-major is meaningless and 2) $m=n$ in which case $X$ can be transposed in place.

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 55.89%. Comparing base (f05008b) to head (376b61d).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## develop #3173 +/- ## =========================================== + Coverage 55.77% 55.89% +0.11% =========================================== Files 1035 1037 +2 Lines 87914 88148 +234 =========================================== + Hits 49035 49271 +236 + Misses 38879 38877 -2 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

dkachuma commented 2 weeks ago

It seems that lapacke follows this same approach and there's nothing much we can do, see:

I actually looked at lapacke to get some inspiration and was gravely disappointed to see that they just copy. I also considered in-place transposes but they are so complicated and quite unreadable.

victorapm commented 2 weeks ago

Yes, avoiding the copy would be great... Hmm, if this is to be called inside a FEM kernel or, generally speaking, called multiple times, I guess we could allocate the extra work space only once outside the solveLinearSystem function. At least, we get away with constant alloc/dealloc this way. Just sharing the idea...

dkachuma commented 1 week ago

Yes, avoiding the copy would be great... Hmm, if this is to be called inside a FEM kernel or, generally speaking, called multiple times, I guess we could allocate the extra work space only once outside the solveLinearSystem function. At least, we get away with constant alloc/dealloc this way. Just sharing the idea...

Wasn't sure why kind of threaded code this is called from. Didn't want to introduce something that might complicate that. But I'm sure it can be done. Anyhow the current uses of this are only 1d so we shouldn't be hitting this issue yet.