kokkos / kokkos-kernels

Kokkos C++ Performance Portability Programming Ecosystem: Math Kernels - Provides BLAS, Sparse BLAS and Graph Kernels

LAPACK Wrappers Request #1642

Open jennloe opened 1 year ago

jennloe commented 1 year ago

Hi Team,

I am working on the Belos linear solvers in Trilinos to remove the dependence on Teuchos::SerialDenseMatrix and add an interface that uses Kokkos::Views instead. :) Regardless of how I design the interface, I will still need to call several LAPACK functions that are not directly implemented in KokkosKernels. Thus, my best option currently seems to be grabbing raw pointers to the Kokkos::Views and handing those to the Teuchos::LAPACK wrappers (see here), similar to what is already being done in Belos.

It would be nice if we could have Kokkos Kernels wrappers for these instead. In particular, the functions I know are needed right now include: POTRF, POTRS, PTEQR, GEQRF, GETRF, GETRS, GETRI, GESV, GEEV, GESVD, GELS, GELSS, GGEVX, LARTG, UNGQR, and SYGV.

And I have probably missed a few. Will update this thread when I find them. Thanks!

jennloe commented 1 year ago

Also, it is worth discussing whether it would be best to hide some of the lower-level LAPACK details, such as the leading dimension of A, workspace vectors, and so forth.
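
To make the question concrete, here is a minimal sketch of what hiding the leading dimension could look like. Everything in it is an assumption for illustration: `MatrixView` is a hypothetical stand-in for a rank-2 column-major Kokkos::View, `raw_potrf_lower` is a hand-written unblocked Cholesky standing in for a real POTRF call, and `potrf_lower` is a made-up wrapper name, not an existing KokkosKernels function. POTRF needs no workspace; for routines that do, a wrapper along these lines could presumably size and allocate it internally in the same spirit.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a rank-2, column-major view. A real interface
// would take a Kokkos::View; this struct only mimics the pieces a wrapper
// needs: data pointer, extents, and column stride.
struct MatrixView {
  double* data;
  int rows;
  int cols;
  int stride;  // distance between consecutive columns, i.e. LAPACK's "lda"

  double& operator()(int i, int j) const {
    return data[static_cast<std::size_t>(j) * stride + i];
  }
};

// LAPACK-style entry point: the caller passes n and lda explicitly.
// Unblocked lower Cholesky, A = L * L^T, overwriting the lower triangle.
// Returns 0 on success, or k > 0 if the leading minor of order k is not
// positive definite (the same convention as POTRF's info argument).
int raw_potrf_lower(int n, double* a, int lda) {
  auto at = [&](int i, int j) -> double& {
    return a[static_cast<std::size_t>(j) * lda + i];
  };
  for (int j = 0; j < n; ++j) {
    double d = at(j, j);
    for (int k = 0; k < j; ++k) d -= at(j, k) * at(j, k);
    if (d <= 0.0) return j + 1;
    d = std::sqrt(d);
    at(j, j) = d;
    for (int i = j + 1; i < n; ++i) {
      double s = at(i, j);
      for (int k = 0; k < j; ++k) s -= at(i, k) * at(j, k);
      at(i, j) = s / d;
    }
  }
  return 0;
}

// View-based wrapper: n and lda are read off the view, so neither appears
// in the caller's code.
int potrf_lower(const MatrixView& A) {
  assert(A.rows == A.cols);
  assert(A.stride >= A.rows);
  return raw_potrf_lower(A.rows, A.data, A.stride);
}
```

With this shape, a caller holding a 2x2 matrix inside a padded buffer with column stride 3 would just write `potrf_lower(A)`; the stride travels with the view instead of being a separate argument that can be passed incorrectly.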

srajama1 commented 1 year ago

@eeprude You could give adding one of these a shot if you want.

eeprude commented 1 year ago

Thanks for the comment, @srajama1. I will look into it.

lucbv commented 1 year ago

Updating the list a bit based on things we already have implemented:

POTRF
POTRS
PTEQR
GEQRF  --> MueLu has also wanted this for a while
GETRF
GETRS --> I think trsm/trmm implements this but need to double-check
GETRI  --> combination of already implemented functions, should be easy
GESV  --> already implemented
GEEV
GESVD  --> I think we looked at this already, Paul Kuberry needed it at some point?
GELS
GELSS
GGEVX
LARTG
UNGQR  --> I think there is a PR for this from a student somewhere?
SYGV

lucbv commented 1 year ago

@eeprude maybe start by looking at the dependency graph of these functions and see if we have all the BLAS calls required to implement them. If not, we should start by getting the BLAS pieces in first and then move on to the LAPACK pieces. Also, we might look at creating a new component if the LAPACK/SOLVER routines grow in number?
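
As one small slice of that dependency graph, the sketch below shows how GETRS and GETRI can be layered on a GETRF factorization. It is an illustration under stated assumptions, not KokkosKernels code: the lowercase function names are hypothetical, matrices are plain column-major `std::vector<double>` with lda equal to n, and the pivot array is 0-based (LAPACK's ipiv is 1-based). GETRS here is row swaps followed by two triangular solves, which is the part a TRSM call would cover. The GETRI stand-in solves A*X = I through GETRS, whereas LAPACK's GETRI inverts U and then solves inv(A)*L = inv(U); both yield the inverse.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Column-major n x n matrix stored contiguously (lda == n) to keep this short.
using Matrix = std::vector<double>;

// GETRF-like: in-place LU with partial pivoting, P*A = L*U.
// piv[k] is the row swapped with row k at step k (0-based).
// Returns 0 on success, or k+1 as soon as an exactly zero pivot is found
// (this sketch stops there instead of completing the factorization).
int getrf(int n, Matrix& a, std::vector<int>& piv) {
  piv.assign(static_cast<std::size_t>(n), 0);
  for (int k = 0; k < n; ++k) {
    int p = k;
    for (int i = k + 1; i < n; ++i)
      if (std::fabs(a[i + k * n]) > std::fabs(a[p + k * n])) p = i;
    piv[k] = p;
    if (a[p + k * n] == 0.0) return k + 1;
    if (p != k)
      for (int j = 0; j < n; ++j) std::swap(a[k + j * n], a[p + j * n]);
    for (int i = k + 1; i < n; ++i) {
      a[i + k * n] /= a[k + k * n];
      for (int j = k + 1; j < n; ++j) a[i + j * n] -= a[i + k * n] * a[k + j * n];
    }
  }
  return 0;
}

// GETRS-like: solve A*X = B in place for nrhs right-hand sides.
// Step 1: apply the row swaps to B. Step 2: forward-solve with unit-lower L.
// Step 3: back-solve with U. Steps 2 and 3 are the TRSM-shaped pieces.
void getrs(int n, int nrhs, const Matrix& lu, const std::vector<int>& piv, Matrix& b) {
  assert(b.size() >= static_cast<std::size_t>(n) * nrhs);
  for (int c = 0; c < nrhs; ++c) {
    double* x = b.data() + static_cast<std::size_t>(c) * n;
    for (int k = 0; k < n; ++k) std::swap(x[k], x[piv[k]]);
    for (int i = 0; i < n; ++i)
      for (int k = 0; k < i; ++k) x[i] -= lu[i + k * n] * x[k];
    for (int i = n - 1; i >= 0; --i) {
      for (int k = i + 1; k < n; ++k) x[i] -= lu[i + k * n] * x[k];
      x[i] /= lu[i + i * n];
    }
  }
}

// GETRI-like: build the inverse by solving A*X = I with the routine above.
Matrix getri(int n, const Matrix& lu, const std::vector<int>& piv) {
  Matrix inv(static_cast<std::size_t>(n) * n, 0.0);
  for (int i = 0; i < n; ++i) inv[i + i * n] = 1.0;
  getrs(n, n, lu, piv, inv);
  return inv;
}
```

Read this way, GETRS depends on a row-permutation step plus TRSM, and a GETRI built like this depends only on GETRF and GETRS, so once the triangular-solve pieces are confirmed the remaining work for these two is mostly plumbing.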