dpilger26 / NumCpp

C++ implementation of the Python Numpy library
https://dpilger26.github.io/NumCpp
MIT License

np.linalg.eigvals === ?? #143

Open LetGo opened 2 years ago

LetGo commented 2 years ago

hi~ Will you consider supporting np.linalg.eigvals in the future?

oleg-kachan commented 2 years ago

Any progress on this?

dpilger26 commented 2 years ago

It's slated for the next release. A new baby at home has slowed things down though so will probably be another month or two.

DarthData410 commented 1 year ago

Is this still needed? I am looking for a good project to contribute to. C/C++ and Python are the two languages I am writing an interop series on for Medium.com, so I thought I would ask. The same applies to the other open "help wanted" issues here.

dpilger26 commented 1 year ago

Yes, if you've got knowledge in this area I'd be happy to see pull requests.

DarthData410 commented 1 year ago

Ok great... I will fork the repo and get started this weekend. I will message you if I have any questions. We are talking about adding support for this: https://numpy.org/doc/stable/reference/generated/numpy.linalg.eigvals.html

dpilger26 commented 1 year ago

Yep, that's it

DarthData410 commented 1 year ago

Ok, so I have spent some time looking at what numpy.linalg.eigvals(x) is really doing. The eigenvalues of a general matrix (or numpy.array-like object) are calculated by the LAPACK library's "_geev" routines. LAPACK is written in Fortran but has a C language API, LAPACKE (https://netlib.org/lapack/lapacke.html), that exposes the complex return types as well as the underlying geev routine for generating the eigenvalues of a square matrix.

All that to say: if the goal is to TRULY implement numpy.linalg.eigvals(x), then the LAPACKE C API would have to be incorporated into NumCpp. Thoughts?

DarthData410 commented 1 year ago

You can see the actual umath_linalg.cpp that numpy.linalg.eigvals(x) calls here: https://github.com/numpy/numpy/blob/8cec82012694571156e8d7696307c848a7603b4e/numpy/linalg/umath_linalg.cpp starting at line 2379, eig_wrapper(...), which calls init_geev(...) at line 2403. I have a working example of LAPACKE_cgeev(...) just so I could walk through it. Anyway, I just wanted to share a bit more around this so a decision on direction could be made.

DarthData410 commented 1 year ago

Ok... So I took a dive through NumCpp, for example: https://github.com/dpilger26/NumCpp/blob/master/include/NumCpp/Linalg/cholesky.hpp Instead of using the same Fortran LAPACK library, NumCpp actually rewrites the logic that the LAPACK subroutines provide (in NumPy's case, LAPACK subroutines -> surfaced through numpy/linalg/umath_linalg.cpp -> Python numpy.linalg module).

I will be honest: my focus has been C++ / Python interop. I can surface LAPACK in C/C++, which I am doing to break down the results and better understand a subroutine such as LAPACK "dgeev".

That said, achieving what numpy.linalg.eigvals(x) offers using NumCpp and its NdArray object is a decent-sized scope, if done completely that is.

numpy.linalg.eigvals(x) passes 'N' for the job flags so that neither the left nor the right eigenvectors are returned; its scope is only generating the eigenvalues of a general, square matrix. However, it detects the dtype being used and calls sgeev for (C++) float / (Fortran) real data, dgeev for double, etc.
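To make the dtype point concrete, here is a minimal sketch of an eigenvalues-only routine templated on the element type, so that one definition covers the float and double cases that LAPACK splits into sgeev and dgeev. The name `eigvals2x2` is hypothetical and is not part of NumCpp, NumPy, or LAPACK; it only handles a 2x2 matrix through its characteristic polynomial, which is not what geev does internally. It does show why the return type has to be complex even for real input.

```cpp
#include <array>
#include <cmath>
#include <complex>

// Hypothetical helper (not a NumCpp or LAPACK function): eigenvalues of the
// real 2x2 matrix [[a, b], [c, d]], found as the roots of
// lambda^2 - trace * lambda + det = 0. No eigenvectors are computed.
template <typename T>
std::array<std::complex<T>, 2> eigvals2x2(T a, T b, T c, T d)
{
    const T trace = a + d;
    const T det   = a * d - b * c;
    const T half  = trace / T(2);
    const T disc  = half * half - det; // discriminant divided by 4

    if (disc >= T(0))
    {
        // Two real eigenvalues, returned with a zero imaginary part.
        const T root = std::sqrt(disc);
        return { std::complex<T>(half + root, T(0)),
                 std::complex<T>(half - root, T(0)) };
    }

    // A complex-conjugate pair, which a real matrix can still produce.
    const T root = std::sqrt(-disc);
    return { std::complex<T>(half, root), std::complex<T>(half, -root) };
}
```

Calling `eigvals2x2<double>(0.0, -1.0, 1.0, 0.0)` on a 90-degree rotation matrix yields a purely imaginary conjugate pair, so an initial double-only version would still need a complex result type.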

Should the scope of an initial ../linalg/eigvals.hpp be limited to an NdArray of type double?

I'm just trying to get a solid scope of what can be targeted initially, as this is truly a replacement of the LAPACK fortran routines that numpy's C++ actually exposes and uses. Feedback / thoughts?

In short, I don't want to waste time developing something that really should not be the first focus, or even of focus at all.

dpilger26 commented 1 year ago

I had a goal when I started this library to keep it header only, and minimize the dependencies (only dependency is boost, and even that can be turned off for a standalone library). For these reasons I've avoided pulling in LAPACK routines (for better or worse...).
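For illustration only, a dependency-free eigenvalue routine could look something like the sketch below. It is not NumCpp code and does not use NdArray. It runs plain unshifted QR iteration with Gram-Schmidt, rather than the Hessenberg reduction plus shifted QR that dgeev uses, and it assumes a real, symmetric, nonsingular matrix whose eigenvalues have distinct magnitudes. The names `Matrix` and `eigvalsSymmetricQR` are made up for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical sketch: eigenvalues of a real symmetric matrix by unshifted
// QR iteration. Assumes a is square, nonsingular, and has eigenvalues of
// distinct magnitude; there is no convergence check, only a fixed count.
inline std::vector<double> eigvalsSymmetricQR(Matrix a, std::size_t iterations = 200)
{
    const std::size_t n = a.size();
    for (std::size_t it = 0; it < iterations; ++it)
    {
        // Factor a = q * r using modified Gram-Schmidt on the columns of a.
        Matrix q(n, std::vector<double>(n, 0.0));
        Matrix r(n, std::vector<double>(n, 0.0));
        for (std::size_t j = 0; j < n; ++j)
        {
            std::vector<double> v(n);
            for (std::size_t i = 0; i < n; ++i) { v[i] = a[i][j]; }
            for (std::size_t k = 0; k < j; ++k)
            {
                double dot = 0.0;
                for (std::size_t i = 0; i < n; ++i) { dot += q[i][k] * v[i]; }
                r[k][j] = dot;
                for (std::size_t i = 0; i < n; ++i) { v[i] -= dot * q[i][k]; }
            }
            double norm = 0.0;
            for (std::size_t i = 0; i < n; ++i) { norm += v[i] * v[i]; }
            norm = std::sqrt(norm);
            r[j][j] = norm;
            for (std::size_t i = 0; i < n; ++i) { q[i][j] = v[i] / norm; }
        }

        // Replace a with r * q, which is similar to the previous a and so
        // has the same eigenvalues. r is upper triangular, so k starts at i.
        Matrix next(n, std::vector<double>(n, 0.0));
        for (std::size_t i = 0; i < n; ++i)
        {
            for (std::size_t j = 0; j < n; ++j)
            {
                for (std::size_t k = i; k < n; ++k) { next[i][j] += r[i][k] * q[k][j]; }
            }
        }
        a = next;
    }

    // Under the stated assumptions the diagonal approaches the eigenvalues.
    std::vector<double> eigenvalues(n);
    for (std::size_t i = 0; i < n; ++i) { eigenvalues[i] = a[i][i]; }
    return eigenvalues;
}
```

A general (non-symmetric) matrix can have complex eigenvalues, which this sketch does not handle; covering that case is where most of the work in the geev family goes.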

DarthData410 commented 1 year ago

I understand the direction, and the decision behind it. In my mind it is best to stick with such design goals. With that stated, I have mapped out the LAPACK Fortran subroutines dgeev (double values) and sgeev (float values), which do the same things and just operate on different IN/OUT linearized matrices (arrays).

I have exposed these LAPACK Fortran routines as extern "C" calls, and then wrapped them in C++. You can find the details here: https://github.com/DarthData410/CppPyInterop/tree/main/lapack_cpp

That is a sub-project of the greater CppPyInterop repo I have been working on for a series of medium.com articles of C/C++ <-> Python interop focus. Two birds, one stone concept.

All of that to say, a good many subroutines are called within the "*geev" LAPACK Fortran subroutines. I have mapped some of them and exposed a lower-level subroutine, drot() for example, just to see if I could call it from C++ within that "clap" example set; I was successful in exposing it and 'test' calling it. The goal was to map out everything being done, in order to implement a C++ version of what numpy.linalg.eigvals(x) does within NumCpp, operating on the NdArray object you built.

It is going to take a little bit to map this out, as there has been a lot of work into those subroutines since the 1970's, as documented within the code itself.

My real question is: is it worth the effort to continue? I don't know how long it will take...