ginkgo-project / ginkgo

Numerical linear algebra software package
https://ginkgo-project.github.io/
BSD 3-Clause "New" or "Revised" License

Eigensolver support in Ginkgo #135

Open gflegar opened 6 years ago

gflegar commented 6 years ago

Introduction

This is an issue to discuss how eigensolvers could be added to Ginkgo.

In general, eigensolvers compute an eigendecomposition A Q = Q Λ (or equivalently A = Q Λ Q^-1) of a matrix A, where Q is a square matrix whose columns are the eigenvectors, and Λ is a diagonal matrix containing the eigenvalues of A.
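
For illustration, here is the definition checked numerically on a small symmetric matrix (a numpy sketch, not Ginkgo code):

import numpy as np

# A small symmetric example, so an eigendecomposition with orthogonal Q exists.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
lam, Q = np.linalg.eigh(A)                     # eigenvalues and eigenvectors
assert np.allclose(A @ Q, Q @ np.diag(lam))    # A Q = Q Lambda
assert np.allclose(A, Q @ np.diag(lam) @ Q.T)  # A = Q Lambda Q^-1 (Q^-1 = Q^T here)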

The basics

Here is a reminder of some basic linear algebra concerning the eigendecomposition, which might have to be considered in the discussion (please add anything that is missing and should be considered).

Numerical methods

The first step in deciding on the interface is to see what the methods actually do: what inputs they need, what outputs they provide, and how they compute them. If anyone knows a good online reference on the various methods, please link it here; I don't know of one in English. I'll try to quickly go over the basic methods (using my very limited understanding of eigenvalue algorithms) to give a starting point.

While writing this section I found that there is Templates for the Solution of Algebraic Eigenvalue Problems. Not sure how good it is; probably the same style as the templates for linear systems. It's in HTML format, so not very pleasant to read, and I did not find a (non-shady) PDF available.

Rayleigh quotient

The first observation that can be made is that it's all about the eigenvectors. If we know that q is an eigenvector of A, then even if the corresponding eigenvalue λ is not known, it can easily be computed from the equation Aq = λq. Multiplying the entire thing by q^* and dividing the result by q^*q gives us the formula λ = (q^*Aq) / (q^*q) (which is just an SpMV + 2 dot products). Note that the quantity R(A, q) = (q^*Aq) / (q^*q) is meaningful even if q is not an eigenvector, and is called the Rayleigh quotient. In the context of eigendecomposition algorithms it is used to approximate the eigenvalue from an approximation of the eigenvector.
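
In numpy terms, the computation is just (a sketch for illustration, not Ginkgo code):

import numpy as np

def rayleigh_quotient(A, q):
    """R(A, q) = (q^* A q) / (q^* q): one SpMV and two dot products."""
    Aq = A @ q                             # SpMV
    return np.vdot(q, Aq) / np.vdot(q, q)  # two (conjugated) dot products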

Power iteration

The power iteration is probably the simplest of all, and can only compute the eigenvector associated to the eigenvalue of largest magnitude. The method is basically a fixed-point iteration with f(q) = Aq / ||Aq||:

import numpy as np

def power_iteration(A, q0, has_converged):
    """
        A:  input matrix (anything supporting A @ q)
        q0: initial guess for the eigenvector
        has_converged: a function which determines if the approximation is good enough
    """
    q = q0 / np.linalg.norm(q0)
    while not has_converged(A, q):
        q = A @ q
        q = q / np.linalg.norm(q)
    return q

Very intuitive proof of convergence: The linear operator A = Q Λ Q^-1 acts on any vector as "scaling" along directions q1, ..., qn by λ1, ..., λn, respectively. Assuming λ1 is the unique(!) eigenvalue of largest magnitude, the scaling of the vector in that direction will be larger than in any other direction, thus the contribution of the component in the direction q1 will increase relative to the other directions. After enough iterations, all other components will die off, and we're left with q1. Notice how this doesn't work if there are multiple eigenvalues of the largest magnitude, or if the initial guess does not have a component in direction q1. The convergence rate is determined by the ratio between the largest and the second largest eigenvalue (in magnitude).

Design questions

Q1: What can has_converged be, and how do we pass it? One possibility is ||Aq - R(A,q) q|| < tol * R(A, q) for some tolerance level tol. Is this very distinct from the stopping criteria we have for solvers? Are there cases where the eigensolver should have a stopping criterion of a solver (and is there a meaningful definition of the rhs b), or vice-versa, a linear system solver which should stop when a good enough approximation of an eigenvector (of something) is found?

Q2: This method only computes one eigenvector, not the full decomposition. How do we represent this?
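
To make Q1 concrete, the criterion above could be packaged like this (a sketch; the absolute value on R(A, q) is my addition):

import numpy as np

def make_has_converged(tol):
    """Returns a has_converged callback: ||Aq - R(A,q) q|| < tol * |R(A,q)|."""
    def has_converged(A, q):
        Aq = A @ q
        rq = np.vdot(q, Aq) / np.vdot(q, q)  # Rayleigh quotient R(A, q)
        return np.linalg.norm(Aq - rq * q) < tol * abs(rq)
    return has_converged

# usage: power_iteration(A, q0, make_has_converged(1e-8))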

(Shifted) inverse iteration

If we apply the power method to A^-1, we get the eigenvector (of A^-1) associated to the largest eigenvalue 1/λn of A^-1. Directly from the definition it follows that that same eigenvector is the eigenvector of A associated to its smallest eigenvalue λn. Of course, A^-1 is never formed explicitly; its application is realized as the solution of a linear system.

That's the largest and the smallest eigenvalues. The next step is to look into the operator A - zI (where z is a scalar). Its smallest eigenvalue is the smallest among the numbers λi - z (i.e. the λi closest to z), and the associated eigenvector once again is also the eigenvector of A associated to λi. Thus, applying the power method to (A - zI)^-1 (solving a shifted system at each step) results in an eigenvector associated to the unique(!) eigenvalue closest to z. This method is known as inverse iteration. There's also a related method called Rayleigh quotient iteration, which replaces z with the Rayleigh quotient of the current approximation at each step. There is also an example implementation of this method in Ginkgo, under examples/inverse_iteration.
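
A minimal numpy sketch of (shifted) inverse iteration, using a dense solve where Ginkgo would use a solver for the shifted system (a fixed iteration count stands in for a proper stopping criterion):

import numpy as np

def inverse_iteration(A, z, q0, num_iters=50):
    """Finds the eigenvector of A whose eigenvalue is closest to the shift z."""
    n = A.shape[0]
    shifted = A - z * np.eye(n)  # (A - zI); a sparse solver would be used instead
    q = q0 / np.linalg.norm(q0)
    for _ in range(num_iters):
        q = np.linalg.solve(shifted, q)  # apply (A - zI)^-1
        q = q / np.linalg.norm(q)
    return q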

Design questions

Q3: The stopping criterion for inverse iteration can be expressed in terms of the original problem, as in "how far is my approximation from an eigenvector of A", or in terms of the problem we are actually solving, as in "how far is my approximation from an eigenvector of (A - zI)^-1". Do both of these criteria make sense? Is there even a difference between the two, or do they always give the same answer?

Q4: Would we even need a separate implementation of the inverse iteration method, or could we just express it as the power method applied to the operator (A - zI)^-1? We already have the possibility to form this operator in Ginkgo: the shift can be expressed using gko::Combined, and the inverse with any solver.

Q5: Depending on the other answers, do we need the possibility to have a method that operates on one operator, but uses a stopping criterion associated to another operator? If we implement inverse iteration in terms of the power method, the operator used for computations is (A - zI)^-1, and the operator used for the stopping criterion might be A.

Subspace (or orthogonal) iteration

This is another improvement of the power method, which can compute the eigenvectors associated to the p largest eigenvalues of A. The idea is to iterate a p-dimensional subspace, instead of just a single vector, which will then converge towards an invariant subspace of A. The basis has to be re-orthogonalized at each step, as otherwise all the vectors would converge towards the eigenvector associated with the largest eigenvalue.
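
A sketch of the basic scheme in numpy, using a QR factorization for the re-orthogonalization:

import numpy as np

def subspace_iteration(A, Q0, num_iters=100):
    """Iterates a p-dimensional subspace towards the dominant invariant subspace.

        A:  n x n input matrix
        Q0: n x p matrix whose columns span the initial subspace
    """
    Q, _ = np.linalg.qr(Q0)
    for _ in range(num_iters):
        Z = A @ Q
        Q, _ = np.linalg.qr(Z)  # re-orthogonalize the basis
    return Q                    # orthonormal basis of the invariant subspace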

Further refinement of the method in the dense case for p = n results in the QR algorithm and the implicit QR algorithm. I don't know if something similar can be done in the sparse case.

What do more advanced methods (especially for sparse matrices) look like? It would be great if someone could give a general description of them, so we can discuss possible designs.

FEAST

This is a method I'm looking into right now, but not yet in a lot of detail. What I can contribute here is that it needs to solve shifted systems in each iteration, for different shifts. It also needs numerical integration and a dense eigensolver, as it reduces a large sparse problem to a small dense problem.
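
To make that structure concrete, here is a rough numpy sketch of a single FEAST subspace-projection step (the standard Hermitian problem, a circular contour with the trapezoidal rule, and dense solves are all simplifying assumptions on my part; a real implementation would use sparse solvers for the shifted systems):

import numpy as np
import scipy.linalg

def feast_step(A, Y, center, radius, num_quad=8):
    """One subspace-projection + Rayleigh-Ritz step for a standard Hermitian problem.

        A: n x n Hermitian matrix
        Y: n x p current subspace (e.g. random on the first call)
        center, radius: circle in the complex plane enclosing the wanted eigenvalues
    """
    n, p = Y.shape
    U = np.zeros((n, p), dtype=complex)
    for j in range(num_quad):
        theta = 2 * np.pi * (j + 0.5) / num_quad
        z = center + radius * np.exp(1j * theta)
        # each quadrature point requires the solution of a shifted linear system
        U += (radius * np.exp(1j * theta) / num_quad) \
             * np.linalg.solve(z * np.eye(n) - A, Y)
    # Rayleigh-Ritz: project onto the filtered subspace -> small dense eigenproblem
    lam, W = scipy.linalg.eigh(U.conj().T @ A @ U, U.conj().T @ U)
    return lam, U @ W  # approximate eigenpairs inside the contour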

Design questions

Q6: What do we do with all the extra functionalities that are needed for some methods? Do we implement everything ourselves? That would basically mean that at some point we would need our own implementation of the whole of BLAS and LAPACK, with optimized kernels for everything. We would also need to create a sub-package for numerical integration, etc. If we use third-party software, what do we use? What do we use for different executors? Note that this is not only tied to eigensolvers, but also to linear solvers. For example, GMRES already needs some orthogonal transformations, which could be provided by another library to remove the burden from us. I assume we'll have more advanced methods where this will become a necessity.

Q7: If we decide we do need other libraries, what do we do with support for user-defined value types? Are there libraries with this capability that we can use? If not, what should we do?

Applications

I would really like to have something in this section, but my knowledge here is even worse than in the previous section. How are eigensolvers used in practice? It should affect the design, as we want to provide an interface that delivers the results in a format that feels natural for applications. If someone could contribute this section, or knows someone who can, that would be great...

gflegar commented 6 years ago

Here is my rough idea for now, given the information I have. Since an eigensolver produces a (partial) eigendecomposition, we could treat it the same way as we plan to treat other decompositions. It could be a LinOpFactory, which takes A as input and produces the "eigendecomposition LinOp" as output. The eigendecomposition operator is just a Combined operator which contains the factors of the eigendecomposition. You can get the factors from the Combined operator.

For a full decomposition the result represents exactly the same linear operator as the original matrix A, but computes the result in three steps. This is probably not very useful, except that you could theoretically solve systems with it directly.

For a partial decomposition, it could be viewed as a low-rank approximation of the original matrix, which might be useful for some applications. For example, if L is the operator representing the partial decomposition of A, then A - L replaces the eigenvalues of A found in L with 0, or by changing the eigenvalues of L we can shift the corresponding eigenvalues of A to whatever we like. Not sure if this corresponds to real applications though.

The idea of Combined and how it can be used to express various things is explained in more detail in issue #26, and the support for it was added in #132.

Here is a copy of the discussion from #133, where I explain it in a bit more detail, and raise some questions that should be answered:

If you look at the eigenvalue decomposition A V = V L, then A = V L V^-1, which means you could treat it as any other factorization algorithm, which will be represented as factories that produce a combined operator containing the factors. That's easy for normal matrices (A^*A = AA^*), where V^-1 = V^*, so there's no additional computation. If V is not full (i.e. only some eigenpairs are computed), but A is still normal, we could also do A -> V L V^*, where the resulting operator represents a low-rank approximation of A.

The only problem is what to do if the matrix is not normal, so V does not have to be orthogonal. Then, one could do a low-rank approximation as A ~ V L V^+, but computing the pseudo-inverse might be expensive. We'll have to think about it more when we decide to support such matrices.
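
To illustrate the low-rank view from the normal case above, a small numpy sketch (the variable names are mine):

import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                  # symmetric, hence normal

lam, V = np.linalg.eigh(A)         # full decomposition, ascending eigenvalues
Vp = V[:, -p:]                     # eigenvectors of the p largest eigenvalues
L = Vp @ np.diag(lam[-p:]) @ Vp.T  # partial decomposition V L V^* as a low-rank operator

# A - L is A with the p largest eigenvalues replaced by 0:
rest = V[:, :-p] @ np.diag(lam[:-p]) @ V[:, :-p].T
assert np.allclose(A - L, rest)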

Concerning inverse iteration specifically, I'm not sure we want to have it - it's not a very robust algorithm. It can only find one eigenvalue, and only if it's unique. Do people actually need this in practice, or would you always do something a bit more advanced? The natural next step would be to iterate a multidimensional subspace (leading to QR iterations when iterating the full space; not sure if there's an equivalent for sparse).

tcojean commented 6 years ago

Thanks for taking the time to write all of this. These are useful resources; I will need more time to process them properly.

I wanted to mention that this book may be of interest: https://www-users.cs.umn.edu/~saad/eig_book_2ndEd.pdf; some of the methods you exposed can be found there (notably in chapter 4).

gflegar commented 6 years ago

Btw, everyone should feel free to edit the description of the issue however they see fit (think of it as a wiki page). The idea of it is to provide a general introduction to the topic, and point out the design issues, so people can follow the discussion in the comments.

The comments can be used to propose and discuss possible designs.

pratikvn commented 6 years ago

For the different algorithms, I think ARPACK is a pretty good source. The original library is in FORTRAN, but they have a beta version, ARPACK++, in templated C++.

gflegar commented 6 years ago

Do they have support for different architectures, e.g. GPUs?

EDIT: Ah, you're talking about an implementation we could use as a reference, correct? I thought you were answering Q6.

gflegar commented 6 years ago

> Last update: May 27, 1998.

Did they forget to update the update date? :confused:

pratikvn commented 6 years ago

Hm, they don't seem to have a GPU version. I think that is the actual last update, but I am not sure. The user guide says 2001. :|

yhmtsai commented 6 years ago

For FEAST and contour-integral-based eigensolvers, this paper shows the relationship among several contour-integral-based eigensolvers. These solvers solve not only standard eigenvalue problems, but also generalized eigenvalue problems.

When using a contour-integral-based eigensolver, we need to estimate the number of eigenpairs. For example, given a region containing 100 eigenpairs, if we apply FEAST on an n x 10 subspace, FEAST does not get any correct eigenpair. These solvers need a 'large enough' initial subspace, so we can use the estimated number of eigenpairs to randomly generate the initial subspace.

yhmtsai commented 6 years ago

For the application part, sparse eigensolvers can be used in organic material simulations: Numerical aspect of large-scale electronic state calculation for flexible device material. Note: the eigenvalue problem there is a generalized eigenvalue problem. They only need the information in a specified region, but they use the dense eigensolver eigenkernel to solve for all eigenpairs. That needs a lot of nodes to solve one problem, whereas a contour-integral-based eigensolver can solve it on one node with better performance. They use the eigenvalues and the corresponding Participation Ratio (PR), which is calculated from the eigenvectors, to analyze the type of material.
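
For reference, the participation ratio of an eigenvector q is typically computed as PR(q) = (sum_i |q_i|^2)^2 / sum_i |q_i|^4 (this formula is my addition, not taken from the paper):

import numpy as np

def participation_ratio(q):
    """PR(q) = (sum |q_i|^2)^2 / sum |q_i|^4: roughly the number of
    components on which the eigenvector has significant weight."""
    w = np.abs(q) ** 2
    return w.sum() ** 2 / (w ** 2).sum()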

yhmtsai commented 5 years ago

I have added the subroutines needed by FEAST or block SS-RR here. If we would like to implement these eigensolvers in Ginkgo, we should implement these subroutines first. For the symmetric version:

The sparse matrix axpy is maybe not needed, because the operation is only required when solving the linear system. Defining a LinOp which represents zB - A is also a workable route.

yhmtsai commented 5 years ago

I would treat the eigensolver like the linear system solvers. Eigensolvers have stopping criteria. For sparse matrices, in my understanding, usually only part of the eigenpairs is computed, so the eigendecomposition will not be an exact decomposition. Moreover, the generalized eigenvalue problem might not yield a decomposition of A and B. In my imagination:

auto solver_gen =
     power_method::build()
     .with_criteria(something)
     .on(exec);
auto solver = solver_gen->generate(A, B);
// x is the initial guess, if one is needed; x will hold the eigenvectors.
// b will hold the eigenvalues.
solver->apply(lend(b), lend(x));

hartwiganzt commented 5 years ago

Hmm... this is not yet clear to me. If we want to express an eigensolver as a linear operator x = A(b), wouldn't we need the eigenvalues (b) before computing the eigenvectors?

yhmtsai commented 5 years ago

Oh, I forgot that the linear solver takes b as the input. An eigensolver does not need b before computing the eigenvectors. For an iterative method, the input is only the initial guess x: op(A, B)x or op(A)x => get x and b. For the full decomposition, the eigensolver does not need an initial guess: op(A) or op(A, B) => x, b.

yhmtsai commented 5 years ago

How about creating an Eigensolver class with a stopping criterion? Using an eigensolver as a LinOp is somewhat strange (e.g. op(A) acts on x, not b).

thoasm commented 5 years ago

I did not quite follow the whole issue (and probably have the least knowledge about eigensolvers of all of us), but from what I understand, this issue was created to discuss the interface. If I understand @gflegar in one of his first replies correctly, he proposed to make a LinOpFactory, where you call generate(A) to get a Combination, which basically contains both the eigenvectors and the eigenvalues. It would look like this in the end:

auto eigen_factory =
     power_method::build()
     .with_criteria(something)
     .on(exec);
// Generate eigenvalue & eigenvector Combination for matrix A
auto eigen_combination = eigen_factory->generate(A);
auto eigen_values = eigen_combination->get_coefficients();
auto eigen_vectors = eigen_combination->get_operators();

This is at least how I understand his comment, and to me, it also looks like a nice and intuitive interface.

yhmtsai commented 5 years ago

Thanks @thoasm!

Should a LinOpFactory be able to apply? I do not know what the apply operation of the result of a generalized eigenvalue problem would look like.

auto eigen_combination = eigen_factory->generate(A, B, X); — here X would be changed during the process, and X would hold the resulting eigenvectors. It is different from a linear solver, so I am not sure whether it is okay.

thoasm commented 5 years ago

Do you only want to generate the eigenvectors, or also the eigenvalues? If you don't need any return value, then my approach does not work, since I wanted to return all the results. Also, the semantics would be wrong, since generate would not actually generate anything, so using a Factory would not work in general.

If you are fine with returning the result (I think that returning a Combination is very well suited), you can also have an optional parameter which would be the initial guess for X. The final Eigenvectors would however be created separately and returned as the operators of the Combination. After the generate, there is no need for a self-implemented apply since all calculations would happen inside the generate step.

Just so we are on the same page: a Combination (defined in include/ginkgo/core/base/combination.hpp) contains a linear combination of multiple linear operators, i.e. c1 * op1 + c2 * op2 + ... + ck * opk. I think the coefficients c1, ..., ck would be a good fit for the eigenvalues, while the operators op1, ..., opk would then represent the eigenvectors.

yhmtsai commented 5 years ago

Combination is suitable for storing the vectors and values, but c1 * op1 + c2 * op2 + ... + ck * opk is meaningless here. In the block/matrix version of the eigen-algorithms, some kind of A*E is used (E holds the eigenvectors/eigensubspace).

I do not want the eigensolver to be a LinOp, because Ginkgo's functions take LinOp as their input parameter. If the eigensolver were a LinOp, we could put it into these functions without any errors at compile time. In my view, the eigensolver should be similar to a solver, but not be a LinOp. We can still create the eigen-factorization of a matrix, which just calls the corresponding eigensolver for the standard problem.

auto eigen_factory =
     power_method::build()
     .with_criteria(something)
     .on(exec);
auto eigen = eigen_factory->generate(A, initial_parameter);
// or: auto eigen = eigen_factory->generate(A, B, initial_parameter);
eigen->produce(X);
auto eigen_values = eigen->get_eigenvalue();
// or: eigen_values = eigen->produce(X);

hartwiganzt commented 5 years ago

> If eigensolver is one of LinOp, we may put the eigensolver into these functions without errors during compile.

That actually is a valid point. Someone may use an eigensolver as a preconditioner, and the code will compile without problems. We may think about having a different class, maybe... probably @pratikvn @tcojean @thoasm @gflegar can all give their opinions?

thoasm commented 5 years ago

I definitely agree that it should not be a solver, since apply does not fit here. Since you said that the Combination structure (c1 * op1 + c2 * op2 + ... + ck * opk) is meaningless, it also makes sense not to build on top of it.

Currently, I think that a LinOpFactory, or a Factory in general, might be good here, in combination with a self-defined class which stores the eigenvalues and eigenvectors separately. I would prefer if we could keep the result as a LinOp; however, that kind of requires a well defined apply on the final product. Does it make sense to store a Composition of Q and Λ, so an apply(b, x) would basically mean: x = Q Λ Q^-1 b? That would be very similar to how we handle the ILU, while still having a LinOp. To me, that would make sense; however, my experience with eigensolvers or the usage of the results is almost non-existent, so you have to decide if that can make sense, or if it is also useless.
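
To spell out what that apply would do, a small numpy sketch of the three-step application (the solve stands in for applying Q^-1; this is just an illustration, not a proposed implementation):

import numpy as np

def apply_eigendecomposition(Q, lam, b):
    """x = Q diag(lam) Q^-1 b, applied factor by factor like a Composition."""
    w = np.linalg.solve(Q, b)  # w = Q^-1 b
    w = lam * w                # w = diag(lam) w  (the Diagonal operator)
    return Q @ w               # x = Q w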

For that, we probably need a Diagonal matrix format to store the eigenvalues efficiently, but I think that is fairly straightforward (it might need some time to implement, however, since there are a lot of conversions to do).

However, I am against forcing the eigensolver to be a LinOp while it cannot act as one; that can lead to confusion for the user and result in very annoying errors, for example if somebody wants to use it as a preconditioner.

Edit note: Corrected wrong apply equation.

gflegar commented 5 years ago

I agree with @thoasm's last comment.

Making an eigensolver a LinOp definitely does not make sense, simply because the result of the operation is not an application of some linear operator. For starters, linear operators operate on vectors (sure, we pass a matrix there, but only to represent multiple independent vectors), while eigensolvers operate on matrices.

As I suggested previously, it may be a good idea to represent an eigensolver as a LinOpFactory, which would then generate() the (partial) eigendecomposition, with each factor being a LinOp. BTW, a small correction of @thoasm's last comment: since the eigendecomposition is A = Q Λ Q^-1, applying the resulting operator would effectively do x = Q Λ Q^-1 b, and not x = Q Λ b (consistent with what ILU would do). However, there are certain trade-offs with this solution. I know enough about eigensolvers to ask the questions, but not enough to answer them, so here they are (also, take a look at all the questions in the original post, and try to answer them):

  1. Is this a useful representation? The solution does fit the concept of LinOpFactory, but will anyone ever use it as one? I.e. will there ever be a need to pass the eigensolver factory into something else in the library that expects a LinOpFactory?
  2. Do we have solvers/applications where we only need the eigenvalues, and not the eigenvectors? If so, then building it as a composition may not be such a great idea. (There is a workaround of still returning the decomposition, but with the LinOp representing Q lazily computed, only if someone requests it. This then starts to look very ugly, and is most likely not very useful.)
  3. What do we do about the generalized eigenvalue problem (B^-1 A = Q Λ Q^-1; see the sketch below)? How do we pass in B? One option would be that the user somehow builds the operator B^-1 A and uses it as an input for the eigendecomposition factory, which then somehow figures out that this is a generalized eigenvalue problem based on the structure of the input operator. It starts getting messy at this point.
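
For reference, the generalized problem asks for A V = B V Λ; here is what it looks like with a dense scipy solver (purely illustrative, the names are mine):

import numpy as np
import scipy.linalg

rng = np.random.default_rng(1)
n = 50
A = rng.standard_normal((n, n))
A = (A + A.T) / 2            # symmetric A
B = rng.standard_normal((n, n))
B = B @ B.T + n * np.eye(n)  # symmetric positive definite B

lam, V = scipy.linalg.eigh(A, B)  # generalized problem: A V = B V diag(lam)
assert np.allclose(A @ V, B @ V @ np.diag(lam))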

In conclusion, if there are no real use cases for having eigendecompositions as a LinOpFactory, it may be a better idea to build a separate hierarchy for them. Still, you would need to think about how to best encapsulate all the different variations of the eigendecomposition problem (doing a full decomposition vs. only some eigenpairs, computing only eigenvectors or only eigenvalues, finding a generalized vs. simple eigendecomposition). There's no single answer here; it's probably best to think about the use-cases, try out different designs, and choose the one that has the best trade-off between brevity, extensibility, and being "in the spirit of Ginkgo".

yhmtsai commented 4 years ago

@nghiakvnvsd We are still discussing the eigensolver structure/interface. Do you have any use cases? That would be very helpful to make sure the design is suitable for applications. The following is the current structure: we do not make the eigensolver a LinOp directly, but use the result to get the corresponding eigenvectors/eigenvalues or the eigendecomposition.

template <typename ValueType>
class eigen_base {
    public:
        // maybe needs some restriction to choose the satisfying eigenpairs;
        // the default depends on the eigensolver
        get_decomposition(solver_factory, norm_criterion);
        get_eigenvector(norm_criterion); // maybe a lazy operation: computed by Gaussian elimination from the eigenvalues.
        get_eigenvalue(norm_criterion);  // maybe a lazy operation: computed by the Rayleigh quotient from the eigenvectors.
    protected:
        compute_eigenvector_impl(); // Gaussian elimination (could be overridden), maybe only on dense? Solve (A - DB)x = 0 (nontrivial solution)
        compute_eigenvalue_impl();  // Rayleigh quotient implementation (could be overridden); x'Ax / x'Bx
    private:
        LinOp A;
        LinOp B; // may be nullptr or the identity
        Dense E;
        Diag D;
        norm_criterion;
};
eigen_solver : EnableEigenBase {
    factory {
        ....
    }
    generate(A);    // B is I for a generalized eigensolver
    generate(A, B); // handles A_ = B*A eigenvalues (or is not-implemented) on a standard eigensolver
    // the eigenvectors or eigenvalues could be lazy operations, depending on the eigensolver algorithm.
};

Usage:

factory->generate(A, B); // eigen_solver result
// the following will delete the internal eigen_base
factory->generate(A, B)->get_eigenvector();         // Dense
factory->generate(A, B)->get_eigenvalue();          // Diag
factory->generate(A, B)->get_decomposition(solver); // Composition
// A = BED [solver on E] (BEDE^-1) on a generalized eigensolver
// A' = BA = EDE^-1 on a standard eigensolver

LOBPCG:
parameter:
  subspace (number or given subspace)
  criterion
lobpcg->generate(A, B); // the largest eigenpairs
lobpcg->generate(A - alphaI); // the shift method: the eigenpairs closest to alpha
lobpcg->generate(solver(A - alphaI)); // the shift-inverse method

FEAST:
parameter:
  subspace (number or given subspace)
  center, radius (only compute the eigenpairs inside this circle)
  criterion
FEAST->generate(A, B); // the needed eigenpairs
// estimating the number of eigenpairs in the desired region
// maybe just a function is enough?
// auto estimated = contour_based_estimate(A, B, region, subspace=nullptr);

We are still working on it, so this interface might not be the final version. Any comments are welcome!