Jutho / KrylovKit.jl

Krylov methods for linear problems, eigenvalues, singular values and matrix functions

eigsolve fails to converge at times #76

Closed PetrKryslUCSD closed 9 months ago

PetrKryslUCSD commented 9 months ago

It appears to be random. Running tests of https://github.com/PetrKryslUCSD/GEPHelpers.jl.git sometimes succeeds, and sometimes fails, because the tests of the :KrylovKit method did not pass.

Not sure how to debug. It might be helpful to have a look at the code: perhaps I'm not using it right? https://github.com/PetrKryslUCSD/GEPHelpers.jl/blob/fbdc383a7b9e57288e82c19e244327e916e14909/src/gep_smallest.jl#L134C1-L134C1

Jutho commented 9 months ago

I will try, but I first want to make sure that I understand the type of problem you're solving correctly. Let me see if I get this right:

You want to find the smallest magnitude eigenvalues and corresponding eigenvectors of the generalised eigenvalue problem

K v = lambda M v

where K and M are symmetric/hermitian? I assume M is positive definite, since you want eigenvectors to satisfy sqrt(dot(v, M, v)) = 1? Do you also know that K is positive definite, since you are computing a Cholesky decomposition of K?

PetrKryslUCSD commented 9 months ago

M is pos def, K is semi pos def. I use the shift K -> K + s * M. https://github.com/PetrKryslUCSD/GEPHelpers.jl/blob/fbdc383a7b9e57288e82c19e244327e916e14909/test/test_methods.jl#L18
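
The shift trick can be sketched as follows (toy matrices and the shift value `s` are hypothetical, just to illustrate why K + s * M admits a Cholesky factorisation while K alone does not):

```julia
using LinearAlgebra

# Toy example: K is symmetric positive SEMI-definite (it has a zero
# eigenvalue), M is symmetric positive definite.
K = Diagonal([0.0, 1.0, 2.0, 3.0])
M = Diagonal([3.0, 3.0, 3.0, 3.0])
s = 0.1                              # small positive shift (hypothetical value)

Ks = Matrix(K + s * M)               # strictly positive definite
F = cholesky(Symmetric(Ks))          # succeeds; cholesky of K alone would throw
# The eigenvalues of the shifted pencil (K + s*M, M) are lambda + s,
# so subtract s afterwards to recover the original eigenvalues.
```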

Petr Krysl, Prof. and Vice chair for undergraduate education, Structural Engineering Department, University of California, San Diego, 9500 Gilman Drive #0085, La Jolla, CA 92093

Jutho commented 9 months ago

But this means that the spectrum lies on the positive real axis anyway, right? So there is no need to use a shift-and-invert method. The eigenvalues closest to zero are then extremal, i.e. they are the :SR (smallest real) eigenvalues?

If you can afford to compute a Cholesky factorisation of K, I assume you can also afford to compute one of M?

Suppose M = PtL * PtL'; then we can write

K v = lambda PtL PtL' * v

If we now set w = PtL' * v then we can rewrite the eigenvalue equation as

(PtL \ K / PtL') w = lambda w

where now (PtL \ K / PtL') is a positive semi-definite (hermitian) matrix. The eigenvectors w_i can then be chosen orthonormal, and the resulting v_i = PtL' \ w_i will automatically be M-orthogonal, since

v_i' M v_j = w_i' inv(PtL) M inv(PtL)' w_j = w_i' * w_j = delta(i,j)

So I think you should be able to do

```julia
Mfactor = cholesky(Symmetric(M))
PtL = Mfactor.PtL    # permuted sparse factor: M == PtL * PtL'
d, ww, convinfo = eigsolve(
        x -> PtL \ (K * (PtL' \ x)),
        rand(eltype(K), size(K, 1)),    # random starting vector
        _nev,
        :SR;
        maxiter = maxiter,
        krylovdim = 2 * _nev + 6,
        ishermitian = true
    )
vv = map(w -> PtL' \ w, ww)    # back-transform; the v_i are M-orthogonal
```

Am I missing something?

Jutho commented 9 months ago

BTW, here I've opted to implement the linear operator using only operations on the vector (first invert PtL', then multiply with K, then invert PtL on the result). Depending on the number of applications of the linear operator versus the sparsity of K and M, it could be better to compute

K2 = PtL \ K / PtL'

and feed this matrix directly as the linear operator to eigsolve.
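
As a small sanity check of that precomputation (dense toy matrices with hypothetical values; note that for a dense Cholesky the factor is `.L` rather than `.PtL`):

```julia
using LinearAlgebra

M = [4.0 1.0; 1.0 3.0]               # positive definite
K = [2.0 0.0; 0.0 1.0]               # positive (semi-)definite
L = cholesky(Symmetric(M)).L         # dense factor: M == L * L'
K2 = L \ K / L'                      # the transformed operator
K2 = Symmetric((K2 + K2') / 2)       # symmetrise away round-off
# The eigenvalues of K2 equal the generalised eigenvalues of K v = lambda M v.
```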

PetrKryslUCSD commented 9 months ago

Alas, no luck with this:

[ Info: Number of eigenvalues 3
convinfo = ConvergenceInfo: no converged values after 300 iterations and 3014 applications of the linear map;
norms of residuals are given by (0.00016284523568105275, 0.03802466034240979, 0.5512666047261926, 5.524277898549111, 1.8722734469478859, 6.588631471555374, 6.95412419140305, 10.546297011754726, 29.864680928549294).

Original strategy:

[ Info: Number of eigenvalues 3
convinfo = ConvergenceInfo: 9 converged values after 4 iterations and 51 applications of the linear map;
norms of residuals are given by (1.3763482481296094e-23, 4.8027455039510326e-23, 1.145182345567586e-23, 2.878358886833472e-23, 6.89564689106194e-23, 3.372296898115525e-22, 2.9381821355032077e-25, 6.029014756331007e-24, 8.775877660725171e-14).

PetrKryslUCSD commented 9 months ago

The symmetric shift-and-invert seems to be working fine: https://github.com/PetrKryslUCSD/GEPHelpers.jl/blob/0e80c5d2fbfaae2c6ef1b8166ea84208cc7ddccc/src/gep_smallest.jl#L153 Making the standard eigenvalue problem symmetric was a good idea! Thanks.
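
For reference, the symmetric shift-and-invert idea can be sketched like this (a hedged sketch with hypothetical names, not Petr's exact code; sigma must lie below the spectrum so that K - sigma*M stays positive definite):

```julia
using LinearAlgebra, KrylovKit

# K v = lambda M v  =>  (L \ M / L') w = w / (lambda - sigma), with
# K - sigma*M = L*L' and w = L' * v. The largest-magnitude eigenvalues of
# the transformed operator correspond to the lambdas closest to sigma.
function shift_invert_smallest(K, M, nev; sigma = -0.01)
    F = cholesky(Symmetric(Matrix(K - sigma * M)))
    L = F.L
    op = x -> L \ (M * (L' \ x))                  # hermitian operator
    mu, ww, info = eigsolve(op, rand(eltype(K), size(K, 1)), nev, :LM;
                            ishermitian = true)
    lambda = sigma .+ 1 ./ mu                     # invert the spectral map
    vv = map(w -> L' \ w, ww)                     # back to the original basis
    return lambda, vv, info
end
```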

Jutho commented 9 months ago

This morning, I experimented a bit with the first matrix in your tests (before my first meeting). I think the convergence issues stem not from the fact that you are looking for eigenvalues close to zero (those are still extremal), but from the presence of several degeneracies: there were 6 eigenvalues close to zero, the next 2 were also equal to each other, and so forth.

Note that, theoretically speaking, Krylov methods as implemented here (starting from a single vector) can only find one linearly independent eigenvector in every eigenspace, even if an eigenspace is higher-dimensional. This comes from the fact that the original starting vector can be decomposed into contributions belonging to the different eigenspaces, and then Krylov methods act so as to single out one of these contributions.
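
This limitation is easy to see on a toy example (pure LinearAlgebra, no KrylovKit): the Krylov space built from a single vector can never have dimension larger than the number of distinct eigenvalues.

```julia
using LinearAlgebra

A = Diagonal([1.0, 1.0, 1.0, 2.0, 3.0])    # eigenvalue 1 has multiplicity 3
v = randn(5)                               # generic starting vector
Kry = hcat([A^k * v for k in 0:4]...)      # columns v, A*v, ..., A^4*v
rank(Kry)                                  # almost surely 3: one distinct
                                           # eigenvalue = one Krylov direction
```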

Sometimes, small fluctuations coming from numerical precision enable a Krylov method to discover a second direction in a higher-dimensional eigenspace after a first eigenvector has converged, but here this did not seem to work well. In this case, the eigenvalues that were found after 300 iterations (but which had not yet converged and still had quite big residuals) indeed matched more or less with the exact eigenvalues, but with only one eigenvalue per multiplet found by eigsolve.

I don't quite understand why shift-and-invert seems less sensitive to this and allows you to converge the eigenvalues and vectors much better. Probably the fact that, alongside the multiple eigenvalues at and close to zero, there were also very large eigenvalues in this problem plays a role here (I am by no means an expert on the convergence theory of Krylov methods).

PetrKryslUCSD commented 9 months ago

Yes, you are right, this comes from a vibration problem which has a number of symmetries (it is a perfect unsupported cube).

Jutho commented 9 months ago

Ok. So just a warning that finding all eigenvectors in a higher-dimensional eigenspace might be tricky for ordinary Krylov methods. It works often thanks to finite numerical precision, but that can be quite sensitive, as theoretically it is not supposed to work. It might be that you need some block-Krylov methods to robustly find all degenerate eigenvectors.
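
One generic (hedged) workaround, short of proper block-Krylov methods: restart eigsolve from a fresh random vector orthogonalised against the eigenvectors already found, so the new Krylov space can pick out a second direction inside a degenerate eigenspace. The names below are hypothetical, and the orthogonality is only as good as the convergence of the first batch.

```julia
using LinearAlgebra, KrylovKit

function degenerate_partners(op, n, nev; T = Float64)
    vals1, vecs1, _ = eigsolve(op, rand(T, n), nev, :SR; ishermitian = true)
    x0 = rand(T, n)
    for v in vecs1                       # project out converged directions
        x0 -= (dot(v, x0) / dot(v, v)) * v
    end
    vals2, vecs2, _ = eigsolve(op, x0, nev, :SR; ishermitian = true)
    return vals1, vecs1, vals2, vecs2    # merge/deduplicate multiplets by hand
end
```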

PetrKryslUCSD commented 9 months ago

Right, but Arpack is also in that category, isn't it? And it does fine.


Jutho commented 9 months ago

So Arpack works without needing shift-and-invert? I should look into this. I do know of one "advantage" that Arpack has: it can in principle generate new random vectors if the starting vector generates an invariant subspace that is smaller than the Krylov dimension. (KrylovKit, being completely agnostic about what type of vector you are using, does not know how to create new random instances of the vector type.) But that should not be relevant here.