Generalized EP: convergence sometimes lacking

PetrKryslUCSD commented 8 months ago

An eigenvalue problem is solved with Arpack and ArnoldiMethod. The eigenvalues agree, the eigenvectors do not.

#####################################################                                                                                                                                                            
# unit_cube_modes                                                                                                                                                                                                
Vibration modes of unit cube  of almost incompressible material.                                                                                                                                                 

Eigenvalues: [6.67261e-07, 6.72742e-07, 7.01002e-07, 7.03235e-07, 7.21761e-07, 7.34494e-07, 2.62599e-01, 2.62599e-01, 3.57453e-01, 3.57453e-01, 3.57453e-01, 3.60635e-01, 3.60635e-01, 3.60635e-01, 4.08373e-01, 
4.08385e-01, 4.08385e-01, 4.61679e-01, 4.61679e-01, 4.61679e-01] [Hz]                                                                                                                                            
[ Info: (3.60928e-12, 1.79179e-12)                                                                                                                                                                               
[ Info: (2.66454e-15, 1.42247e-15)                                                                                                                                                                               
[ Info: 1.8647224970419253e-10                                                                                                                                                                                   
#####################################################                                                                                                                                                            
# unit_cube_modes_arnoldimethod                                                                                                                                                                                  
Vibration modes of unit cube  of almost incompressible material.                                                                                                                                                 

Eigenvalues: [6.67261e-07, 6.72742e-07, 7.01002e-07, 7.03235e-07, 7.21761e-07, 7.34494e-07, 2.62599e-01, 2.62599e-01, 3.57453e-01, 3.57453e-01, 3.57453e-01, 3.60635e-01, 3.60635e-01, 3.60635e-01, 4.08373e-01, 
4.08385e-01, 4.08385e-01, 4.08385e-01, 4.61679e-01, 4.61679e-01] [Hz]                                                                                                                                            
[ Info: (1.44905e+00, 7.83647e+00)                                                                                                                                                                               
[ Info: (1.11022e-15, 9.80491e-01)                                                                                                                                                                               
[ Info: 0.07090617444834676

The "info" prints orthogonality checks and the residual norm. Clearly the modes are not mass orthogonal for the ArnoldiMethod.jl computation. The orthogonality checks begin on line https://github.com/PetrKryslUCSD/FinEtoolsDeforLinear.jl/blob/d44694b9829a4feb2755c89341ac5207f9e7d0ef/examples/dynamics/free_vibration/3-d/unit_cube_mode_examples.jl#L55 The residual is ten orders of magnitude bigger for ArnoldiMethod.jl.

The EP computation is implemented in https://github.com/PetrKryslUCSD/GEPHelpers.jl.git.

I may be misunderstanding something, but I tried to reproduce the example from the ArnoldiMethod.jl documentation exactly.

haampie commented 8 months ago

Looks like something is not entirely stable, if you restrict to fewer eigenpairs it's fine.

Does the transformation A⁻¹Bx = λ⁻¹x have anything to do with it? Only thing that jumps out is that there's a few orders of magnitude difference between the cluster of eigenvalues close to 0 and the rest.

Other solves presumably preserve symmetry better, or so?

PetrKryslUCSD commented 8 months ago

The first matrix is singular (six eigenvalues are zero), hence I shift. There are repeated eigenvalues.

Are vectors normalized with respect to the second matrix anywhere?

I use this code:

struct ShiftAndInvert{TA, TB, TT}
    A_factorization::TA
    B::TB
    temp::TT
end

function (M::ShiftAndInvert)(y, x)
    mul!(M.temp, M.B, x)
    # ldiv!(y, M.A_factorization, M.temp)
    y .= M.A_factorization \ M.temp
end

function construct_linear_map(A, B)
    a = ShiftAndInvert(cholesky(A), B, Vector{eltype(A)}(undef, size(A,1)))
    LinearMap{eltype(A)}(a, size(A,1), ismutating=true)
end

function arnoldimethod_eigs(K, M; nev::Integer=6, ncv::Integer=max(20,2*nev+1), which=:SM, tol=0.0, maxiter::Integer=300, sigma=nothing, v0::Vector=zeros(eltype(K),(0,)), ritzvec::Bool=true, explicittransform::Symbol=:auto, check::Integer=0)
    which == :SM || error("Argument which: The only recognized which is :SM")
    sigma == nothing || error("Argument sigma not supported")
    ritzvec == true || error("Argument ritzvec not supported")
    explicittransform == :none || error("Argument explicittransform only supported as :none")

    decomp, history = partialschur(construct_linear_map(K, M), nev=nev, tol=tol, restarts=maxiter, which=LM())
    d_inv, v = partialeigen(decomp)
    d = 1 ./ d_inv
    d = real.(d)
    ix = sortperm(real.(d))

    return d[ix], v[:, ix], history.nconverged
end

haampie commented 8 months ago

Can you compare with other_algorithms(construct_linear_map(K, M))? I think the other methods you use may internally use a different algorithm for the generalized symmetric eigenproblem, so unclear if this is a bug in ArnoldiMethod or an instability of the algorithm after the transformation.

PetrKryslUCSD commented 8 months ago

I'm afraid I don't quite follow what it is is you want me to do.

haampie commented 8 months ago

If I understand correctly you're transforming a symmetric generalized eigenproblem into a non-symmetric standard eigenproblem, like this:

Ax = λBx
A⁻¹Bx = λ⁻¹x

And then you use ArnoldiMethod (basically the Krylov-Schur algorithm) to solve this non-symmetric problem.

That's probably different from how Arpack.jl (or whatever the package is) solves the generalized eigenproblem, as they might do Lanczos iteration with a B-inner product, or something along those lines, which may converge better or not, or have a different stopping criterion, etc.

So to check if this is a bug in ArnoldiMethod, my question is if you can feed this ^ non-symmetric standard eigenproblem to Arpack / KrylovKit and see if it's giving the same output. If so the problem might be with the matrix, and if not then it's likely a bug / instability in ArnoldiMethod.

Otherwise I don't see how to narrow it down.

PetrKryslUCSD commented 8 months ago

Okay, now I see. Well, I believe matlab uses the same algorithm and converges just fine. I will explore Arpack.

PetrKryslUCSD commented 8 months ago

I think I misunderstood the example in https://julialinearalgebra.github.io/ArnoldiMethod.jl/stable/usage/02_spectral_transformations.html I incorrectly assumed that the resulting eigenvectors will be orthogonal for relative to the B matrix. However, upon second reading I doubt that.

PetrKryslUCSD commented 8 months ago

I added the orthogonalization. However, occasionally I run into problems. It appears to be random (initial guesses of the eigenspaces?)

During testing I got this error:

 Info: Number of eigenvalues 9                                                                                                                                                                                  
ERROR: LoadError: "QR algorithm did not converge"                                                                                                                                                                
Stacktrace:                                                                                                                                                                                                      
  [1] local_schurfact!(H::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.OneTo{Int64}, Base.Slice{Base.OneTo{Int64}}}, false}, start::Int64, to::Int64, Q::Matrix{Float64}, tol::Float64, maxiter::Int64)      
    @ ArnoldiMethod C:\Users\pkonl\SublimeJulia_1_10_0\assets\.julia-1.10.0-depot\packages\ArnoldiMethod\JdEiw\src\schurfact.jl:320                                                                              
  [2] local_schurfact!(H::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.OneTo{Int64}, Base.Slice{Base.OneTo{Int64}}}, false}, start::Int64, to::Int64, Q::Matrix{Float64}, tol::Float64, maxiter::Int64)      
    @ ArnoldiMethod C:\Users\pkonl\SublimeJulia_1_10_0\assets\.julia-1.10.0-depot\packages\ArnoldiMethod\JdEiw\src\schurfact.jl:314 [inlined]                                                                    
  [3] _partialschur(A::LinearMaps.FunctionMap{Float64, GEPHelpers.ShiftAndInvert{SparseArrays.CHOLMOD.Factor{Float64, Int64}, Symmetric{Float64, SparseMatrixCSC{Float64, Int64}}, Vector{Float64}}, Nothing, true}, ::Type{Float64}, mindim::Int64, maxdim::Int64, nev::Int64, tol::Float64, restarts::Int64, which::ArnoldiMethod.LM)                                                                                           
    @ ArnoldiMethod C:\Users\pkonl\SublimeJulia_1_10_0\assets\.julia-1.10.0-depot\packages\ArnoldiMethod\JdEiw\src\run.jl:199                                                                                    
  [4] partialschur(A::LinearMaps.FunctionMap{Float64, GEPHelpers.ShiftAndInvert{SparseArrays.CHOLMOD.Factor{Float64, Int64}, Symmetric{Float64, SparseMatrixCSC{Float64, Int64}}, Vector{Float64}}, Nothing, true}; nev::Int64, which::ArnoldiMethod.LM, tol::Float64, mindim::Int64, maxdim::Int64, restarts::Int64)                                                                                                             
    @ ArnoldiMethod C:\Users\pkonl\SublimeJulia_1_10_0\assets\.julia-1.10.0-depot\packages\ArnoldiMethod\JdEiw\src\run.jl:106                                                                                    
  [5] partialschur                                                                                                                                                                                               
    @ C:\Users\pkonl\SublimeJulia_1_10_0\assets\.julia-1.10.0-depot\packages\ArnoldiMethod\JdEiw\src\run.jl:94 [inlined]                                                                                         
  [6] __arnoldimethod_eigs(K::Symmetric{Float64, SparseMatrixCSC{Float64, Int64}}, M::Symmetric{Float64, SparseMatrixCSC{Float64, Int64}}; nev::Int64, ncv::Int64, which::Symbol, tol::Float64, maxiter::Int64, sigma::Nothing, v0::Vector{Float64}, ritzvec::Bool, explicittransform::Symbol, check::Int64)                                                                                                                      
    @ GEPHelpers C:\Users\pkonl\Documents\00WIP\GEPHelpers.jl\src\gep_smallest.jl:121                                                                                                                            
  [7] __arnoldimethod_eigs                                                                                                                                                                                       
    @ C:\Users\pkonl\Documents\00WIP\GEPHelpers.jl\src\gep_smallest.jl:101 [inlined]                                                                                                                             
  [8] gep_smallest(K::SparseMatrixCSC{Float64, Int64}, M::SparseMatrixCSC{Float64, Int64}, neigvs::Int64; method::Symbol, orthogonalize::Bool, which::Symbol, tol::Float64, maxiter::Int64, sigma::Nothing, ritzvec::Bool, v0::Vector{Float64})                                                                                                                                                                                   
    @ GEPHelpers C:\Users\pkonl\Documents\00WIP\GEPHelpers.jl\src\gep_smallest.jl:45                                                                                                                             
  [9] top-level scope                                                                                                                                                                                            
    @ C:\Users\pkonl\Documents\00WIP\GEPHelpers.jl\test\test_methods.jl:18                                                                                                                                       
 [10] include(fname::String)                                                                                                                                                                                     
    @ Base.MainInclude .\client.jl:489                                                                                                                                                                           
 [11] top-level scope                                                                                                                                                                                            
    @ C:\Users\pkonl\Documents\00WIP\GEPHelpers.jl\test\runtests.jl:29                                                                                                                                           
 [12] include(fname::String)                                                                                                                                                                                     
    @ Base.MainInclude .\client.jl:489                                                                                                                                                                           
 [13] top-level scope                                                                                                                                                                                            
    @ none:6                                                                                                                                                                                                     
in expression starting at C:\Users\pkonl\Documents\00WIP\GEPHelpers.jl\test\test_methods.jl:1                                                                                                                    
in expression starting at C:\Users\pkonl\Documents\00WIP\GEPHelpers.jl\test\runtests.jl:29

PetrKryslUCSD commented 8 months ago

I switched to a symmetric version of the shift-and-invert standard eigenvalue problem.

I'm still having trouble with accuracy. For instance, partialschur claims to have converged on three eigenvalues:

history = Converged: 3 of 3 eigenvalues in 11 matrix-vector products                                                                                                                                              
d = [3.15039952692389, 0.39478986259250065, 0.39478417604303223]

However, all three should be 3.94784e-01.

If you have some time, have a look at the output of the test of the package. https://github.com/PetrKryslUCSD/VibrationGEPHelpers.jl.git I'm trying to control the accuracy, https://github.com/PetrKryslUCSD/VibrationGEPHelpers.jl/blob/9ef9a50e713342d0b7562896b8036efc18d7365e/src/gep_smallest.jl#L122, but I'm not sure that I'm doing the right thing.

PetrKryslUCSD commented 7 months ago

@haampie Could you give me some pointers how to control the accuracy please? Thanks.

haampie commented 7 months ago

Hi @PetrKryslUCSD, as you can see I haven't touched the code much in the last 5 years, so would have to make time to look into the issue.

The method is a subspace method: a subspace is expanded iteratively until large enough, then the eigenproblem is projected to it, which involves doing a small dense Schur factorization. Then the subspace is reduced in size, retaining only the best approximate eigenvectors to eigenvalues of interest, and expanded again... etc.

Possible sources of issue:

the package implements dense Schur factorization (QR algorithm) in pure Julia, and doesn't use LAPACK. Maybe the dense matrix you end up with is "difficult".

It's implemented here: https://github.com/JuliaLinearAlgebra/ArnoldiMethod.jl/blob/master/src/schurfact.jl
- First you could check how many iter it takes by adding an @info iter there. Usually this is not too many compared to the matrix size (although there are exceptions)
- Also I noticed that it doesn't warn if it did not converge (return value is ignored).
- If you have reason to think this is the issue, you could see if replacing it with the relevant LAPACK function (for Float64/Float32) fixes things.
Retaining the best approximate eigenvectors involves reordering of the upper diagonal R matrix in the dense Schur factorization. I don't think there is a lot of theory w.r.t. stability of it, but people use it (also lapack). It involves solving tiny Sylvester equations in the real arithmetic case. One thing you could try is use complex arithmetic instead of real (I'm assuming you're using real matrices), since the real arithmetic case is more involved and may have more opportunities for things to go wrong. I can imagine errors can happen when two eigenvalues are really close. (Haven't checked the implementation, but if this is the issue, could also consider not sorting when eigenvalues are very close, and see if that makes things better)

If that is too involved, simple things you can do: compare the number of iterations with other packages implementing roughly the same algorithm. If ArnoldiMethod takes fewer, maybe something is different with the stopping criterion.

haampie commented 7 months ago

@PetrKryslUCSD looks like increasing maxdim a bit fixes the issue, but didn't spend enough time to figure out why. The only difference I see is that more eigenpairs converge before the first restart. Pretty sure the problem is with repeated eigenvalues, which are numerically not exactly repeated, but just very close.

For debugging I used:

    map = construct_linear_map(K, M)
    decomp, history = partialschur(map, nev=nev, tol=1e-16, maxdim=..., restarts=maxiter, which=LM())
    d_inv, v = partialeigen(decomp)
    print("Error in the Schur decomposition: ")
    tmp = similar(decomp.Q)
    mul!(tmp, map, decomp.Q)
    results = [1:size(tmp, 2) norm.(eachcol(tmp - decomp.Q * decomp.R)) d_inv]'
    show(stdout, "text/plain", results)
    println()

and you can see that the QR decomp has large error with default maxdim, but errors are gone when increased to maxdim=50 or so.

haampie commented 7 months ago

Let's close as duplicate of #112, it's likely the same problem.

PetrKryslUCSD commented 7 months ago

I wish I could say that the problem was fixed, but while the eigenvalues are now coming out okay (I increased the limit to maxdim = min(max(40, 2 * nev), size(K,1))), the eigenvectors are not mass-orthogonal. By a substantial margin. Not sure what is going on.

PetrKryslUCSD commented 7 months ago

The eigenvectors that are the most affected are the rigid body displacements: #4

and #5

These two are strongly non mass-orthogonal (~0.1).

PetrKryslUCSD commented 7 months ago

Here is the reduced mass matrix. It is expected to be identity. Its accuracy is affected for the first six modes (rigid body displacements): note the non-zero off diagonal elements.

JuliaLinearAlgebra / ArnoldiMethod.jl

Generalized EP: convergence sometimes lacking #114