JuliaLinearAlgebra / ArnoldiMethod.jl

The Arnoldi Method with Krylov-Schur restart, natively in Julia.
https://julialinearalgebra.github.io/ArnoldiMethod.jl/dev
MIT License

Fix locking and purging #116

Closed haampie closed 7 months ago

haampie commented 7 months ago

Closes #112 Closes #114

It looks like ArnoldiMethod.jl was not robust when a new eigenpair starts to converge with an eigenvalue that falls between (in the order requested by the user) already locked ones. This can happen with repeated eigenvalues, which may converge in an atypical order.

The issue was tracked down to unstable partitioning of eigenvalues, combined with the assumption that the Schur vectors in 1:active-1 would never change. Since the implementation only partitions Schur vectors into groups (locked, retain, truncate), the first vector to be purged could be anywhere in the range 1:active-1. The fix is to look for the first vector in that range that moves between groups, and to apply a change of basis from that index onward.
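The "first vector that moves groups" step can be sketched as follows. This is illustrative only — the function and group names are hypothetical, not the package's internal API:

```julia
# Hypothetical sketch: after reordering converged eigenvalues, find the
# first index in 1:active-1 whose group assignment (:locked, :retain,
# :truncate) changed. The change of basis must start at that index rather
# than at `active`, since a purged vector can sit anywhere in 1:active-1.
function first_regrouped_index(old_groups::Vector{Symbol},
                               new_groups::Vector{Symbol},
                               active::Int)
    for i in 1:active-1
        old_groups[i] == new_groups[i] || return i
    end
    return active  # nothing before `active` moved; old behavior suffices
end

# Example: vector 3 moves from :locked to :truncate, so the basis change
# must cover columns starting at index 3, not at `active`.
first_regrouped_index([:locked, :locked, :locked,   :retain],
                      [:locked, :locked, :truncate, :retain], 5)  # == 3
```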

haampie commented 7 months ago

@PetrKryslUCSD in what way did you mean

I'm afraid the change did not fix things at my end.

in the sense that it's not finding all multiples reliably? Or in the sense that the eigenvectors are incorrect?

As far as I can see, this way the method shouldn't mess up already converged vectors when some of the multiples are only found later.

PetrKryslUCSD commented 7 months ago

The evecs are still not mass-orthogonal for the six zero evals. (If I did it right. The change to active was the only one, correct?)


haampie commented 7 months ago

Yes. Note that it's using the standard inner product.

PetrKryslUCSD commented 7 months ago

Right. I am solving the symmetric standard eigenvalue problem based on the Cholesky decomposition, and then converting the eigenvectors to mass-orthogonal eigenvectors (https://github.com/PetrKryslUCSD/VibrationGEPHelpers.jl/blob/2a73a8fe679b6b9cd0f1a509e04177c0e5e2e297/src/gep_smallest.jl#L135).

haampie commented 7 months ago

Somewhat confused about the extra steps. Can't you take M = LL'? Then the M-inner product is used implicitly.

Something like:

Solve AX = BXΛ with XᵀBX = I and Λ diagonal, given the Cholesky decomposition B = LLᵀ.

define

C = L⁻¹AL⁻ᵀ # if A is symmetric so is C
Y = LᵀX

and reformulate as standard eigenvalue problem

CY = YΛ

solve with partialschur (no need for partialeigen when C is symmetric)

decomp, = partialschur(C)
Y = decomp.Q
Λ = decomp.R
X = L⁻ᵀY

partialschur uses the standard inner product, meaning Y'Y = I, but then XᵀBX = I like requested.
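The algebra above can be checked with a small self-contained example, using dense `eigen` as a stand-in for `partialschur` (the reduction is identical; matrix sizes and names here are my own):

```julia
using LinearAlgebra

n = 50
A = Symmetric(randn(n, n))          # symmetric stiffness-like matrix
R = randn(n, n)
B = Symmetric(R' * R + n * I)       # symmetric positive definite mass-like matrix
L = cholesky(B).L                   # B = L * Lᵀ
C = Symmetric(L \ Matrix(A) / L')   # C = L⁻¹ A L⁻ᵀ, still symmetric
Λ, Y = eigen(C)                     # C * Y = Y * Diagonal(Λ), with YᵀY = I
X = L' \ Y                          # back-transform: A * X = B * X * Diagonal(Λ)

norm(A * X - B * X * Diagonal(Λ))   # small residual
norm(X' * B * X - I)                # X is B-orthonormal, as requested
```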


FWIW, the partial Schur decomposition that ArnoldiMethod computes in your specific example unit_cube_modes_arnoldimethod is correct up to the given tolerance (1e-8?), I checked both norm(CY - YΛ) and norm(Y'Y - I). Makes me think the problem is currently not with this package.

PetrKryslUCSD commented 7 months ago

I am doing the obverse (formulation shown in an attached image): I solve for the LM eigenvalues and then I need to solve for v at the end. If partialeigen is not needed, I will try without. The eigenvalues are correct, and most eigenvectors are too. It is just the ones for the six-times-repeated zero eigenvalue that are not properly orthogonal.

Btw, the same conversion to a standard EVP is used successfully with KrylovKit.

haampie commented 7 months ago

In KrylovKit you're not passing tol, I think it defaults to 1e-12 whereas in ArnoldiMethod it's sqrt(eps(Float64)) = 1e-8. Does that make a difference? (I guess not, since the Schur vecs that come out of the ArnoldiMethod should be orthogonal up to machine precision regardless of tol, but just to be sure)

haampie commented 7 months ago

@PetrKryslUCSD, ok, so it looks like partialschur is fine, but partialeigen is the next candidate ~that is not robust in case of (numerically) repeated eigenvalues -- which is weird cause the implementation is just 2 lines, and it calls into LAPACK.~

Edit: it's just that eigen is not guaranteed to give you an orthogonal set of eigenvectors in case of repeated eigenvalues, for example:

julia> A = collect(Diagonal([1.0:10.0; 11.0; 11.0; 11.0]))
13×13 Matrix{Float64}:
 1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0
 0.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0
 0.0  0.0  3.0  0.0  0.0  0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0
 0.0  0.0  0.0  4.0  0.0  0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0
 0.0  0.0  0.0  0.0  5.0  0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0
 0.0  0.0  0.0  0.0  0.0  6.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0
 0.0  0.0  0.0  0.0  0.0  0.0  7.0  0.0  0.0   0.0   0.0   0.0   0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  8.0  0.0   0.0   0.0   0.0   0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  9.0   0.0   0.0   0.0   0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  10.0   0.0   0.0   0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   0.0  11.0   0.0   0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   0.0   0.0  11.0   0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   0.0   0.0   0.0  11.0

julia> Q = qr(rand(13,13)).Q * I;

julia> vals, vecs = eigen(Q'A*Q);

julia> norm(vecs'vecs - I)
0.4150189752244879

If you get rid of partialeigen:

    si = ShiftAndInvert(cholesky(K), M, Vector{eltype(K)}(undef, size(K, 1)), Vector{eltype(K)}(undef, size(K, 1)))
    m = LinearMap{eltype(K)}(si, size(K, 1), ismutating = true)
    decomp, history = partialschur(m, nev = nev, tol = 1e-12, which = LM())
    d = 1 ./ diag(decomp.R)[1:nev]
    v = si.A_factorization.PtL' \ decomp.Q[:, 1:nev]
    mass_orthogonalize!(v, M)
    return d, v, history.nconverged

things look correct:

check_M_orthogonality(v, M) = (6.661338147750939e-16, 2.1275776279638947e-13, 1, 12, 19, ...)

norm(K * v - M * v * Diagonal(d)) = 2.87659594639317e-11

haampie commented 7 months ago

That said, I'd still like to improve on this PR, because with it the algorithm exits early once nev eigenvalues have converged, even if a not-yet-fully-converged eigenvalue is on the radar that matches the which criterion better (i.e. if a repeated eigenvalue starts to converge later, it is unlikely to be returned after this PR).

haampie commented 7 months ago

I think I have fixed the actual bug: partitioning of converged eigenvalues was not using a stable partitioning routine. It may need a bit more testing, though... Could you test-drive it with your matrices @PetrKryslUCSD? :)
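For context on why stability matters here, a stable three-way partition keeps the relative order of entries within each group, which is what locking relies on. A minimal sketch (the group names and helper are hypothetical, not the package's internals):

```julia
# Stable partition: list indices group by group, in the priority order
# given, preserving the original relative order within each group.
stable_partition(groups, order) =
    vcat((findall(==(g), groups) for g in order)...)

# The two :locked vectors (indices 2 and 4) keep their relative order,
# which an unstable partition is free to swap.
perm = stable_partition([:retain, :locked, :truncate, :locked],
                        (:locked, :retain, :truncate))
# perm == [2, 4, 1, 3]
```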

PetrKryslUCSD commented 7 months ago

I used fix/keep-locked-vectors-in-place and I removed partialeigen. I also removed resetting of the tol.

Here is the printout from the test of the package

[ Info: Number of eigenvalues 12
history = Converged: 12 of 12 eigenvalues in 87 matrix-vector products
(norm(fs1 - fs[1:length(fs1)]) / norm(fs), frequency_tol) = (0.19548617242015892, 1.0e-6)
(r[1:2], max(10000 * eps(1.0), orthogonality_tol * max(maximum(d), 1.0)), r[3:5]) = ((5.213538876992721, 0.0005608828187519614), 5.10477332079472e-9, (6, 4, 6))
rm = [0.0 0.0 0.0 0.0 0.0 2.6466491736638157e-5 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 -9.268735138065245e-5 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 0.0003109959982588931 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 -0.000560882818759354 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 0.00023771019177574475 0.0 0.0 0.0 0.0 0.0 0.0; 2.6466491425794098e-5 -9.268735130155907e-5 0.0003109959986802227 -0.0005608828187445688 0.00023771019200621838 5.213538876993007 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2.755615327612583 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2.7556153276130866 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 5.104773320795011 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 5.104773320794843 0.0; 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 5.104773320795483]

Alas, the eigenvectors are still not orthogonal.

haampie commented 7 months ago

Can you share a reproducer?

PetrKryslUCSD commented 7 months ago

Please run the test of VibrationGEPHelpers. Look for output of the tests of ArnoldiMethod.


haampie commented 7 months ago

Thanks, tests pass when I disable locking, hinting that purging of locked vectors is necessary:

If many eigenvalues close to the target have converged, they are locked; but if another handful of better eigenvalues then suddenly starts to converge, and the total exceeds the number requested, the previously converged values have to be tossed out.

Should be relatively straightforward to implement.

haampie commented 7 months ago

@PetrKryslUCSD can I ask you once more to test this ;p thanks for hanging in so far...

PetrKryslUCSD commented 7 months ago

Happy to do it. Do I use the same branch?

haampie commented 7 months ago

Currently the master branch

haampie commented 7 months ago

Current version 0.3.0

PetrKryslUCSD commented 7 months ago

Fantastic, all tests pass! Many thanks for looking into this, Harmen. Best regards,

haampie commented 7 months ago

Thank you for opening an issue and the great test cases :)