JuliaLinearAlgebra / ArnoldiMethod.jl

The Arnoldi Method with Krylov-Schur restart, natively in Julia.
https://julialinearalgebra.github.io/ArnoldiMethod.jl/dev
MIT License

partialschur finds only one of three degenerate eigenvectors #112

Closed jlbosse closed 7 months ago

jlbosse commented 2 years ago

I have a matrix with a three-fold degenerate extremal eigenvalue, and ArnoldiMethod.partialschur finds only one of the corresponding eigenvectors, while e.g. Arpack.eigs finds all three. See the following MWE:

using Arpack
using ArnoldiMethod
using SparseArrays

ham_sparse = sparse(I, J, V)     # the numerical values for I, J and V are hosted at https://justpaste.it/5naju

# Reference run with ARPACK; :SR targets the smallest real part, nev defaults to 6
λ, φ = Arpack.eigs(ham_sparse, which=:SR, nev=6)
print(real(λ))

# Same target with ArnoldiMethod
decomp, _ = partialschur(ham_sparse, nev=6, which=SR())
λ, φ = partialeigen(decomp)
print(real(λ))

prints (first line Arpack, second line ArnoldiMethod)

[-4.785165067232189, -4.785165067232181, -4.785165067232169, -4.458749101599544, -3.5927605849611726, -3.5927605849611552]
[-4.785165067232141, -4.603267050112509, -4.4587491015995315, -3.59276058496115, -3.502637833700079, -3.2498680277756797]
PetrKryslUCSD commented 8 months ago

Is this still happening? I also have a problem with multiple repeated eigenvalues, and I sometimes see failures.

RalphAS commented 8 months ago

I get 2 out of 3 with default settings on the current version, but all 3 with maxdim=40. Even Arpack needs a padded subspace to get the next multiplet right (I haven't gotten all of those with ArnoldiMethod yet).
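
maxdim is a documented keyword of partialschur, so the padded-subspace run can be reproduced along these lines (a sketch reusing the MWE from the opening comment):

using ArnoldiMethod

# Enlarge the maximum dimension of the search subspace beyond the default
# to give the degenerate triplet room to converge.
decomp, history = partialschur(ham_sparse, nev=6, which=SR(), maxdim=40)
λ, φ = partialeigen(decomp)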

haampie commented 7 months ago

@Jutho did you ever look into restart issues with repeated/multiple/tightly clustered eigenvalues in IRA / Krylov-Schur?

In #114 it seems to sometimes completely destroy the accuracy of the partial Schur decomposition AQ ≈ QR. Not sure if a breakdown can be expected...
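
For anyone reproducing this, the accuracy of that relation can be checked directly from the fields of the returned decomposition (Q and R are the fields of the PartialSchur object; ham_sparse and decomp are from the MWE above):

using LinearAlgebra

# Should be on the order of the requested tolerance; per #114 it can
# apparently blow up instead.
@show norm(ham_sparse * decomp.Q - decomp.Q * decomp.R)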

From the KrylovKit docs

From a theoretical point of view, Krylov methods can at most find a single eigenvector associated with a targeted eigenvalue, even if the latter is degenerate. In the case of a degenerate eigenvalue, the specific eigenvector that is returned is determined by the starting vector x₀. For large problems, this turns out to be less of an issue in practice, as often a second linearly independent eigenvector is generated out of the numerical noise resulting from the orthogonalisation steps in the Lanczos or Arnoldi iteration. Nonetheless, it is important to take this into account and to try not to depend on this potentially fragile behaviour, especially for smaller problems.

I haven't checked if/how classic ARPACK deals with this differently.

See also https://github.com/Jutho/KrylovKit.jl/issues/23

Would be worth comparing the schursorts side by side; they're slightly different in how many vectors they keep when shrinking the subspace, but other than that they're mostly the same. I think ArnoldiMethod should be slightly more efficient by doing pretty much everything in-place, but as a result it's more bug-prone. The question is where ArnoldiMethod and KrylovKit deviate; maybe that's the easiest way to debug...

Jutho commented 7 months ago

Maybe I should revise the docs, as this description sounds a bit too optimistic. I think that in general it is very finicky to rely on numerical noise to generate the new linearly independent eigenvectors that are not present in the initial starting vector. I have also seen several issues reported about something like this.

With the default values for KrylovKit.eigsolve (krylovdim=30, maxiter=100) and randomly generated initial vectors, it seems to find the correct first 6 eigenvalues with degeneracies (typically in 7 iterations ≈ 100 matrix-vector products) for the matrix in this issue.
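
For concreteness, that call looks roughly like the following (eigsolve takes the operator, a starting vector, the number of requested eigenvalues, and the target; the variable names reuse the MWE above):

using KrylovKit

# Random starting vector; krylovdim=30 and maxiter=100 are the defaults.
vals, vecs, info = eigsolve(ham_sparse, rand(size(ham_sparse, 1)), 6, :SR;
                            krylovdim=30, maxiter=100)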

But something is strange about the ArnoldiMethod results reported above. It is not just that it fails to find 3 linearly independent eigenvectors and thus reports higher-lying eigenvalues: the second value it reports is not a correct eigenvalue of this matrix at all. I think this may point towards a bug.

RalphAS commented 7 months ago

This may account for the problem seen here: it seems to me that the re-sorting of Ritz values in _partialschur should start at index active rather than 1, so as not to mix locked subspace content into the working set, or vice versa.

haampie commented 7 months ago

@RalphAS I think that might be an issue, yes. So a repeated eigenpair converges later than the others and then gets inserted somewhere between already locked vectors, whereas typically eigenpairs converge in order.

However, at the very end, reordering the Schur form should still work.

Will have a look.

haampie commented 7 months ago

Looks like #116 resolves the issue, but I don't immediately see why.

Initially the following is locked:

real.(ritz.λs[ritz.ord[1:nlock]]) = [-4.785165067232145, -4.785165067232136, -4.458749101599527, -3.5927605849611477, -3.5026378337000823]

but then a duplicate is found and the last -3.50 eigenvalue is tossed out:

real.(ritz.λs[ritz.ord[1:nlock]]) = [-4.785165067232145, -4.785165067232136, -4.785115528292574, -4.458749101599527, -3.5927605849611477]

Looks like purging is where things are going wrong, because suddenly the Arnoldi relation A * V[:, 1:k] = V[:, 1:k+1] * H[1:k+1, 1:k] no longer holds. And it can't be that a Ritz value is simply no longer considered converged.
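
For the record, that check amounts to something like the sketch below; V and H stand for the internal Arnoldi basis and Hessenberg matrix, so these names are hypothetical and only for illustration:

using LinearAlgebra

# Residual of the k-step Arnoldi relation; it should stay near machine
# precision, but here it suddenly blows up after purging.
arnoldi_residual(A, V, H, k) = norm(A * V[:, 1:k] - V[:, 1:k+1] * H[1:k+1, 1:k])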

--

I think the bug is that a not-yet-converged eigenvalue ends up between the locked ones, because of sorting.

PetrKryslUCSD commented 7 months ago

I'm afraid the change did not fix things at my end.

haampie commented 7 months ago

That's why I closed it ;p

haampie commented 7 months ago

@RalphAS thanks for pointing me somewhat in the right direction. In the end, though, sorting only the active part is incorrect, as the algorithm would then return the first eigenvalues that converged instead of the best eigenvalues it can detect over time. For most applications those coincide (dominant eigenvalues converge first), but with repeated eigenvalues that is not the case: they start to converge rather late, sometimes by accident, and at that point it's better to toss out some already-converged eigenvalues further from the target in favor of those repeated eigenvalues closer to the target, and give them a few more iterations to converge.

PR #116 implements a twofold bug fix for the purging of converged but unwanted Schur vectors.

On top of that, I noticed that this package's partialeigen may unnecessarily destroy the orthogonality of Schur vectors for symmetric matrices, even though it should be a no-op for them, since the partial Schur decomposition then coincides with a partial eigendecomposition. I think I'll just document this for now, although I could add a high-level interface like eigs with a ::Diagonal / ::Hermitian wrapper that avoids the call to partialeigen.
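
As a sketch of that no-op observation (assuming, as in this issue, a symmetric matrix, so R is numerically diagonal and the Schur vectors are already orthonormal eigenvectors):

using ArnoldiMethod, LinearAlgebra

decomp, _ = partialschur(ham_sparse, nev=6, which=SR())
λ = diag(decomp.R)   # R is (numerically) diagonal when A is symmetric
φ = decomp.Q         # orthonormal eigenvectors; no call to partialeigen needed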

Since I'm the only maintainer of this package, feel free to review, otherwise I'll self-merge.

haampie commented 7 months ago

One thing I'm not 100% sure about is:

My implementation of purging retains converged vectors in the search subspace; they're simply moved to an index greater than the number of requested eigenvalues.

I think Arpack may actually remove them from the search subspace completely: put them at an index greater than mindim and they're automatically truncated at the restart.

The latter gives more space for new eigenvectors to converge.

But is that helpful at all? I'm pretty sure the vector you completely toss out immediately starts to converge again, whereas by keeping it in the subspace it's deflated.

PetrKryslUCSD commented 7 months ago

That is an interesting comparison.

haampie commented 7 months ago

In the end, purging was implemented to actually remove the converged but unwanted vectors, because their tiny residuals may cause stability issues in the Krylov-Schur restart.

After they're removed, they may reappear due to rounding noise, but that's better than loss of precision at restart.