Add QR algorithm - Githubissues

Aweptimum commented 8 months ago

In investigating the problems in #468, I decided to try implementing the qr algorithm to have an alternative way of getting the eigenvectors.

This is still a work in progress, but it adds:

NumericMatrix::upperHessenberg - converts a matrix to Upper Hessenberg form using householder reflections. I only have one test case, but it implicitly passes every Eigenvalue::qrAlgorithm test case so that's neat.
Eigenvalue::qrAlgorithm - finds the eigenvalues of a matrix using the QR Algorithm, which follows the same principle as the power method. It relies on upperHessenberg. I implemented it using the Wilkinson Shift strategy and deflation, so when the lowest diagonal element converges it recurses using the submatrix that doesn't include that diagonal. It's pretty fast, none of the test cases take more than ~100 iterations to converge. All tests pass.
Eigenvector::qrAlgorithm - this just takes the output of the above method and feeds it to Eigenvector::eigenvectors. The problem is that it fails in cases where the eigenvalues are correct, so I'm pretty sure something is wrong in the eigenvectors method. There is a way to directly solve for the eigenvectors using the QR algorithm, but I haven't fully understood it yet.
Adds failing test cases from #468

I'm not sure how you'd want to expose Eigenvector::qrAlgorithm in the public API, so I haven't worked on that. Might need a refactor similar to Eigenvector

Aweptimum commented 8 months ago

So the Eigenvector::qrAlgorithm test failures aren't actually that bad - The first two failures return the same eigenvectors, but the columns are swapped around. The second two failures have repeated eigenvectors, but they are cases where the eigenvalues are repeated. I think this because I'm feeding the eigenvalues directly into the eigenvectors method. It looks like the method tries to deal with dupes, but I'm not sure if I need to change the eigenvalue input to match its expectations.

Given:
[2, 0, 1]
[2, 1, 2]
[3, 0, 4]

Eigenvalues: -3, 1, 1

Expected eigenvectors:
[0.26726124191242, 0, 0.70710678118655]
[0.53452248382485, 1, 0]
[0.80178372573727, 0, -0.70710678118655]

Actual:
[0.26726124191242, 0, 0]
[0.53452248382485, 1, 1]
[0.80178372573727, 0, 0]

@Beakerboy what sort of input does the eigenvectors method expect for the $eigenvalues variable? I thought I'd try only giving it unique eigenvalues but it complains the length of the array doesn't match the length of the diagonal.

Beakerboy commented 8 months ago

@Aweptimum According to the source, it's an array of floats and is the same size as the matrix. Duplicated should be listed multiple times. When you say "it complains" what is the error that is reported?

Aweptimum commented 8 months ago

I just called array_unique on the eigenvalues returned from the QR Algorithm and passed them in. The error is at line 72 in Eigenvectors: MathPHP\Exception\MatrixException: Matrices have different number of rows The identity matrix is constructed using the length of $eigenvalues array, so I figured it might be intended that the duplicates should be passed in

Beakerboy commented 8 months ago

Yeah, that's what I said, I think the duplicates should be passed in, so don't call array_unique.

Aweptimum commented 8 months ago

Yeah, I meant to communicate it was my assumption and I'm glad to have your affirmation. I'm having a hard time writing today.

I guess there's some weird edge case in the eigenvectors method when the supplied eigenvalues contain repeats. I put the failing eigenvector cases (#4 and #5 in dataProviderForEigenvector) in the eigenvalue test and they pass. The baffling part is the identity matrix case, where the eigenvalues are all 1's, returns the correct eigenvectors.

Beakerboy commented 8 months ago

From scanning the code, it looks like the algorithm is supposed to return two orthogonal vectors when it sees the first “1” in the eigenvalue list. Then when it is generating the return matrix, it will sort them so that one of them is assigned to the position of the first “1” and the other the position of the second “1”. It’s been years since I wrote this so no idea what might be going on.

Aweptimum commented 8 months ago

It's ok, I found the culprit! It's this line here The floating point comparison fails in every case (except for the identity matrix one), so $key is always false when it isn't meant to be. I rewrote it to this:

$found = \array_column($solution_array, 'eigenvalue');
$filtered = \array_filter($found, fn($v) => Arithmetic::almostEqual($v, $eigenvalue, $A->getError()));
$key = array_key_first($filtered);
if ($key === null) {
    // ... solve
}

And it returns the correct solutions for the repeats. I think I'll submit this patch as a separate PR so mark can just merge it, although array_key_first was introduced in 7.3 so I'll have to use something else

Aweptimum commented 7 months ago

I have found another problem, and that is the original matrix that produces a pseudo-inverse failure also produces an rref failure

If you take this guy and pass it to the eigenvectors method with its eigenvalues:

[2, 0, 0, 0, 0, 0]
[0, 2, 0, 0, 0, 1729.7]
[0, 0, 2, 0, -1729.7, 0]
[0, 0, 0, 0, 0, 0]
[0, 0, -1729.7, 0, 2828785.69, 0]
[0, 1729.7, 0, 0, 0, 2828785.69]

The first step the eigenvector method does when it loops over the given eigenvalues is to subtract the largest eigenvalue from the matrix (2828791.05765 in the first loop) and then get the rref. The rref returned by the matrix method on line 74 is this:

[1, 0, 0, 0, 0, 0]
[0, 1, 0, 0, 0, -0.00061146305530409]
[0, 0, 1, 0, 0.00061146305530409, 0]
[0, 0, 0, 1, 0, 0]
[0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0]

Whereas wolfram and this random rref calculator I found produce an identity matrix (the latter is using the scaled-down matrix in the eigenvectors method)

I'll add the matrix as a test case to the ReducedRowEchelonFormTest and submit a separate PR with it so anyone can look at what's going on

Aweptimum commented 4 months ago

Ok, been a bit, but I found the real problem and I'm not sure what to do about it

In the above test case, when it gets to the 3rd eigenvalue, the resulting RREF is an identity matrix. This causes the algorithm to die.

The matrix copied from above:

[2, 0, 0, 0, 0, 0]
[0, 2, 0, 0, 0, 1729.7]
[0, 0, 2, 0, -1729.7, 0]
[0, 0, 0, 0, 0, 0]
[0, 0, -1729.7, 0, 2828785.69, 0]
[0, 1729.7, 0, 0, 0, 2828785.69]

The eigenvalues: 2828786.747649, 2828786.747649, 2, 0.94235235527, 0.94235235527, 0

The offending eigenvalue, 2, produces this $T (A-λI) matrix (after everything is scaled down by the largest matrix element):

[3.7388808264281E-7, 0, 0, 0, 0, 0]
[0, 3.7388808264281E-7, 0, 0, 0, 0.00061146378324616]
[0, 0, 3.7388808264281E-7, 0, -0.00061146378324616, 0]
[0, 0, 0, -3.3312906859285E-7, 0, 0]
[0, 0, -0.00061146378324616, 0, 0.99999966687093, 0]
[0, 0.00061146378324616, 0, 0, 0, 0.99999966687093]

Which I plugged into wolfram and asked for the rref. It also gives an identity.

So nothing is wrong with the RREF implementation or eigenvalues implementation. However, wolfram can solve for the eigenvectors of the original matrix, so that must mean that the eigenvector procedure is missing something - there is either a way to recover from the RREF not pointing to a solution or an alternative method of getting the eigenvectors.

Edit: @Beakerboy idk if you would know anything of that, but any thoughts? Would you know a resource to look at?

markrogoyski commented 4 months ago

Hi @Aweptimum,

Thanks for your continued work on this.

I'm presuming the eigenvalue calculation is correct. If that is the case, and we think the eigenvector calculation is the issue, are you able to reproduce the issue with a simpler matrix, or another matrix, which might help identify what characteristics are causing the issue?

Alternatively, is there a more robust method of calculating eigenvectors that we could be using?

Aweptimum commented 4 months ago

The eigenvalue calculation is correct, yes, I have the correct ones verified by numpy + wolfram + a professor who has matlab I'm not sure how to reproduce it with a simpler matrix :/ but I can try some ideas

I'm not sure if there's a more "robust" way because the eigenvector algorithm is just solving (A-λI) = 0 directly That said, the professor I tried getting help from just pointed me at the matlab docs for eig, which uses the QZ Algorithm in the general case.

The QZ Algorithm is analogous to the QR algorithm, but it uses the results of the QR algorithm to form what's called the Schur Decomposition, then solves the system using the decomposition rather than the original matrix, and then transforms the eigenvectors back to match the original matrix.

The original paper is behind paywalls and the only implementations I can find are in C or Lapack (and I can read neither). However, there is a nasa paper someone wrote on extending the algorithm that might be plain enough to work from: https://ntrs.nasa.gov/citations/19730018792

Beakerboy commented 4 months ago

I thought I’d at least chime in to say that I have no idea how to help, but a quick scan of the nasa paper looks like there’s enough direction that a well motivated individual could just follow the provided steps and implement it. I am not a mathematician, most of my contributions here were just implementing algorithms from Wikipedia, textbooks, or other software packages in PHP, just like this case.

Aweptimum commented 1 day ago

Sorry it's been a while, but my previous job laid me off a few months ago. I was previously allowed to contribute to math-php on the job, but my current job no longer touches PHP (and probably never will).

I can no longer work on my PR's, basically :(

markrogoyski / math-php

Add QR algorithm #472