stan-dev / math

The Stan Math Library is a C++ template library for automatic differentiation of any order using forward, reverse, and mixed modes. It includes a range of built-in functions for probabilistic modeling, linear algebra, and equation solving.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License
737 stars 185 forks source link

Add Schur decomposition for complex matricies #2704

Closed WardBrian closed 2 years ago

WardBrian commented 2 years ago

After FFT (#20) another complex-valued special function which would be good to have is Schur decomposition: https://en.wikipedia.org/wiki/Schur_decomposition

This is probably a natural candidate to be the first function in the math library which returns a tuple as a result, if we wait until tuples are implemented. Otherwise we will need to do what we have for svd and have two functions which each return half of the answer.

This is supported in Eigen: https://eigen.tuxfamily.org/dox/classEigen_1_1ComplexSchur.html

bob-carpenter commented 2 years ago

I've even tested that Schur decomposition works as part of the original complex PR. Here's a version that worked using the two functions approach:

https://github.com/stan-dev/math/commit/6490ee33e650d1aefe4e44126a39cb1e2975e1e7

But then for some reason I removed the test in a later commit before it ever gets merged:

https://github.com/stan-dev/math/commit/15539b2426a7442f4e053a6733e132c412f19f11

At least that explains why I can't find them any more in the current code.

spinkney commented 2 years ago

The schur decomposition can be used to calculate the matrix square root (they did it in Jax https://github.com/google/jax/pull/9544/commits/cb732323f376a26cf88e177a96ebd955074acbfc) and may be used in calculating the matrix logarithm ( http://eprints.maths.manchester.ac.uk/1851/). It will be nice to have this in the language.

bob-carpenter commented 2 years ago

I know it works because I used the Schur decomposition to test complex numbers. Here's an implementation that used to work before the math library started drifting faster than I could keep up:

https://github.com/stan-dev/math/commit/15539b2426a7442f4e053a6733e132c412f19f11#diff-86d598be91262793b47ffdd8de067fad1a10c1df8c444aa51a7576f04902692e

WardBrian commented 2 years ago

I ported @bob-carpenter's code above to use the template metaprograms required in this branch: https://github.com/stan-dev/math/tree/feature/2704-schur-decomposition

The mix tests, including the ones originally commented out, all pass.

Unfortunately, I tried to add a prim test, but it seems that the results from scipy.linalg don't agree with Eigen's implementation, so I'm not sure how to proceed on that.

spinkney commented 2 years ago

Can you see if Eigen's implementation agrees with wolfram alpha? Example here

WardBrian commented 2 years ago

I can't convince wolfram alpha to output the complex Schur decomposition, only the real one

spinkney commented 2 years ago

Screen Shot 2022-09-13 at 9 28 10 AM

spinkney commented 2 years ago

If you have a matrix I can test using the QZ library in R which is based on Fortran routines.

WardBrian commented 2 years ago

I've been using the example from scipy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.schur.html [[0, 2, 2], [0, 1, 2], [1, 0, 1]]

On WA just now I tried [[0, 2, 2], [0, 1i, 2], [1, 0, 1]] to force complex output. For this matrix, Eigen, Scipy, and WolframAlpha all produce different results for T:

Scipy:

[[ 2.43093485+0.13461849j  0.72382986+1.23632347j -0.79116896-1.50947587j]
 [ 0.        +0.j         -0.84412614-0.70171943j  0.09863006-0.31734164j]
 [ 0.        +0.j          0.        +0.j         -0.58680871+1.56710094j]]

WolframAlpha:

[[2.43093 + 0.134618 i, 1.31656 - 0.564893 i, -0.791169 - 1.50948 i],
[0 i, -0.844126 - 0.701719 i, 0.327125 + 0.058507 i],
[0 i, 0 i, -0.586809 + 1.5671 i]]

Eigen:

    (2.43093,0.134618)     (1.2993,-0.603531)     (-1.19052,1.21948)
                 (0,0)  (-0.844126,-0.701719) (-0.0358622,-0.330375)
                 (0,0)                  (0,0)     (-0.586809,1.5671)

As you can see, the off-diagonal entries are different for each.

WardBrian commented 2 years ago

The decomposition returned by Eigen does satisfy A = U T U*, so maybe the unit test is just that?

  Eigen::MatrixXcd A(3, 3);
  A << 0, 2, 2, 0, c_t(0,1), 2, 1, 0, 1;

  auto A_t = complex_schur_decompose_t(A);
  auto A_u = complex_schur_decompose_u(A);

  auto A_recovered = A_u * A_t * A_u.adjoint();

  expect_complex_mat_eq(A, A_recovered);
bob-carpenter commented 2 years ago

From Wikipedia on Schur decomposition:

Although every square matrix has a Schur decomposition, in general this decomposition is not unique.

Luckily, we don't need to test prim against existing answers. We can just verify that

$A = U \cdot T \cdot U^{-1}$

where $U$ is unitary and $T$ is triangular. You can also verify that the diagonal entries of $T$ are the Eigenvalues. I put this in Eigen's notation for matrixU() and matrixT().

spinkney commented 2 years ago

Is the decomposition supposed to be unique?

bob-carpenter commented 2 years ago

Exactly!

spinkney commented 2 years ago

Thanks Bob, just saw your response.

WardBrian commented 2 years ago

That's what I get for not making it down into the "Notes" section of Wikipedia. Thanks