Closed: johnmyleswhite closed this issue 10 years ago
This is also the case with `sqrt`. This is what dynamic languages are meant for. The types also look fine. We can keep it the way it is IMO.
That's not how `sqrt` behaves:

```julia
julia> sqrt(1.0)
1.0

julia> sqrt(-1.0)
ERROR: DomainError()
 in sqrt at math.jl:118

julia> sqrt(1.0 + 0.0 * im)
1.0 + 0.0im

julia> sqrt(-1.0 + 0.0 * im)
0.0 + 1.0im
```
`eig` behaves the same as `sqrtm`.
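To make the instability concrete, here is a minimal illustration (written for current Julia, where the thread's `sqrtm` has since become the matrix method of `sqrt`): the element type of the result depends on the *values* of the input, not just its type.

```julia
using LinearAlgebra

A = [2.0 0.0; 0.0 3.0]   # symmetric positive definite
B = [-2.0 0.0; 0.0 3.0]  # symmetric but indefinite

# Same input type, different output element types:
RA = sqrt(A)  # real result
RB = sqrt(B)  # complex result, since sqrt(-2) is imaginary
```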
Oops, `sqrt` was a bad example. But yes, `eig` does this.
Ok. This behavior seems undesirable to me, but I suppose this is traditional enough that I should just learn to accept it. I would have thought we'd want behavior like `fft` has.
I agree that this does not seem great from the type-stability perspective.
One option would be to return a real matrix when the input is a matrix of `Symmetric` type and a complex matrix otherwise. That also has the advantage of avoiding the check when you've constructed the matrix to be symmetric. A big disadvantage of the current arrangement is that there isn't any way to call `sqrtm` type-stably. We could do a similar scheme for `eig` too. The nice part: the returned array is always complex and always correct, but it might happen to be a purely real-valued complex array. If you know the matrix is symmetric and you want a real matrix back, you can wrap it in `Symmetric` first and get exactly what you want.
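A rough sketch of that scheme, with hypothetical names (`stable_sqrtm` is not a real function, and modern Julia spells `sqrtm` as the matrix method of `sqrt`): plain matrices always get a complex result, while wrapping in `Symmetric` opts in to a real one.

```julia
using LinearAlgebra

# Hypothetical sketch of the proposal above; `stable_sqrtm` is illustrative,
# not an existing API. The Symmetric wrapper asserts the structure, so we can
# return a real matrix (assuming the caller knows it is positive definite);
# everything else gets a complex result unconditionally, so the return type
# is predictable from the input type alone.
stable_sqrtm(S::Symmetric{<:Real}) = sqrt(S)
stable_sqrtm(A::AbstractMatrix) = sqrt(complex.(A))
```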
The underlying issue is that special symmetries of the matrix (Hermitian, real symmetric, etc.) allow for special-cased algorithms that are significantly faster than the generic algorithms. Functions like `sqrtm` can be computed more stably and quickly for complex Hermitian matrices than for general unsymmetric complex ones, so the current methods actually check for special symmetries and explicitly dispatch to the appropriate specialized methods. If we turn off these symmetry-detection checks, we compromise performance. If we explicitly cast the return value into a `Complex` matrix type, we also potentially compromise performance, since we may have to force memory copies to create the desired type.
If a user wants to explicitly cast the return value into a `Complex` matrix type, they should feel free to do so. But there seems to be no way to force the issue without compromising performance.
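For reference, that user-side cast is a one-liner (a sketch using modern names, where `sqrtm` is the matrix method of `sqrt`; the broadcast copy is the only extra cost):

```julia
using LinearAlgebra

A = [4.0 0.0; 0.0 9.0]
# Force a Complex element type regardless of the input, at the cost of a copy:
S = complex.(sqrt(A))
```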
It might be nice to mention the type-instability issue in the documentation for `sqrtm`?
Yeah, we should document this. This behavior isn't specific to `sqrtm`, however; it is also a feature of other functions, most notably linear solves and `eig`.
Reopening as documentation issue.
We should really be documenting the return type of every function.
My earlier analysis of the `sqrtm` polyalgorithm turns out to be not quite right. This is what actually happens when `sqrtm(A)` is called:
A matrix that is neither Hermitian nor symmetric will in general have a square root with complex elements. A matrix that is Hermitian or symmetric will also have a square root with complex elements unless it is positive definite, in which case we can specialize to the case of square root with real elements. Since you can determine whether a Hermitian/symmetric matrix is positive definite essentially for free after computing its eigenvalues, and recomposing a real matrix as a product of real factors is cheaper than in the complex case, I still think that the current polyalgorithm is optimally performant.
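The branching just described might be sketched as follows (a simplified sketch, not the actual implementation, using modern names: `sqrtm` is now the matrix method of `sqrt`, and `eig` is `eigen`):

```julia
using LinearAlgebra

# Simplified sketch of the polyalgorithm described above (not the real code).
function sqrtm_sketch(A::AbstractMatrix)
    if ishermitian(A)
        F = eigen(Hermitian(A))
        if all(>=(0), F.values)
            # Positive semidefinite: real factors, real result.
            # Checking definiteness here is essentially free.
            return F.vectors * Diagonal(sqrt.(F.values)) * F.vectors'
        else
            # Indefinite: square roots of negative eigenvalues are imaginary.
            return F.vectors * Diagonal(sqrt.(complex.(F.values))) * F.vectors'
        end
    else
        # General case: fall through to the Schur-based algorithm.
        return sqrt(A)
    end
end
```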
The current implementation of the individual components, however, suggests some beneficial refactoring. For example:
`Triangular`s quickly.

> A matrix that is neither Hermitian nor symmetric will in general have a square root with complex elements.
In general yes, but for example real matrices with no eigenvalues on the non-positive real axis (positive eigenvalues and conjugate pairs are fine) have real square roots. I wrote about some of the peculiarities of the matrix square root in a mailing list thread about `sqrtm` when it was implemented:
Like the scalar square root, the matrix square root often has multiple solutions, even more so than in the scalar case. E.g. a positive definite n×n Hermitian matrix with distinct eigenvalues has 2^n different square roots, whereas repeated eigenvalues can allow infinitely many square roots. E.g. [-1 0;0 -1] has the obvious square roots [i 0;0 i], [i 0;0 -i], [-i 0;0 i], and [-i 0;0 -i], but also non-diagonal solutions like [1 2;-1 -1], which has the property of being real. In fact, real matrices often have real square roots, and it could be interesting to have a real variation of `sqrtm`. This is thoroughly investigated, at least in the non-singular case, in
N. J. Higham, Computing real square roots of a real matrix, Linear Algebra Appl. 88/89 (1987) 405–430.
As a special case, I believe that the implemented algorithm does compute a real solution, up to numerical noise, for real matrices with no eigenvalues on the non-positive real axis. This is easily checked in the Schur decomposition and the imaginary part could be dropped in that case. It would be good if someone could verify that this is correct.
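That sufficient condition (not a necessary one, as the [-1 0;0 -1] example shows) can be sketched like this; `real_sqrt_guaranteed` is an illustrative name, not an existing function, and a real implementation would read the eigenvalues off the Schur factor rather than call `eigvals` separately:

```julia
using LinearAlgebra

# Sketch: a real matrix is guaranteed a real square root when no eigenvalue
# lies on the non-positive real axis (Higham 1987). This is sufficient but
# not necessary, e.g. [-1 0; 0 -1] still has real square roots.
function real_sqrt_guaranteed(A::AbstractMatrix{<:Real})
    all(λ -> !(isreal(λ) && real(λ) <= 0), eigvals(A))
end
```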
The singular non-diagonalizable case can be quite tricky. E.g. [0 1;0 0] has no square root at all, neither real nor complex, but if we add another dimension we have [0 0 1;0 0 0;0 1 0]^2 = [0 1 0;0 0 0;0 0 0]. The implemented method returns NaNs for both of these.
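The 3×3 identity above is easy to check numerically, since it is just integer matrix arithmetic:

```julia
using LinearAlgebra

# The 2x2 nilpotent block [0 1; 0 0] has no square root at all, but its 3x3
# embedding does have one:
N = [0 1 0; 0 0 0; 0 0 0]
B = [0 0 1; 0 0 0; 0 1 0]
B^2 == N  # B is a square root of N
```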
I did write a Julia implementation of Higham's real square root method (starting from the real Schur decomposition instead of the complex Schur decomposition) but it's really not worth the code complexity.
@GunnarFarneback thanks for the back history. I'll have to think for a bit about how easy it would be to check for the special case you mentioned in the triangular sqrtm routine.
The mailing list discussion points to a much more complicated issue, namely that of defining a principal square root. It bears repeating that matrices don't have unique square roots. I think most people would expect matrix-valued functions to produce the "canonical" version that is implemented, which is to factorize, take the square root, and rotate back to the original basis. But perhaps this is worth mentioning in the docs.
Maybe we should change `sqrtm` so that it always returns a complex array?