
Matrix `exp` and `log` are not inverse of each other #54833

Closed: olivierverdier closed this issue 1 week ago

olivierverdier commented 4 months ago
  1. Output of versioninfo()

     Julia Version 1.10.3
     Commit 0b4590a5507 (2024-04-30 10:59 UTC)
     Build Info:
       Official https://julialang.org/ release
     Platform Info:
       OS: macOS (arm64-apple-darwin22.4.0)
       CPU: 8 × Apple M1
       WORD_SIZE: 64
       LIBM: libopenlibm
       LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
     Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)
     Environment:
       JULIA_EDITOR = emacsclient
       JULIA_INPUT_COLOR = bold
       JULIA_ANSWER_COLOR = bold

  2. How Julia was installed

     With juliaup

  3. Minimal working example:
M = [0.9949357359852791 -0.015567763143324862 -0.09091193493947397 -0.03994428739762443 0.07338356301650806;
     0.011813655598647289 0.9968988574699793 -0.06204555000202496 0.04694097614450692 0.09028834462782365;
     0.092737943594701 0.059546719185135925 0.9935850721633324 0.025348893985651405 -0.018530261590167685;
     0.0369187299165628 -0.04903571106913449 -0.025962938675946543 0.9977767446862031 0.12901494726320517;
     0.0 0.0 0.0 0.0 1.0]
sdiff = exp(log(M)) - M
-log10(maximum(abs.(sdiff))) # < 3, i.e. fewer than 3 correct digits

The same test in Python gives

import numpy as np
import scipy as sp

M = np.array([[0.9949357359852791, -0.015567763143324862, -0.09091193493947397, -0.03994428739762443, 0.07338356301650806],
              [0.011813655598647289, 0.9968988574699793, -0.06204555000202496, 0.04694097614450692, 0.09028834462782365],
              [0.092737943594701, 0.059546719185135925, 0.9935850721633324, 0.025348893985651405, -0.018530261590167685],
              [0.0369187299165628, -0.04903571106913449, -0.025962938675946543, 0.9977767446862031, 0.12901494726320517],
              [0.0, 0.0, 0.0, 0.0, 1.0]])

sdiff = sp.linalg.expm(sp.linalg.logm(M)) - M

print(-np.log10(np.max(np.abs(sdiff)))) # > 15
dkarrasch commented 4 months ago

Could you please quickly check which one seems to be wrong: log or exp?

oscardssmith commented 4 months ago

Reproduced on x86 Linux. The issue appears to be in log (although it's somewhat hard to prove, because matrix logarithms are not unique).
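For a concrete sense of that non-uniqueness (a minimal illustrative sketch, not from the thread): any 2×2 rotation has infinitely many real logarithms, because the rotation angle is only determined modulo 2π.

using LinearAlgebra
θ = 0.3
R = [cos(θ) -sin(θ); sin(θ) cos(θ)]
L0 = [0 -θ; θ 0]                      # principal logarithm of R
L1 = [0 -(θ + 2π); (θ + 2π) 0]        # a different, equally valid real logarithm
norm(exp(L0) - R), norm(exp(L1) - R)  # both ≈ 0, yet L0 != L1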

olivierverdier commented 4 months ago

A simpler example:

N = [ 0.0 -0.8  1.1  0.2 -0.2;
      0.8  0.0  0.8 -0.4 -0.1;
     -1.1 -0.8  0.0  0.4 -0.1;
     -0.2  0.4 -0.4  0.0  0.0;
      0.0  0.0  0.0  0.0  0.0]

k = 10 # also bad for k = 100 or 1000

sdiff = log(exp(N/k)) - N/k
@show -log10(maximum(abs.(sdiff))) # 2.9

I agree with @oscardssmith; I suspect the matrix logarithm is at fault.


Edit: for completeness, the Python code

N = np.array([[0.0, -0.8, 1.1, 0.2, -0.2],
              [0.8, 0.0, 0.8, -0.4, -0.1],
              [-1.1, -0.8, 0.0, 0.4, -0.1],
              [-0.2, 0.4, -0.4, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0, 0.0]])

k = 10

sdiff = sp.linalg.logm(sp.linalg.expm(N/k)) - N/k
print(-np.log10(np.max(np.abs(sdiff)))) # 15.0
oscardssmith commented 4 months ago

This second example isn't as good because, IIUC, log(exp(M)) == M isn't expected for matrices, due to branch cuts.
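A minimal sketch of the branch-cut effect (illustrative, not from the thread): once an eigenvalue pair of M has imaginary part beyond ±π, the principal logarithm returns a generator on a different branch.

using LinearAlgebra
θ = π + 0.5        # rotation generator with |angle| > π
A = [0 -θ; θ 0]
log(exp(A))        # generator for the angle θ - 2π, so log(exp(A)) != A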

dkarrasch commented 4 months ago

On v1.6.7 I get

julia> -log10(maximum(abs.(sdiff))) # < 3
15.175451669208671

on v1.9.4 and v1.8.5

julia> -log10(maximum(abs.(sdiff))) # < 3
2.233375639545474

so this must be an "old" bug.

olivierverdier commented 4 months ago

This second example isn't as good because, IIUC, log(exp(M)) == M isn't expected for matrices, due to branch cuts.

That's why I included the parameter k: if M is small enough, its eigenvalues lie well inside the strip |Im λ| < π on which the principal logarithm inverts exp, so I would expect log(exp(M)) == M to hold no matter what, right?
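A quick check of that premise (a sketch, assuming the N from the example above is still in scope):

using LinearAlgebra
maximum(abs.(imag.(eigvals(N / 10))))  # far below π, so no branch-cut ambiguity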

olivierverdier commented 4 months ago

so this must be an "old" bug.

I suspect the problem is in log_quasitriu, which was introduced in 6913f9ccb156230f54830c8b7b8ef82b41cac27e in 2021.

giordano commented 4 months ago

I suspect the problem is in log_quasitriu, which was introduced in https://github.com/JuliaLang/julia/commit/6913f9ccb156230f54830c8b7b8ef82b41cac27e in 2021.

That's more recent than Julia v1.6, though.

olivierverdier commented 4 months ago

More evidence that the problem is the logarithm:

N = [ 0.0 -0.8  1.1  0.2 -0.2;
      0.8  0.0  0.8 -0.4 -0.1;
     -1.1 -0.8  0.0  0.4 -0.1;
     -0.2  0.4 -0.4  0.0  0.0;
      0.0  0.0  0.0  0.0  0.0]
k = 10
M = exp(N/k)

sdiff = log(M*M) - 2*log(M)

@show -log10(maximum(abs.(sdiff))) # 2.6

The equivalent code in Python gives 14.7.

olivierverdier commented 4 months ago

I suspect the problem is in log_quasitriu, which was introduced in 6913f9c in 2021.

That's more recent than Julia v1.6, though.

If PR #39973 is to blame (a wild guess), then the bug would have first appeared in v1.7.0.

olivierverdier commented 4 months ago

For completeness, I checked the code above (i.e., with log(M*M) - 2*log(M)) on various versions, and indeed, the bug first appears between v1.6.7 and v1.7.3. My money is still on PR #39973.

olivierverdier commented 4 months ago

An even simpler test:

function test_log(N, k=10)
    M = exp(N / k)
    sdiff = log(M*M) - 2*log(M)
    return -log10(maximum(abs.(sdiff)))
end

Now simply with

N = [0.0 -1.0 1.0; 1.0 0.0 0.0; 0.0 0.0 0.0]

run

test_log(N) # 2.1

and the same in Python returns 15.4.

dkarrasch commented 4 months ago

It's not clear to me what the cause is. I tried a specific test from scipy's test suite, and this is the result:

julia> VERSION
v"1.10.4"

julia> A = [3.2346e-01 3.0000e+04 3.0000e+04 3.0000e+04;
              0.0000e+00 3.0089e-01 3.0000e+04 3.0000e+04;
              0.0000e+00 0.0000e+00 3.2210e-01 3.0000e+04;
              0.0000e+00 0.0000e+00 0.0000e+00 3.0744e-01];

julia> logA = [ -1.12867982029050462e+00  9.61418377142025565e+04 -4.52485573953179264e+09  2.92496941103871812e+14;
         0.00000000000000000e+00 -1.20101052953082288e+00 9.63469687211303099e+04 -4.68104828911105442e+09;
         0.00000000000000000e+00  0.00000000000000000e+00 -1.13289322264498393e+00  9.53249183094775653e+04;
         0.00000000000000000e+00  0.00000000000000000e+00 0.00000000000000000e+00 -1.17947533272554850e+00];

julia> (logA - log(A)) / norm(logA)
4×4 Matrix{Float64}:
 0.0  5.97008e-25  9.78138e-21  -1.06839e-15
 0.0  0.0          4.97507e-25   1.30418e-20
 0.0  0.0          0.0           2.98504e-25
 0.0  0.0          0.0           0.0

julia> (exp(logA) - A) / norm(A)
4×4 Matrix{Float64}:
 -1.81534e-8  0.000297088  50.5783       -2.79971e6
  0.0         2.33977e-8    0.00116877    8.26782
  0.0         0.0           3.53834e-10  -0.00160443
  0.0         0.0           0.0          -3.34387e-8

julia> (exp(log(A)) - A) / norm(A)
4×4 Matrix{Float64}:
 -1.81534e-8  0.000297088  50.5783       -2.79971e6
  0.0         2.33977e-8    0.00116877    8.26782
  0.0         0.0           3.53834e-10  -0.00160443
  0.0         0.0           0.0          -3.34387e-8

julia> (exp(log(A)) - A)
4×4 Matrix{Float64}:
 -0.001334  21.8314       3.71673e6     -2.05736e11
  0.0        0.00171937  85.8868         6.07558e5
  0.0        0.0          2.60014e-5  -117.901
  0.0        0.0          0.0           -0.00245723

OTOH, we have:

>>> (sp.linalg.expm(A_logm) - A) / sp.linalg.norm(A)
array([[ 0.00000000e+00,  1.33667877e-15,  3.49612787e-10,
        -5.04994630e-07],
       [ 0.00000000e+00,  0.00000000e+00,  1.23766552e-15,
         4.78558722e-11],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         1.38618539e-15],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00]])

So, the computed log is really close to the "known" solution, but the exp of it is completely wrong in the upper right corner.

ADDENDUM: It seems, however, that this specific example fails on all versions starting from 1.6.x.

mikmoore commented 4 months ago

I tried a specific test from scipy's test suite

It's worth noting that the condition number of that matrix is awful:

julia> cond(A)
1.8884133342622014e20

julia> cond(logA)
4.086839610043594e28

I have an independent myexp implementation (our actual exp implementation is better, but has limited type support; see #51008), and it similarly struggles with this:

function myexp(A)
    # compute exp(A) == exp(A/2^n)^(2^n)
    iszero(A) && return float(one(A))
    # TODO: decide the best norm to use here: op-1, op-Inf, or frobenius are probably good candidates - maybe take the min of all three
    nA = float(norm(A,2))
    fr,ex = frexp(nA)
    logscale = max(4 + ex,0) # ensure nA^(4+1)/factorial(4+1) < eps for 4 or whatever power our approximation is
    p = ntuple(i->inv(prod(1.0:i-1)),Val(9))
    scale = exp2(-logscale)
    sA = A * scale
    if sA isa AbstractMatrix
        expA = evalpoly(sA,UniformScaling.(p))
    else
        expA = evalpoly(sA,p)
    end
    for _ in 1:logscale
        expA = expA*expA
    end
    return expA
end

And with this and BigFloat:

julia> norm(exp(logA) - A) / norm(A)
2.799714043033207e6

julia> norm(myexp(logA) - A) / norm(A) # myexp is generally worse
5.134964610902018e6

julia> norm(myexp(big.(logA)) - A) / norm(A) |> Float64 # despite being worse, BigFloat makes this version work
8.141481015780612e-7

Note that the squaring part of our exp routine is probably very much to blame, given that BigFloat resolves the problem even though the exp kernel is not expanded beyond what is roughly needed for Float64. I don't think we should use that example for this particular issue.

However, myexp does nothing to resolve the original example:

julia> cond(M) # extremely well conditioned
1.1906431494194285

julia> norm(exp(log(M)) - M) / norm(M) |> Float64
0.003052024023511132

julia> norm(myexp(big.(log(M))) - M) / norm(M) |> Float64
0.0030520240235111305

Given that the matrix exponential is reasonably "simple" to compute (modulo stability issues, which even the literature has not fully settled), I am much more inclined to expect that log is the culprit.

mikmoore commented 4 months ago

Confirmation that log is the culprit:

function eigapply(fun, A)
    # Apply a scalar function to a matrix via diagonalization.
    # Note: assumes A is diagonalizable (true for the M above).
    D, V = eigen(A)
    return V * Diagonal(fun.(D)) / V
end

julia> norm(exp(log(M)) - M) / norm(M)
0.003052024023511132

julia> norm(exp(eigapply(log,M)) - M) / norm(M)
6.499342713344915e-16
olivierverdier commented 4 months ago

I think the problem is in log_quasitriu. From the 3×3 example above (i.e., qtriu = schur(exp(N/10)).T):

qtriu = [0.9950041652780259 -0.09983341664682815 0.07153667745275474; 0.09983341664682804 0.9950041652780259 0.06981527929449967; 0.0 0.0 1.0]

Now

LinearAlgebra.log_quasitriu(qtriu)[:,end]

gives

0.07240564253650691
0.06891743457599916
0.0

and in Python, the same vector is

0.07496781758158656
0.06618025632357401
0.0

Pretty big discrepancy starting at the second decimal.
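One way to cross-check this block (a sketch, assuming qtriu from above; the block is diagonalizable, so a diagonalization-based log is a fair reference):

using LinearAlgebra
D, V = eigen(complex(qtriu))
ref = real(V * Diagonal(log.(D)) / V)
ref[:, end]  # expected to agree with the SciPy column rather than log_quasitriu's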

oscardssmith commented 4 months ago

This code is a very direct Matlab translation, with all the oddness that entails, but what I've found is that if we force s, m = 1, 4 rather than 0, 5, we get the right answer. See https://github.com/JuliaLang/julia/pull/54867

aravindh-krishnamoorthy commented 2 months ago

Just one more pointer to add here for further analysis. Kinda good that the differences are localised.

julia> using LinearAlgebra

julia> M = [0.9949357359852791 -0.015567763143324862 -0.09091193493947397 -0.03994428739762443 0.07338356301650806;
            0.011813655598647289 0.9968988574699793 -0.06204555000202496 0.04694097614450692 0.09028834462782365;
            0.092737943594701 0.059546719185135925 0.9935850721633324 0.025348893985651405 -0.018530261590167685;
            0.0369187299165628 -0.04903571106913449 -0.025962938675946543 0.9977767446862031 0.12901494726320517;
            0.0 0.0 0.0 0.0 1.0]
5×5 Matrix{Float64}:
 0.994936   -0.0155678  -0.0909119  -0.0399443   0.0733836
 0.0118137   0.996899   -0.0620456   0.046941    0.0902883
 0.0927379   0.0595467   0.993585    0.0253489  -0.0185303
 0.0369187  -0.0490357  -0.0259629   0.997777    0.129015
 0.0         0.0         0.0         0.0         1.0

julia> S = schur(complex(M));

julia> norm(exp(S.Z*LinearAlgebra.log_quasitriu(S.T)*S.Z') - M)
1.5337045591657124e-15

julia> S = schur(M);

julia> norm(exp(S.Z*LinearAlgebra.log_quasitriu(S.T)*S.Z') - M)
0.0068453336198366545

julia> exp(S.Z*LinearAlgebra.log_quasitriu(S.T)*S.Z') - M
5×5 Matrix{Float64}:
 -6.66134e-16  -3.72966e-16   2.498e-16    -1.38778e-17  -0.00185147
 -3.34802e-16   5.55112e-16  -2.15106e-16   4.85723e-17   0.00300777
 -6.93889e-17  -2.70617e-16  -2.22045e-16  -2.08167e-17   0.00584284
 -1.249e-16    -6.245e-17    -7.63278e-17   0.0          -0.000495109
  0.0           0.0           0.0           0.0           0.0

Eigenvectors and eigenvalues of M and exp(log(M)):


julia> E1 = eigen(M);

julia> E2 = eigen(exp(log(M)));

julia> norm(E1.values - E2.values)
1.2787159664228656e-15

julia> norm(E1.vectors - E2.vectors)
0.028242416081603217

julia> abs.(E1.vectors - E2.vectors)
5×5 Matrix{Float64}:
 3.33356e-16  3.33356e-16  4.54082e-15  4.54082e-15  0.0120785
 1.34378e-15  1.34378e-15  4.87741e-15  4.87741e-15  0.0141492
 6.66134e-16  6.66134e-16  4.26807e-15  4.26807e-15  0.00359654
 3.34869e-16  3.34869e-16  0.0          0.0          0.0209428
 0.0          0.0          0.0          0.0          9.55047e-5

The likely "correct" answer, based on the Schur-Parlett recurrence algorithm (which requires distinct eigenvalues):

julia> S = schur(M);

julia> norm(exp(S.Z*real(MatrixFunctions.fm_schur_parlett_recurrence(log, S.T))*S.Z') - M)
1.1184124283538868e-15

julia> norm(exp(S.Z*LinearAlgebra.log_quasitriu(S.T)*S.Z') - M)
0.0068453336198366545

julia> real(MatrixFunctions.fm_schur_parlett_recurrence(log, S.T))
5×5 Matrix{Float64}:
 -1.09981e-15   0.117707      4.04932e-15  2.60264e-15   0.141422
 -0.117707     -1.09981e-15   1.77951e-15  3.03452e-16   0.0482574
  0.0           0.0           2.99457e-16  0.0544547    -0.0213389
  0.0           0.0          -0.0544547    2.99457e-16   0.0881419
  0.0           0.0           0.0          0.0           0.0

julia> LinearAlgebra.log_quasitriu(S.T)
5×5 Matrix{Float64}:
 -1.10068e-15   0.117707      4.06711e-15  2.69458e-15   0.143336
 -0.117707     -1.10068e-15   1.59252e-15  2.22818e-16   0.0419473
  0.0           0.0           2.9989e-16   0.0544547    -0.0195324
  0.0           0.0          -0.0544547    2.9989e-16    0.0885489
  0.0           0.0           0.0          0.0           0.0
aravindh-krishnamoorthy commented 2 months ago

Now, looking a bit more into the code, I'm confused.

The function _log_quasitriu! https://github.com/JuliaLang/julia/blob/b6d2155e65bbbf511a78aadf8e21fcec51e525c4/stdlib/LinearAlgebra/src/triangular.jl#L1947 uses https://github.com/JuliaLang/julia/blob/b6d2155e65bbbf511a78aadf8e21fcec51e525c4/stdlib/LinearAlgebra/src/triangular.jl#L1949 to compute the Padé approximation parameters m and s. However, the function https://github.com/JuliaLang/julia/blob/b6d2155e65bbbf511a78aadf8e21fcec51e525c4/stdlib/LinearAlgebra/src/triangular.jl#L1998 is based on [R1], which covers only triangular matrices, not quasi-triangular ones. Nevertheless, it is applied to both. Furthermore, the algorithm [16, Alg. 6.3] suggested in [R1] for T^(1/2) is replaced with _sqrt_quasitriu! from [R2] (according to the function header).
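For context, a sketch of the standard formulation behind [R1] (well-known background, not taken from the linked code): inverse scaling and squaring computes

$$\log(A) \approx 2^s \, r_m\!\left(A^{1/2^s} - I\right),$$

where $r_m$ is the diagonal $[m/m]$ Padé approximant to $\log(1+x)$. Taking $s$ successive square roots drives $A^{1/2^s}$ toward $I$ so that the order-$m$ approximant is accurate; the parameter search picks the cheapest valid $(s, m)$ pair.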

Question:

  • Has anyone studied whether the algorithm in [R1] works with quasi-triangular matrices?

Note: I've taken note of @oscardssmith's fix, where using s=1, m=5 yields a good solution instead of the s=0, m=6 obtained from _find_params_log_quasitriu!. However, I'm not sure that forcing a square root in every case is the right approach.

[R1] Al-Mohy, Awad H., and Higham, Nicholas J., "Improved Inverse Scaling and Squaring Algorithms for the Matrix Logarithm", 2011. https://eprints.maths.manchester.ac.uk/1687/1/paper11.pdf
[R2] Deadman, Edvin, Higham, Nicholas J., and Ralha, Rui, "Blocked Schur Algorithms for Computing the Matrix Square Root", 2012. https://eprints.maths.manchester.ac.uk/1926/1/EdvinDeadmanPARA2012.pdf

aravindh-krishnamoorthy commented 2 months ago
  • Has anyone studied whether the algorithm in [R1] works with quasi-triangular matrices?

This does not seem to be the immediate problem. However, using the diagonal elements in the following line seems wrong: https://github.com/JuliaLang/julia/blob/34b81fbc90a96c1db7f235a465d2cfdf5937e563/stdlib/LinearAlgebra/src/triangular.jl#L2030

Choosing d to be the eigenvalues of A seems a better choice for quasi-triangular matrices. The rest of the algorithm seems OK for quasi-triangular matrices.

Note: Nevertheless, for the given M in this issue, using eigenvalues instead of diagonal values does not change the computed parameters m = 6 and s = 0. This will need to be taken up separately.
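To illustrate the diagonal-versus-eigenvalues point (a sketch using the 3×3 quasi-triangular block from earlier in the thread): for quasi-triangular T, diag(T) misses the complex pair carried by a 2×2 block, while eigvals(T) returns the true spectrum.

using LinearAlgebra
qtriu = [0.9950041652780259 -0.09983341664682815 0.07153667745275474;
         0.09983341664682804  0.9950041652780259 0.06981527929449967;
         0.0                  0.0                1.0]
diag(qtriu)     # three real entries; not the eigenvalues of the 2x2 block
eigvals(qtriu)  # a complex-conjugate pair plus 1.0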

Next stop: double-checking the Padé approximation for m=6, s=0 as computed by the current implementation of _find_params_log_quasitriu!.

aravindh-krishnamoorthy commented 2 months ago

The issue is in the function https://github.com/JuliaLang/julia/blob/34b81fbc90a96c1db7f235a465d2cfdf5937e563/stdlib/LinearAlgebra/src/triangular.jl#L2169 where the case $s = 0$ is not handled. In fact, the function faithfully implements Algorithm 5.1 in [R1]. Unfortunately, however, Algorithm 5.1 in [R1] seems incomplete and applicable only for $s \geq 1$.

The following is a quick fix that works for this case (changes are made to the code in the above commit). I will have to think a bit more before suggesting a permanent solution.

 Base.@propagate_inbounds function _sqrt_pow_diag_block_2x2!(A, A0, s)
+    if iszero(s)
+        A[1,1] -= 1
+        A[2,2] -= 1
+        return
+    end
     _sqrt_real_2x2!(A, A0)
     if isone(s)
         A[1,1] -= 1
         A[2,2] -= 1
     else

With this change, we have

julia> norm(exp(S.Z*log_quasitriu(S.T)*S.Z') - M)
1.1192892392171394e-15
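For an end-to-end re-check of the patched code (a sketch; assumes a rebuilt stdlib and the original M from the top of the thread):

julia> norm(exp(log(M)) - M) / norm(M)  # expected to drop to ~1e-15 with the fix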
aravindh-krishnamoorthy commented 1 month ago

This code also looks good for the long term. @oscardssmith, would you like to update your PR's title/code and provide this fix instead? That would avoid duplication of effort, and you got started on this first...

Update 24-Oct-2024: I will create a PR with this fix.