FluxML / Zygote.jl

21st century AD
https://fluxml.ai/Zygote.jl/
Other
1.47k stars 209 forks source link

incorrect jacobian(eigvals, A) if A happens to be symmetric #1369

Open stevengj opened 1 year ago

stevengj commented 1 year ago

Zygote.jacobian(eigvals, A) returns an incorrect answer if A happens to be symmetric: it returns an erroneous column of zeros:

julia> using LinearAlgebra, Zygote

julia> Zygote.jacobian(eigvals, [1 2; 2 + 1e-13 3])[1]    # not quite symmetric — correct answer
2×4 Matrix{Float64}:
 0.723607  -0.447214  -0.447214  0.276393
 0.276393   0.447214   0.447214  0.723607

julia> Zygote.jacobian(eigvals, [1 2; 2 3])[1]    # symmetric — incorrect column of zeros
2×4 Matrix{Float64}:
 0.723607  0.0  -0.894427  0.276393
 0.276393  0.0   0.894427  0.723607

I think what's happening is related to the fact that Julia's eigvals(A) checks ishermitian(A) and if so calls an optimized eigvals(Hermitian(A)) function that only looks at the upper triangle of A. Zygote apparently looks at this and erroneously concludes that the derivative with respect to the lower triangle is zero, when in fact if you perturb the lower triangle Julia will call a different non-Hermitian algorithm.

(I'm not sure if the bug is in Zygote or ChainRules, to be honest.)

mcabbott commented 1 year ago

I think this is from ChainRules:

julia> using ChainRules, Zygote

julia> ChainRules.rrule(eigvals, [1 2; 2 + 1e-13 3])[2]([0 1; 0 1])[2]
2×2 Matrix{Float64}:
 0.276393  0.447214
 0.447214  0.723607

julia> ChainRules.rrule(eigvals, [1 2; 2 3])[2]([0 1; 0 1])[2]
2×2 Matrix{Float64}:
 0.0  0.0
 0.0  1.0

probably here (from https://github.com/JuliaDiff/ChainRules.jl/pull/323):

https://github.com/JuliaDiff/ChainRules.jl/blob/1f4a8a9d86c79f024a911f61aa180bdc094bb8a3/src/rulesets/LinearAlgebra/factorization.jl#L318

Searching for ishermitian finds other similar checks.

ForwardDiff more often differentiates the implementation and is (or was) more prone to this problem, of promoting an accidental zero (or symmetry) into a structural one. https://github.com/JuliaDiff/ForwardDiff.jl/issues/480 is the issue about this.

stevengj commented 1 year ago

Ah, thanks. cc @sethaxen.

Would be good to file an issue in ChainRules then, but would be even better to file a PR to fix the problem. 😉

sethaxen commented 1 year ago

Agreed, this should definitely be fixed. Probably as simple as using a hermitian projection (which is I think a utility function elsewhere in ChainRules) instead of using triu!.