JuliaDiff / ForwardDiff.jl

Forward Mode Automatic Differentiation for Julia
Other
891 stars 143 forks source link

Compatibility with Base linear algebra functions #111

Open rand5 opened 8 years ago

rand5 commented 8 years ago

Hi, I am interested in calculating the hessian of a function that requires calls to multiple other built-in and local functions. What is otherwise a run-able function quickly becomes incompatible with the Hessian type. Here is a simple example illustrating the issue:

function TestFun(x::Vector) Sigma = [x[1]^2 x[1] * x[2] * x[3]; x[1] * x[2] * x[3] x[2]^2]; lambda = eigvals(Sigma); ldV = sum(log(lambda)); return ldV end

Generate an arbitrary input vector:

W = rand(3)

Compute the hessian:

hess = ForwardDiff.hessian(TestFun, W)

Which produces the error,

LoadError: MethodError: eigvals! has no method matching eigvals!(::Array{ForwardDiff.HessianNumber{3,Float64,Tuple{Float64,Float64,Float64}},2}) while loading In[331], in expression starting on line 1

in eigvals at linalg/eigen.jl:92 in TestFun at In[329]:3 in _calc_hessian at C:\Users....julia\v0.4\ForwardDiff\src\api\hessian.jl:98 in hessian at C:\Users....julia\v0.4\ForwardDiff\src\api\hessian.jl:27

In this case, it is easy enough to avoid the call to eigvals by hardcoding the equations for the eigenvalues since Sigma is just a 2x2 matrix, but this is far from ideal for larger matrices and doesn't address the underlying difficulty of passing along Hessian type variables.

I am curious if anyone has thoughts on solutions/best practices. Any help is greatly appreciated. Thanks!

jrevels commented 8 years ago

Thanks for reporting this! Technically this isn't a case we support yet. ForwardDiff only works with native Julia code, and Julia's built-in eigvals implementation doesn't include a generic Julia version - it wraps LAPACK. This means that we can't pass ForwardDiff's special Julia number types through the function to collect derivative information.

Work is being done on a native Julia implementation for all these linear algebra functions (see LinearAlgebra.jl, cc @andreasnoack), but it's not finished yet. Once it is, we should be able to re-export that package from ForwardDiff so that these functions work for end-users without any extra effort.

If you have another way to get the Hessian of eigvals(Sigma) for your specific use case, #89 might be helpful to you.

baggepinnen commented 6 years ago

I tried eigvals! from LinearAlgebra.jl but encountered

julia> ForwardDiff.jacobian(LinearAlgebra.EigenGeneral.eigvals!, a)
ERROR: StackOverflowError:
Stacktrace:
 [1] sqrt(::Complex{ForwardDiff.Dual{9,Float64}}) at ./complex.jl:416 (repeats 47579 times)
 [2] #eigvals!#6(::Float64, ::Function, ::LinearAlgebra.EigenGeneral.Schur{ForwardDiff.Dual{9,Float64},Array{ForwardDiff.Dual{9,Float64},2}}) at /local/home/fredrikb/.julia/v0.6/LinearAlgebra/src/eigenGeneral.jl:231
 [3] eigvals!(::Array{ForwardDiff.Dual{9,Float64},2}) at /local/home/fredrikb/.julia/v0.6/LinearAlgebra/src/eigenGeneral.jl:212
 [4] vector_mode_jacobian(::LinearAlgebra.EigenGeneral.#eigvals!, ::Array{Float64,2}, ::ForwardDiff.JacobianConfig{9,Float64,Array{ForwardDiff.Dual{9,Float64},2}}) at /local/home/fredrikb/.julia/v0.6/ForwardDiff/src/jacobian.jl:75
 [5] jacobian at /local/home/fredrikb/.julia/v0.6/ForwardDiff/src/jacobian.jl:7 [inlined]
 [6] jacobian(::LinearAlgebra.EigenGeneral.#eigvals!, ::Array{Float64,2}) at /local/home/fredrikb/.julia/v0.6/ForwardDiff/src/jacobian.jl:6

Not sure if the error is here or there, but I found this already opened issue so decided to try here first =)

ChrisRackauckas commented 6 years ago

That is somewhat of a duplicate of https://github.com/JuliaDiff/ForwardDiff.jl/issues/157

dlfivefifty commented 4 years ago

It should be possible to find the Jacobian of eigenvalues of a symmetric matrix: I believe if A(c) is a symmetric matrix depending on a vector c then (generically) each eigenvalue λ_k satisfies

dλ_k/dc_j = q_k(c)'(dA/dc_j)*q_k(c)

where q_k(c) is the eigenvector associated to λ_k (This follows from differentiating q_k(c)'q_k(c) = 1 and λ_k = q_k(c)'A(c)*q_k(c).

I could try to make a PR to support this so that the following code would work:

julia> λ = c -> eigvals(Symmetric([c[1] 1.0; 1.0 c[2]]))
#3 (generic function with 1 method)

julia> λ([0.1,0.2])
2-element Array{Float64,1}:
 -0.8512492197250391
  1.1512492197250395

julia> ForwardDiff.jacobian(λ, [0.1,0.2])
ERROR: MethodError: no method matching eigvals!(::Symmetric{ForwardDiff.Dual{ForwardDiff.Tag{var"#3#4",Float64},Float64,2},Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#3#4",Float64},Float64,2},2}})
Closest candidates are:
  eigvals!(::SymTridiagonal{#s664,V} where V<:AbstractArray{#s664,1} where #s664<:Union{Float32, Float64}) at /Users/solver/Projects/julia-1.4/usr/share/julia/stdlib/v1.4/LinearAlgebra/src/tridiag.jl:245
  eigvals!(::SymTridiagonal{#s664,V} where V<:AbstractArray{#s664,1} where #s664<:Union{Float32, Float64}, ::UnitRange) at /Users/solver/Projects/julia-1.4/usr/share/julia/stdlib/v1.4/LinearAlgebra/src/tridiag.jl:248
  eigvals!(::SymTridiagonal{#s664,V} where V<:AbstractArray{#s664,1} where #s664<:Union{Float32, Float64}, ::Real, ::Real) at /Users/solver/Projects/julia-1.4/usr/share/julia/stdlib/v1.4/LinearAlgebra/src/tridiag.jl:253
  ...
Stacktrace:
 [1] eigvals(::Symmetric{ForwardDiff.Dual{ForwardDiff.Tag{var"#3#4",Float64},Float64,2},Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#3#4",Float64},Float64,2},2}}) at /Users/solver/Projects/julia-1.4/usr/share/julia/stdlib/v1.4/LinearAlgebra/src/symmetric.jl:723
 [2] (::var"#3#4")(::Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#3#4",Float64},Float64,2},1}) at ./REPL[2]:1
 [3] vector_mode_dual_eval(::var"#3#4", ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#3#4",Float64},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#3#4",Float64},Float64,2},1}}) at /Users/solver/.julia/packages/ForwardDiff/cXTw0/src/apiutils.jl:37
 [4] vector_mode_jacobian(::var"#3#4", ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#3#4",Float64},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#3#4",Float64},Float64,2},1}}) at /Users/solver/.julia/packages/ForwardDiff/cXTw0/src/jacobian.jl:140
 [5] jacobian(::Function, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#3#4",Float64},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#3#4",Float64},Float64,2},1}}, ::Val{true}) at /Users/solver/.julia/packages/ForwardDiff/cXTw0/src/jacobian.jl:17
 [6] jacobian(::Function, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#3#4",Float64},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#3#4",Float64},Float64,2},1}}) at /Users/solver/.julia/packages/ForwardDiff/cXTw0/src/jacobian.jl:15 (repeats 2 times)
 [7] top-level scope at REPL[4]:1
dlfivefifty commented 4 years ago

Though is this package the right place for this? I seem to remember there's a ForwardDiff.jl replacement in the works...

dlfivefifty commented 4 years ago

It turned out to be pretty easy to implement:

import ForwardDiff: jacobian, Dual
import LinearAlgebra: eigvals

function eigvals(A::Symmetric{<:Dual{Tg,T,N}}) where {Tg,T<:Real,N}
    λ,Q = eigen(Symmetric(getproperty.(parent(A), :value)))
    partials = ntuple(j -> diag(Q' * getindex.(getproperty.(A, :partials), j) * Q), N)
    Dual{Tg}.(λ, tuple.(partials...))
end

This seems to work:

julia> A = c -> Symmetric([c[1] c[1]; 0 c[2]+2c[3]]);

julia> λ = c -> eigvals(A(c));

julia> jacobian(λ, [0.1,0.2,0.3])
2×3 Array{Float64,2}:
 0.706041  0.019238  0.0384761
 0.293959  0.980762  1.96152

julia> h = 0.00001; (λ([0.1+h,0.2,0.3]) - λ([0.1,0.2,0.3]))/h
2-element Array{Float64,1}:
 0.7060242588924347
 0.2939757411057897

julia> h = 0.00001; (λ([0.1,0.2+h,0.3]) - λ([0.1,0.2,0.3]))/h
2-element Array{Float64,1}:
 0.019237767014124163
 0.9807622329716102

julia> h = 0.00001; (λ([0.1,0.2,0.3+h]) - λ([0.1,0.2,0.3]))/h
2-element Array{Float64,1}:
 0.038475015702588156
 1.9615249842841462

Thanks @marcusdavidwebb for the pointer on differentiating eigenvalues.

Would a PR into ForwardDiff.jl be appropriate?

dlfivefifty commented 4 years ago

And this gives the eigenvectors:

# A ./ (λ - λ') but with diag special cased
function _lyap_div!(A, λ)
    for (j,μ) in enumerate(λ), (k,λ) in enumerate(λ)
        if k ≠ j
            A[k,j] /= μ - λ
        end
    end
    A
end

function eigen(A::Symmetric{<:Dual{Tg,T,N}}) where {Tg,T<:Real,N}
    λ = eigvals(A)
    _,Q = eigen(Symmetric(value.(parent(A))))
    parts = ntuple(j -> Q*_lyap_div!(Q' * getindex.(partials.(A), j) * Q - Diagonal(getindex.(partials.(λ), j)), value.(λ)), N)
    Eigen(λ,Dual{Tg}.(Q, tuple.(parts...)))
end

E.g.:

julia> A = c -> Symmetric([c[1] c[1]; 0 c[2]+2c[3]]);

julia> Q = c -> vec(eigen(A(c)).vectors);

julia> jacobian(Q, [0.1,0.2,0.3])
4×3 Array{Float64,2}:
  0.20936  -0.02617   -0.0523401
  1.49484  -0.186856  -0.373711
  1.49484  -0.186856  -0.373711
 -0.20936   0.02617    0.0523401

julia> h = 0.00001; (Q([0.1+h,0.2,0.3]) - Q([0.1,0.2,0.3]))/h
4-element Array{Float64,1}:
  0.20937278679689084
  1.4948510676571212
  1.4948510676571212
 -0.20937278679689084
rlk8 commented 4 years ago

Elegant solution.

Lightup1 commented 2 years ago

Is there a way to calculate differentiation of f(eigvals(::Hermitian)) where f(x) is pure julia? @dlfivefifty

dlfivefifty commented 2 years ago

It should just work?

Lightup1 commented 2 years ago

I just got this error message:

ERROR: MethodError: no method matching eigvals!(::Hermitian{ForwardDiff.Dual{ForwardDiff.Tag{var"#102#103"{Float64, Float64, PositionBasis{-54.0, 9.0, Int64, Float64}, Vector{Float64}, PreallocationTools.DiffCache{Matrix{ComplexF64}, Vector{ComplexF64}}}, Float64}, Float64, 1}, Matrix{ForwardDiff.Dual{ForwardDiff.Tag{var"#102#103"{Float64, Float64, PositionBasis{-54.0, 9.0, Int64, Float64}, Vector{Float64}, PreallocationTools.DiffCache{Matrix{ComplexF64}, Vector{ComplexF64}}}, Float64}, Float64, 1}}})

I use dualcache to cache the large matrix, and its diagonal element is depend on a real parameter a2

function smallest_gap(a2::T, a3, a4, basis_l, diagH, Hdata::PreallocationTools.DiffCache) where {T}
    Hdata=get_tmp(Hdata,a2)
    dx = spacing(basis_l)
    xmin = basis_l.xmin
    # potentialf=x->potential(x,a2, a3, a4)
    # Hview=reinterpret(reshape,T,Hdata)
    @inbounds for i in eachindex(diagH)
        Hdata[i, i] = diagH[i] + potential(xmin + (i - 1) * dx,a2, a3, a4)
    end 
    eigcache = eigvals(Hermitian(Hdata))
    smallest_gap(a2, eigcache)
end