JuliaMath / Calculus.jl

Calculus functions in Julia
Other
275 stars 78 forks source link

Calculus.hessian is not symmetric #91

Open RossBoylan opened 8 years ago

RossBoylan commented 8 years ago
julia> @time h10 = Calculus.hessian(RB.mylike, RB.troubleParamsSim1.raw)
 209.557745 seconds (2.43 G allocations: 83.867 GB, 10.31% gc time)
 10x10 Array{Float64,2}:
    566.715     -206.282      -50.7658   …    3135.03        -89.8427
   -206.282      566.715       17.7032       -1993.75         47.1135
    -50.7658      17.7032     143.437        -1163.21         33.7016
     17.7032     -50.7658     -47.7584         282.533         1.3897
    182.291     -122.875      -12.5876       -2943.26         27.7885
     -2.95777     -5.09084      2.97638  …    -452.285        15.1723
    -45.7494      12.5752      39.2448           0.613878     24.8811
  -1942.38       804.496      450.315       -35319.9        1935.28
   3135.03     -1993.75     -1163.21         56057.6       -2498.87
    -89.8427      47.1135      33.7016       -2498.87       1041.54

 julia> issym(h10)
 false

Closer inspection of h10 shows all the errors are extremely small; the largest is 4e-8.

My original assumption was that this was the result of numerical noise from calculating the derivatives in different orders. But the code (finite_difference_hessian! in finite_difference.jl) appears to compute only the upper triangle,

   for i = 1:n
       #code omitted
        # and i+1:n appears to be parses (i+1):n, which is right
        for j = i+1:n

and then ends with

    Base.LinAlg.copytri!(H,'U')

So maybe the problem is in copytri!.

RossBoylan commented 8 years ago

Same problem after updating to the latest Julia 0.4 code.

   | | |_| | | | (_| |  |  Version 0.4.7-pre+1 (2016-06-19 17:17 UTC)
  _/ |\__'_|_|_|\__'_|  |  Commit 57d0834 (5 days old release-0.4)
 |__/                   |  x86_64-linux-gnu
RossBoylan commented 8 years ago
julia> h11=copy(h10)
 10x10 Array{Float64,2}:
 julia> Base.LinAlg.copytri!(h11, 'U')
 10x10 Array{Float64,2}:

 julia> issym(h11)
 true

 julia> issym(h10)
 false

So the result of copytri! is symmetric. This suggests the problem is elsewhere, or perhaps the code I thought was doing the work isn't.

johnmyleswhite commented 8 years ago

Digging into this to find the exact source of asymmetry would lead to a great pull request.

RossBoylan commented 8 years ago

Using Debug and StackTraces (is there a better way?) I think I may see the problem. The call seems to be routed to the evaluation of the Jacobian of the numerical gradient. The apparent fix would be to send it to finite_difference_hessian{T <: Number}(f::Function, x::Vector{T}) instead. Is there any reason that wasn't being done, e.g., that code is broken?

There are a number of function definitions for hessian in derivative.jl that have no gradient argument and create one numerically. If the suggested fix is safe, it probably should be done for all.

Even if the use of the jacobian worked it appears to involve almost twice as much work as necessary, since it evaluates the derivates for both i, j and j,i. And even if one is using a real gradient, isn't using the current jacobian evaluation likely to be inefficient and probably asymmetric?

It looks as if the call goes to

function hessian{T <: Number}(f::Function, x::Vector{T})
  finite_difference_hessian(f, gradient(f), x, :central)  # derivative.jl:80
end

# leads to
function finite_difference_hessian{T <: Number}(f::Function,
                                                g::Function,
                                                x::Vector{T},
                                                dtype::Symbol = :central)
    finite_difference_jacobian(g, x, dtype)
end

#trace, see qualifications later
StackTraces.StackFrame(:anonymous,symbol("no file"),0,symbol(""),-,false,139887623193497)
StackTraces.StackFrame(:eval,symbol("/home/ross/PCORI/trouble.jl"),1,symbol(""),-,false,139887623176800)
StackTraces.StackFrame(:debug_eval,symbol("/home/ross/.julia/v0.4/Debug/src/Eval.jl"),23,symbol(""),-,false,139887623098583)
StackTraces.StackFrame(:eval_in_scope,symbol("/home/ross/.julia/v0.4/Debug/src/UI.jl"),106,symbol(""),-1,false,139887623096016)
StackTraces.StackFrame(:trap,symbol("/home/ross/.julia/v0.4/Debug/src/UI.jl"),82,symbol(""),-1,false,139887623085585)
StackTraces.StackFrame(:logitlite,symbol("/home/ross/PCORI/trouble.jl"),324,symbol("no file"),0,false,139887623175194)
StackTraces.StackFrame(:mylike,symbol("/home/ross/PCORI/trouble.jl"),493,symbol(""),-,false,139887623171845)
StackTraces.StackFrame(:finite_difference!,symbol("/home/ross/.julia/v0.4/Calculus/src/finite_difference.jl"),126,symbol(""),-,false,139887624133468)
StackTraces.StackFrame(:finite_difference,symbol("/home/ross/.julia/v0.4/Calculus/src/finite_difference.jl"),145,symbol(""),-,false,139887624132974)
StackTraces.StackFrame(:anonymous,symbol("/home/ross/.julia/v0.4/Calculus/src/derivative.jl"),5,symbol("no file"),0,false,139887624132686)
StackTraces.StackFrame(:hessian,symbol("/home/ross/.julia/v0.4/Calculus/src/derivative.jl"),80,symbol(""),-,false,139887624131729)
StackTraces.StackFrame(:eval_user_input,symbol("REPL.jl"),62,symbol(""),-,false,13988762622508# etc

The qualification is that the jacobian function does not appear in the stack trace. I put the breakpoint in the start of my objective function, as a result of which it's called in the first evaluation of the gradient. So I think the stack is that for the closure in which the gradient function is defined via

function derivative(f::Function, ftype::Symbol, dtype::Symbol)
  if ftype == :scalar
    return x::Number -> finite_difference(f, float(x), dtype)
  elseif ftype == :vector
    return x::Vector -> finite_difference(f, float(x), dtype)
  else
    error("ftype must :scalar or :vector")
  end
end

At the top of derivative.jl (the line after:vector does appear in the call stack).

RossBoylan commented 8 years ago
# in test/derivative.jl
# PR 91
f6(x) = sin(x[1]^2+3x[2]^4)
@test issym(Calculus.hessian(f6, [1.0, 2.0]))

Moving on to solutions... Interestingly, the f5 function that is there now passes the test.

pkofod commented 7 years ago

I've been testing the finite difference functionality for Hessians in https://github.com/JuliaOpt/Optim.jl/pull/337 and I agree that there's a problem.

https://gist.github.com/pkofod/b4e0338265fac7ab1f4aa87a7d3d655f

So the problem I posted has a diagonal matrix as its Hessian, so something is off here. How can it create off-diagonal entries (larger in magnitude than the "real" entries"), when the cross derivatives should be exactly zero? My guess is that the Hessian is not calculated according to the usual method of calculating all combinations, but it rather uses some method where it changes many values at a time? Then maybe it should be possible to switch to the naïve/slow method.

I don't have time to look through the Calculus code right now, sorry. If it doesn't get resolved for a while, I might be able to set some time aside at some point.