Bump compat for Enzyme.jl to 0.11.2 or higher

toollu commented 1 year ago

I'm hitting Enzyme.jl #644 via optprob = OptimizationFunction(foo, Optimization.AutoEnzyme(), cons = cons). The Enzyme fix is not able to land on master here due to d7a4945. Any information or reasoning on that so I can try to help out?

Vaibhavdixit02 commented 1 year ago

Enzyme 0.11.2 and higher introduce https://github.com/EnzymeAD/Enzyme.jl/issues/931 when computing Hessians, hence the compat was upper-bounded.

Can you show the code you have that hits https://github.com/EnzymeAD/Enzyme.jl/issues/644? Maybe it can be rewritten to avoid that.

toollu commented 1 year ago

Can you show the code you have that hits EnzymeAD/Enzyme.jl#644? Maybe it can be rewritten to avoid that.

Sure. The package we are developing is quite complex, so I tried to reduce it to the minimum that represents what we are trying to achieve. Consider the following module and variables:

module Nodes

abstract type Node end

struct Source{T<:Real} <: Node
    output::Vector{T}
end

struct Multiplicator{T<:Real} <: Node
    factors::Vector{T}
    output::Vector{T}
    upstream_node::Node
end

struct Sink{T<:Real} <: Node
    target_value::Vector{T}
    target_fraction::Vector{T}
    upstream_node::Node
end

function calculate(node::Multiplicator)
    @. node.output = (node.factors) * node.upstream_node.output
end

function calculate(node::Sink)
    @. node.target_fraction = node.upstream_node.output / node.target_value
end

end

begin
    using Enzyme
    using .Nodes
    using Statistics
    Enzyme.API.runtimeActivity!(true)

    const system_size = 1000

    node_1 = Nodes.Source(fill(10.0,system_size))
    node_2 = Nodes.Multiplicator(randn(system_size).+1,zeros(system_size),node_1)
    node_3 = Nodes.Sink(fill(11.0,system_size),zeros(system_size), node_2)

    factor_matrix = [fill(1.2, system_size) fill(0.9, system_size) fill(1.0, system_size) fill(1.1, system_size)]
    factor_matrix .+= randn(system_size, 4) .* 0.1
end

So on Enzyme 0.11.2, the following:


x = [0.2, 0.2, 0.2, 0.4]
dx = zeros(4)

function vec_input_target(weights)::Float64
    factors = factor_matrix*weights
    node_2.factors .= factors

    Nodes.calculate(node_2)
    Nodes.calculate(node_3)

    y = abs(1 - mean(node_3.target_fraction))

    return y
end

autodiff(Reverse,vec_input_target,Active,Duplicated(x,dx))

Actually works. On 0.11.0, however, it fails with the same error as:

begin
    using Optimization, OptimizationMOI, OptimizationOptimJL, Ipopt
    using ForwardDiff, ModelingToolkit, Enzyme
end
cons(res, x, p) = (res .= [sum(x), x[1], x[2], x[3], x[4]])

lb = [1.0, 0.0, 0.0, 0.0, 0.0]
ub = [1.0, 1.0, 1.0, 1.0, 1.0]

optprob = OptimizationFunction((x,p)->vec_input_target(x), Optimization.AutoEnzyme(), cons = cons)
# NOTE: `facs` is not defined anywhere in this snippet; the objective above ignores `p`.
prob = OptimizationProblem(optprob, x, facs, lcons = lb, ucons = ub)
sol = solve(prob, Ipopt.Optimizer()) # Enzyme execution failed. Enzyme: not yet implemented in reverse mode, jl_getfield

An additional problem seems to be that the necessary closure in OptimizationFunction introduces a type instability, so vec_input_target(weights) needs the Float64 return type annotation in order not to hit EnzymeAD/Enzyme.jl#741. But I'll open a separate issue on that if it still persists after this one is closed. Julia Discourse Xref
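On the question of whether the reproduction can be rewritten: the `jl_getfield` in the error message is a dynamic field lookup, and one plausible contributor (an assumption, not confirmed in this thread) is the abstractly typed `upstream_node::Node` field. The sketch below uses a hypothetical `ConcreteNodes` module with an extra type parameter `U` so that the upstream node's type is concrete. Whether this avoids the Enzyme error on 0.11.0 has not been tested here.

```julia
module ConcreteNodes

abstract type Node end

struct Source{T<:Real} <: Node
    output::Vector{T}
end

# `U` records the concrete type of the upstream node, so the compiler
# knows statically what `node.upstream_node.output` is.
struct Multiplicator{T<:Real,U<:Node} <: Node
    factors::Vector{T}
    output::Vector{T}
    upstream_node::U
end

function calculate(node::Multiplicator)
    @. node.output = node.factors * node.upstream_node.output
    return node.output
end

end # module

src = ConcreteNodes.Source(fill(10.0, 3))
mul = ConcreteNodes.Multiplicator([1.0, 2.0, 3.0], zeros(3), src)
ConcreteNodes.calculate(mul)

# Check that the field type is concrete, unlike a field declared as `::Node`.
upstream_is_concrete = isconcretetype(fieldtype(typeof(mul), :upstream_node))
```

The default constructor infers both `T` and `U` from its arguments, so call sites such as `Multiplicator(factors, output, upstream)` stay unchanged.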

wsmoses commented 1 year ago

@Vaibhavdixit02 what code are you referring to by "Enzyme 0.11.2 and higher introduce https://github.com/EnzymeAD/Enzyme.jl/issues/931 when computing hessians hence it was bounded"?

That particular bug was in GPUCompiler (a fix has landed for that upstream), and requires a very specific input type condition (which unfortunately applied for that issue's MarcusHushChidseyDOS{Float64}). I think it unlikely (but not impossible) you're conflating distinct issues, so I'm curious what you are referring to.

wsmoses commented 1 year ago

Separately, you should not be making closures like here: https://github.com/SciML/Optimization.jl/blob/edad8ddb1ea2507da54b46e7839dc5dfc3b56ac6/ext/OptimizationEnzymeExt.jl#L16 (this is a potential source of type instabilities and other issues). Instead, pass the extra args directly in the relevant autodiff call.
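A minimal sketch of that pattern, assuming a hypothetical objective `loss(x, p)` with made-up values (this is not the OptimizationEnzymeExt code): the extra argument is handed to `autodiff` as a `Const` instead of being captured in a closure.

```julia
using Enzyme

# Hypothetical objective with an explicit parameter argument `p`.
function loss(x, p)
    acc = zero(eltype(x))
    for i in eachindex(x, p)
        acc += p[i] * x[i]^2
    end
    return acc
end

x  = [0.2, 0.2, 0.2, 0.4]
dx = zeros(4)               # gradient accumulator, must start at zero
p  = [1.0, 2.0, 3.0, 4.0]

# Avoided: autodiff(Reverse, x -> loss(x, p), Active, Duplicated(x, dx))
# Here `p` is passed as a constant (non-differentiated) argument.
autodiff(Reverse, loss, Active, Duplicated(x, dx), Const(p))

# dx now holds the gradient with respect to x, analytically 2 .* p .* x
```

Because `loss` is a plain top-level function and `p` is an ordinary argument, no global or captured variable is involved in the differentiated call.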

Vaibhavdixit02 commented 1 year ago

That particular bug was in GPUCompiler (a fix has landed for that upstream), and requires a very specific input type condition (which unfortunately applied for that issue's MarcusHushChidseyDOS{Float64}). I think it unlikely (but not impossible) you're conflating distinct issues, so I'm curious what you are referring to.

I see! https://github.com/SciML/Optimization.jl/actions/runs/5521696377/jobs/10070185119#step:7:792 was the error. It comes up when doing the hessian call here: https://github.com/SciML/Optimization.jl/blob/master/test/ADtests.jl#L70-L75. The implementation for the hessian is taken from the Enzyme docs, specifically the vector forward-over-reverse part, and is here: https://github.com/SciML/Optimization.jl/blob/edad8ddb1ea2507da54b46e7839dc5dfc3b56ac6/ext/OptimizationEnzymeExt.jl#L23-L43

I can try to come up with a MWE if this is too deep in this package 🙏

Vaibhavdixit02 commented 1 year ago

(this is a potential source of type instabilities and more issues), but passing the extra args directly, in the relevant autodiff call.

Sure, I think I had tried that early on but failed to get it to work. I'll give it another try, thanks for the suggestion!

wsmoses commented 1 year ago

Here, I revised the tutorial to use a better method of computing hessians. https://enzyme.mit.edu/index.fcgi/julia/dev/generated/autodiff/

SciML / Optimization.jl

Bump compat for Enzyme.jl to 0.11.2 or higher #564