jump-dev / JuMP.jl

Modeling language for Mathematical Optimization (linear, mixed-integer, conic, semidefinite, nonlinear)
http://jump.dev/JuMP.jl/

NLP Interface Error Message with Registered Functions #2910

Closed. ccoffrin closed this issue 2 years ago.

ccoffrin commented 2 years ago

This is a case where the user makes an implementation error and the error message from JuMP could be improved.

In the following example the user means to write @NLobjective(m, Min, f2(x[1],x[2])) in the second model but writes @NLobjective(m, Min, f(x[1],x[2])) instead. This results in a MethodError deep inside JuMP's NLP implementation. Oddly, if the function f is sufficiently simple, the issue does not occur.

MWE

using JuMP
using Ipopt

# works without a problem
# function f(x...)
#     return x[1]^2 + x[2]^2 + 2
# end

# a function of this style of complexity is required to produce the issue
function f(x...)
    mf = Model(Ipopt.Optimizer)

    @variable(mf, y[1:2])
    @NLobjective(mf, Max, y[1]*x[1] + y[2]*x[2] - x[1]*y[1]^4 - 2*x[2]*y[2]^4)
    @constraint(mf, (y[1]-10)^2 + (y[2]-10)^2 <= 25)
    optimize!(mf)

    return objective_value(mf)
end

m = Model(Ipopt.Optimizer)
@variable(m, x[1:2]>=0)
JuMP.register(m, :f, 2, f, autodiff=true)
@NLobjective(m, Min, f(x[1],x[2]))
optimize!(m)

function f2(x...)
    return x[1]^2 + x[2]^2 + 2
end

m = Model(Ipopt.Optimizer)
@variable(m, x[1:2]>=0)
JuMP.register(m, :f2, 2, f2, autodiff=true)
# NOTE intentional bug to produce the error, user puts `f` instead of `f2`
@NLobjective(m, Min, f(x[1],x[2]))
optimize!(m)

With Julia v1.7 and JuMP v0.23.1 I get:

ERROR: LoadError: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2})
...
Stacktrace:
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2})
    @ Base ./number.jl:7
  [2] push!(a::Vector{Float64}, item::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2})
    @ Base ./array.jl:994
  [3] _parse_NL_expr_runtime(m::Model, x::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}, tape::Vector{JuMP._Derivatives.NodeData}, parent::Int64, values::Vector{Float64})
    @ JuMP ~/.julia/packages/JuMP/R2Knd/src/parse_nlp.jl:412
  [4] macro expansion
    @ ~/.julia/packages/JuMP/R2Knd/src/parse_nlp.jl:544 [inlined]
  [5] macro expansion
    @ ~/.julia/packages/JuMP/R2Knd/src/macros.jl:2057 [inlined]
  [6] f(::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}, ::Vararg{ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}})
...
odow commented 2 years ago

This has nothing to do with f2. Your first example will not work because f(x...) is not differentiable.

julia> ForwardDiff.gradient(f, [0.0, 0.0])
ERROR: Unexpected array ForwardDiff.Dual{ForwardDiff.Tag{typeof(f), Float64}, Float64, 2}[Dual{ForwardDiff.Tag{typeof(f), Float64}}(0.0,1.0,0.0), Dual{ForwardDiff.Tag{typeof(f), Float64}}(0.0,0.0,1.0)] in nonlinear expression. Nonlinear expressions may contain only scalar expressions.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] _parse_NL_expr_runtime(m::Model, x::Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(f), Float64}, Float64, 2}}, tape::Vector{JuMP._Derivatives.NodeData}, parent::Int64, values::Vector{Float64})
    @ JuMP ~/.julia/packages/JuMP/Hc1qn/src/parse_nlp.jl:457
  [3] macro expansion
    @ ~/.julia/packages/JuMP/Hc1qn/src/parse_nlp.jl:544 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/JuMP/Hc1qn/src/macros.jl:2057 [inlined]
  [5] f(x::Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(f), Float64}, Float64, 2}})
    @ Main ./REPL[13]:4
  [6] vector_mode_dual_eval!(f::typeof(f), cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(f), Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(f), Float64}, Float64, 2}}}, x::Vector{Float64})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/PBzup/src/apiutils.jl:37
  [7] vector_mode_gradient(f::typeof(f), x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(f), Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(f), Float64}, Float64, 2}}})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/PBzup/src/gradient.jl:106
  [8] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(f), Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(f), Float64}, Float64, 2}}}, ::Val{true})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/PBzup/src/gradient.jl:19
  [9] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(f), Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(f), Float64}, Float64, 2}}}) (repeats 2 times)
    @ ForwardDiff ~/.julia/packages/ForwardDiff/PBzup/src/gradient.jl:17
 [10] top-level scope
    @ REPL[24]:1
julia> using JuMP

julia> using Ipopt

julia> function f(x...)
           mf = Model(Ipopt.Optimizer)
           @variable(mf, y[1:2])
           @NLobjective(mf, Max, y[1]*x[1] + y[2]*x[2] - x[1]*y[1]^4 - 2*x[2]*y[2]^4)
           @constraint(mf, (y[1]-10)^2 + (y[2]-10)^2 <= 25)
           optimize!(mf)
           return objective_value(mf)
       end
f (generic function with 1 method)

julia> m = Model(Ipopt.Optimizer)
A JuMP Model
Feasibility problem with:
Variables: 0
Model mode: AUTOMATIC
CachingOptimizer state: EMPTY_OPTIMIZER
Solver name: Ipopt

julia> @variable(m, x[1:2] >= 0)
2-element Vector{VariableRef}:
 x[1]
 x[2]

julia> JuMP.register(m, :f, 2, f, autodiff=true)

julia> @NLobjective(m, Min, f(x[1], x[2]))

julia> optimize!(m)
This is Ipopt version 3.14.4, running with linear solver MUMPS 5.4.1.

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        0

This is Ipopt version 3.14.4, running with linear solver MUMPS 5.4.1.

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        4
Number of nonzeros in Lagrangian Hessian.............:        4

Total number of variables............................:        2
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        1
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        1

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  0.0000000e+00 1.75e+02 9.99e-01  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  0.0000000e+00 3.72e+01 9.45e-03  -1.0 4.42e+00    -  9.91e-01 1.00e+00h  1
   2  0.0000000e+00 0.00e+00 1.47e-03  -1.0 9.96e+00    -  1.00e+00 1.00e+00h  1
   3  0.0000000e+00 1.14e+03 8.53e-02  -1.0 3.62e+02    -  1.00e+00 1.00e+00f  1
   4  0.0000000e+00 1.79e+02 2.08e-02  -1.0 1.61e+02    -  1.00e+00 1.00e+00h  1
   5  0.0000000e+00 0.00e+00 3.47e-03  -1.0 7.06e+01    -  1.00e+00 1.00e+00h  1
   6  0.0000000e+00 0.00e+00 3.00e-03  -1.0 9.80e+01    -  1.00e+00 1.00e+00h  1
   7  0.0000000e+00 0.00e+00 6.27e-03  -1.0 1.07e+01    -  1.00e+00 1.00e+00h  1
   8  0.0000000e+00 0.00e+00 2.25e-03  -1.0 3.99e+00    -  1.00e+00 1.00e+00h  1
   9  0.0000000e+00 0.00e+00 4.68e-04  -1.0 4.38e+00    -  1.00e+00 1.00e+00h  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  0.0000000e+00 0.00e+00 7.98e-05  -1.7 3.24e-01    -  1.00e+00 1.00e+00h  1
  11  0.0000000e+00 0.00e+00 5.69e-07  -3.8 3.41e-04    -  1.00e+00 1.00e+00h  1
  12  0.0000000e+00 0.00e+00 6.88e-09  -5.7 5.78e-04    -  1.00e+00 1.00e+00h  1
  13  0.0000000e+00 0.00e+00 9.33e-12  -8.6 6.33e-05    -  1.00e+00 1.00e+00h  1

Number of Iterations....: 13

                                   (scaled)                 (unscaled)
Objective...............:  -0.0000000000000000e+00    0.0000000000000000e+00
Dual infeasibility......:   9.3292630065773811e-12    9.3292630065773811e-12
Constraint violation....:   0.0000000000000000e+00    0.0000000000000000e+00
Variable bound violation:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   2.5050826949617035e-09    2.5050826949617035e-09
Overall NLP error.......:   2.5050826949617035e-09    2.5050826949617035e-09

Number of objective function evaluations             = 14
Number of objective gradient evaluations             = 14
Number of equality constraint evaluations            = 0
Number of inequality constraint evaluations          = 14
Number of equality constraint Jacobian evaluations   = 0
Number of inequality constraint Jacobian evaluations = 14
Number of Lagrangian Hessian evaluations             = 13
Total seconds in IPOPT                               = 0.007

EXIT: Optimal Solution Found.
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2})
Closest candidates are:
  (::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
  (::Type{T})(::T) where T<:Number at boot.jl:760
  (::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number} at char.jl:50
  ...
Stacktrace:
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2})
    @ Base ./number.jl:7
  [2] push!(a::Vector{Float64}, item::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2})
    @ Base ./array.jl:928
  [3] _parse_NL_expr_runtime(m::Model, x::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}, tape::Vector{JuMP._Derivatives.NodeData}, parent::Int64, values::Vector{Float64})
    @ JuMP ~/.julia/packages/JuMP/Hc1qn/src/parse_nlp.jl:412
  [4] macro expansion
    @ ~/.julia/packages/JuMP/Hc1qn/src/parse_nlp.jl:544 [inlined]
  [5] macro expansion
    @ ~/.julia/packages/JuMP/Hc1qn/src/macros.jl:2057 [inlined]
  [6] f(::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}, ::Vararg{ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}, N} where N)
    @ Main ./REPL[13]:4
  [7] (::JuMP.var"#132#134"{typeof(f)})(x::Vector{ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}})
    @ JuMP ~/.julia/packages/JuMP/Hc1qn/src/nlp.jl:1869
  [8] vector_mode_dual_eval!(f::JuMP.var"#132#134"{typeof(f)}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}}}, x::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/PBzup/src/apiutils.jl:37
  [9] vector_mode_gradient!(result::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true}, f::JuMP.var"#132#134"{typeof(f)}, x::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}}})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/PBzup/src/gradient.jl:113
 [10] gradient!(result::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true}, f::JuMP.var"#132#134"{typeof(f)}, x::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}}}, ::Val{true})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/PBzup/src/gradient.jl:37
 [11] gradient!(result::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true}, f::JuMP.var"#132#134"{typeof(f)}, x::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}}})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/PBzup/src/gradient.jl:35
 [12] (::JuMP.var"#133#135"{ForwardDiff.GradientConfig{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#132#134"{typeof(f)}, Float64}, Float64, 2}}}, JuMP.var"#132#134"{typeof(f)}})(out::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true}, y::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true})
    @ JuMP ~/.julia/packages/JuMP/Hc1qn/src/nlp.jl:1871
 [13] eval_objective_gradient(d::JuMP._UserFunctionEvaluator, grad::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true}, x::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true})
    @ JuMP ~/.julia/packages/JuMP/Hc1qn/src/nlp.jl:1860
 [14] forward_eval(storage::Vector{Float64}, partials_storage::Vector{Float64}, nd::Vector{JuMP._Derivatives.NodeData}, adj::SparseArrays.SparseMatrixCSC{Bool, Int64}, const_values::Vector{Float64}, parameter_values::Vector{Float64}, x_values::Vector{Float64}, subexpression_values::Vector{Float64}, user_input_buffer::Vector{Float64}, user_output_buffer::Vector{Float64}, user_operators::JuMP._Derivatives.UserOperatorRegistry)
    @ JuMP._Derivatives ~/.julia/packages/JuMP/Hc1qn/src/_Derivatives/forward.jl:181
 [15] _forward_eval_all(d::NLPEvaluator, x::Vector{Float64})
    @ JuMP ~/.julia/packages/JuMP/Hc1qn/src/nlp.jl:773
 [16] macro expansion
    @ ~/.julia/packages/JuMP/Hc1qn/src/nlp.jl:842 [inlined]
 [17] macro expansion
    @ ./timing.jl:287 [inlined]
 [18] eval_objective_gradient(d::NLPEvaluator, g::Vector{Float64}, x::Vector{Float64})
    @ JuMP ~/.julia/packages/JuMP/Hc1qn/src/nlp.jl:840
 [19] _eval_objective_gradient(model::Ipopt.Optimizer, grad::Vector{Float64}, x::Vector{Float64})
    @ Ipopt ~/.julia/packages/Ipopt/M2QE8/src/MOI_wrapper.jl:865
 [20] (::Ipopt.var"#eval_grad_f_cb#4"{Ipopt.Optimizer})(x::Vector{Float64}, grad_f::Vector{Float64})
    @ Ipopt ~/.julia/packages/Ipopt/M2QE8/src/MOI_wrapper.jl:1087
 [21] _Eval_Grad_F_CB(n::Int32, x_ptr::Ptr{Float64}, #unused#::Int32, grad_f::Ptr{Float64}, user_data::Ptr{Nothing})
    @ Ipopt ~/.julia/packages/Ipopt/M2QE8/src/C_wrapper.jl:49
 [22] IpoptSolve(prob::IpoptProblem)
    @ Ipopt ~/.julia/packages/Ipopt/M2QE8/src/C_wrapper.jl:433
 [23] optimize!(model::Ipopt.Optimizer)
    @ Ipopt ~/.julia/packages/Ipopt/M2QE8/src/MOI_wrapper.jl:1225
 [24] optimize!
    @ ~/.julia/packages/MathOptInterface/FHFUH/src/Bridges/bridge_optimizer.jl:348 [inlined]
 [25] optimize!
    @ ~/.julia/packages/MathOptInterface/FHFUH/src/MathOptInterface.jl:81 [inlined]
 [26] optimize!(m::MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}})
    @ MathOptInterface.Utilities ~/.julia/packages/MathOptInterface/FHFUH/src/Utilities/cachingoptimizer.jl:313
 [27] optimize!(model::Model; ignore_optimize_hook::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ JuMP ~/.julia/packages/JuMP/Hc1qn/src/optimizer_interface.jl:161
 [28] optimize!(model::Model)
    @ JuMP ~/.julia/packages/JuMP/Hc1qn/src/optimizer_interface.jl:143
 [29] top-level scope
    @ REPL[18]:1
ccoffrin commented 2 years ago

You are right, below is a more compact example.

As I understand it, JuMP tries to auto-register the function f with autodiff, but this function requires an explicit gradient, so the autodiff breaks. What the user needs to do is explicitly call JuMP.register(m, :f, 2, f, f_grad).

I see now that on the first invocation I get a warning:

┌ Warning: Function f automatically registered with 2 arguments.
...

In my previous workflow I missed this because I was developing the code interactively in the REPL and must have overlooked the first invocation. This warning is OK feedback for the user, but not great. If feasible, I think the problem communication could be improved by

  • If it is possible to do some check at time of automatic registration to see if autodiff surely will not work on this function, and error if that is the case.

  • When autodiff fails during optimize! generate a message like, "JuMP autodiff failed..."

The more compact example:

using JuMP
using Ipopt

function f(x...)
    mf = Model(Ipopt.Optimizer)

    @variable(mf, y[1:2])
    @NLobjective(mf, Max, y[1]*x[1] + y[2]*x[2] - x[1]*y[1]^4 - 2*x[2]*y[2]^4)
    @constraint(mf, (y[1]-10)^2 + (y[2]-10)^2 <= 25)
    optimize!(mf)

    return objective_value(mf)
end

function f_grad(g, x...)
    mf = Model(Ipopt.Optimizer)

    @variable(mf, y[1:2] >= 0)
    @NLobjective(mf, Max, y[1]*x[1] + y[2]*x[2] - x[1]*y[1]^4 - 2*x[2]*y[2]^4)
    @constraint(mf, (y[1]-10)^2 + (y[2]-10)^2 <= 25)
    optimize!(mf)

    g[1] = value(y[1]) - value(y[1])^4
    g[2] = value(y[2]) - 2*value(y[2])^4
end

m = Model(Ipopt.Optimizer)
@variable(m, x[1:2]>=0)
#JuMP.register(m, :f, 2, f, f_grad) # this works
@NLobjective(m, Min, f(x[1],x[2]))
optimize!(m)
odow commented 2 years ago

If feasible, I think the problem communication could be improved by

  • If it is possible to do some check at time of automatic registration to see if autodiff surely will not work on this function, and error if that is the case.

This is easy to do. It should catch most errors, but because we don't know the domain, there might be some false negatives: https://github.com/jump-dev/JuMP.jl/pull/2911
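
For readers following along, a rough sketch of the kind of registration-time check being discussed. This is not the actual implementation in #2911; the helper name _validate_autodiff and the error text are invented. The idea is to evaluate the gradient of the user function at an arbitrary test point with ForwardDiff and throw an informative error if that fails. Because the test point may not lie in the function's domain, a differentiable function could still fail here, which is the false-negative caveat mentioned above.

using ForwardDiff

# Hypothetical sketch only; names and messages are made up and the real
# check in #2911 may differ.
function _validate_autodiff(op::Symbol, dimension::Int, f::Function)
    x = zeros(dimension)  # arbitrary test point; the user's domain is unknown,
                          # so a differentiable function may still fail here.
    try
        ForwardDiff.gradient(y -> f(y...), x)
    catch err
        error(
            "Unable to automatically differentiate the registered function " *
            "`$op`. Common causes are functions that build a JuMP model " *
            "internally or otherwise do not use generic scalar arithmetic. " *
            "Register the function with an explicit gradient instead. " *
            "The underlying error was: $err",
        )
    end
    return
end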

  • When autodiff fails during optimize! generate a message like, "JuMP autodiff failed..."

This is a bit harder. Do we want to try-catch every invocation of ForwardDiff.derivative? I guess we'd need to profile to see if it was expensive.
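
To make the second option concrete, here is a purely illustrative sketch, assuming a wrapper along the lines of the gradient closure JuMP builds around a registered function; the name _make_gradient_callback and its signature are invented and do not reflect JuMP's internal API. It wraps the ForwardDiff call in try-catch and rethrows with a JuMP-specific message. Whether the added try-catch is cheap enough is exactly the profiling question above.

using ForwardDiff

# Illustrative only: not JuMP's internal API. The idea is to surface autodiff
# failures inside a user function as a JuMP-specific error instead of a raw
# MethodError from deep inside the NLP machinery.
function _make_gradient_callback(op::Symbol, f::Function)
    return (grad, x...) -> try
        ForwardDiff.gradient!(grad, y -> f(y...), collect(x))
    catch err
        error(
            "Automatic differentiation of the registered function `$op` " *
            "failed during optimize!. Register the function with an " *
            "explicit gradient instead. The underlying error was: $err",
        )
    end
end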

ccoffrin commented 2 years ago

Looks great. Thanks!