SciML / Optimization.jl

Mathematical Optimization in Julia. Local, global, gradient-based and derivative-free. Linear, Quadratic, Convex, Mixed-Integer, and Nonlinear Optimization in one simple, fast, and differentiable interface.
https://docs.sciml.ai/Optimization/stable/
MIT License

How to access Hessian at solution under constraints? #654

Closed by fintzij 4 months ago

fintzij commented 6 months ago

Hello,

I am solving a constrained optimization problem and I am not sure how to go about accessing the Hessian at the solution. MWE below that illustrates the issue.

Thank you!! (and apologies if this was obvious or covered somewhere but I didn't see it)

-- Jon

using DiffResults, ForwardDiff, Optimization, OptimizationOptimJL

### Pilfered from the constraint example in the documentation
# objective function
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
x0 = zeros(2)
_p = [1.0, 1.0]

# set equality constraint x[1]=x[2]
cons(res, x, p) = (res .= [x[1] - x[2]])
lc = [0.0,]
uc = [0.0,]

# set up problem
optprob = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff(), cons = cons)
prob = OptimizationProblem(optprob, x0, _p, lcons = lc, ucons = uc)

# solve
sol = solve(prob, IPNewton())

# Compute Hessian in the usual way
diffres = DiffResults.HessianResult(sol.u)
obj = x -> rosenbrock(x, _p)
diffres = ForwardDiff.hessian!(diffres, obj, sol.u)

# but this isn't right
DiffResults.hessian(diffres) # should not be dense

Vaibhavdixit02 commented 6 months ago

I get

julia> DiffResults.hessian(diffres)
2×2 Matrix{Float64}:
 10.0  -4.0
 -4.0   2.0

which is correct, so I am not sure what issue you are referring to?
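
For reference, that matrix is the Hessian of the unconstrained objective at the solution (1, 1); a minimal symbolic check of the entries (a sketch, assuming Symbolics.jl is available):

using Symbolics

# full Hessian of the unconstrained objective with p = (1, 1)
@variables x1 x2
f = (1 - x1)^2 + (x2 - x1^2)^2
H = Symbolics.hessian(f, [x1, x2])

# evaluate at the solution x1 = x2 = 1
Symbolics.substitute.(H, Ref(Dict(x1 => 1.0, x2 => 1.0)))
# gives [10.0 -4.0; -4.0 2.0], matching the ForwardDiff result above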

fintzij commented 6 months ago

I think there is a discrepancy between what I expected and how the software works. The equality constraint in the previous example effectively means there is only one free parameter. Hence, I expected the same sparsity pattern we would get if we substituted x1 for x2 and evaluated the second derivatives analytically, i.e.,

# like this
using Symbolics
@variables x1 x2
s = (1 - x1)^2 + (x1 - x1^2)^2
Symbolics.jacobian(Symbolics.gradient(s, [x1, x2]), [x1, x2]) # only nonzero in the 1,1 element

If we were to make the example even more pathological, we could constrain x1 = 0 and x2 = 0, exactly. Then, there are no parameters and all derivatives are zero since x1 and x2 are constant.

I think this highlights the discrepancy between what I was (stupidly) expecting and the practical considerations for constraints. Unless the objective function is declared symbolically, e.g., via ModelingToolkit, it is probably not possible for the software to know that the constraint has reduced the dimension of the parameter space. So it makes sense to return the full Hessian, as was done, and handle the constraints some other way.
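
For completeness, one standard way to handle the constraints "some other way" is to project the Hessian onto the null space of the constraint Jacobian at the solution (the reduced Hessian). A minimal sketch continuing the MWE above (an illustration, not an Optimization.jl API; for nonlinear constraints you would project the Hessian of the Lagrangian instead, but here the constraint is linear so the two coincide):

using ForwardDiff, LinearAlgebra

# dense Hessian of the objective at the solution, as before
H = ForwardDiff.hessian(x -> rosenbrock(x, _p), sol.u)

# Jacobian of the constraint c(x) = x[1] - x[2] at the solution
conval(x) = [x[1] - x[2]]
J = ForwardDiff.jacobian(conval, sol.u)   # 1×2 matrix [1.0 -1.0]

# orthonormal basis for the null space of J, i.e. the feasible directions
Z = nullspace(J)                          # 2×1 matrix, proportional to [1, 1]

# reduced Hessian: curvature of the objective along the constraint
Z' * H * Z                                # 1×1 matrix, ≈ [2.0]

# with the unnormalized tangent [1, 1] this is 4.0, matching the second
# derivative of the substituted objective (1 - t)^2 + (t - t^2)^2 at t = 1
v = [1.0, 1.0]
v' * H * v                                # ≈ 4.0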

One followup question: where can I look to understand how Optimization.jl is generating and using the constraints internally?

Vaibhavdixit02 commented 6 months ago

Unless the objective function is declared symbolically, e.g., via ModelingToolkit, it is probably not possible for the software to know that the constraint has reduced the dimension of the parameter space.

Right. If you do that, then structural_simplify should perform this substitution a priori, and it would work the way you expected.
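
To illustrate what such an a priori substitution does (a manual sketch with Symbolics, not what structural_simplify does internally):

using Symbolics

@variables x1 x2
obj = (1 - x1)^2 + (x2 - x1^2)^2

# eliminate x2 using the equality constraint x1 - x2 = 0, i.e. x2 => x1
reduced = Symbolics.substitute(obj, Dict(x2 => x1))

# Hessian of the reduced, one-variable objective, evaluated at x1 = 1
Hred = Symbolics.hessian(reduced, [x1])
Symbolics.substitute.(Hred, Ref(Dict(x1 => 1.0)))   # 1×1, ≈ [4.0]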

One followup question: where can I look to understand how Optimization.jl is generating and using the constraints internally?

There is no such representation at the Optimization.jl level; it depends on the optimizer. Most commonly, solvers use Lagrange multipliers (https://en.wikipedia.org/wiki/Lagrange_multiplier), and we create the oracles (Julia functions) for these quantities. For the exact format of how this is done, you'd be better off looking at the optimization solver you are using. It also depends on what you are trying to achieve here, without which it is a little harder to help you.
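
For concreteness, the quantities a solver typically builds from the Lagrangian L(x, λ) = f(x) + λ' c(x) look like this (a sketch continuing the MWE above; λ is set by hand for illustration rather than read out of the solver, and since the constraint is linear the Lagrangian Hessian reduces to the objective Hessian here):

using ForwardDiff, LinearAlgebra

conval(x) = [x[1] - x[2]]                      # constraint residual c(x)
lagrangian(x, λ) = rosenbrock(x, _p) + dot(λ, conval(x))

λ = [0.0]                                      # illustrative multiplier value

Jc = ForwardDiff.jacobian(conval, sol.u)       # constraint Jacobian oracle
HL = ForwardDiff.hessian(x -> lagrangian(x, λ), sol.u)   # Lagrangian Hessian oracle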

fintzij commented 6 months ago

Thanks!

fintzij commented 6 months ago

@Vaibhavdixit02, is there a way to extract the oracle Julia functions that Optimization.jl generates to encode the constraints, for downstream use? Either the runtime-generated functions or the underlying code would work.

Thanks, and apologies for reopening a closed issue.

Vaibhavdixit02 commented 4 months ago

oracle Julia functions that are generated by Optimization.jl to encode the constraints for downstream use?

Yes, they are stored in the respective fields of the OptimizationFunction generated after calling instantiate_function (an internal function). You can use that function to create these yourself (take a look at the tests for how that's done), and if you are implementing a solver using the interface, you have to call that function before starting the solver; then the oracles will be populated for your use. Hope this helps!
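
For example, something along these lines (a sketch based on how the internal helper is used in the package tests; the exact signature and fields of instantiate_function may differ between versions, so check the tests):

# build the oracles for the MWE above: (function, initial point, AD backend,
# parameters, number of constraints); this call pattern is an assumption
# taken from the tests, not a documented public API
optf = Optimization.instantiate_function(optprob, x0, Optimization.AutoForwardDiff(), _p, 1)

# constraint oracles (in-place): value, Jacobian, and vector of Hessians
res = zeros(1);      optf.cons(res, sol.u)
J   = zeros(1, 2);   optf.cons_j(J, sol.u)
Hc  = [zeros(2, 2)]; optf.cons_h(Hc, sol.u)

# objective oracles (in-place): gradient and Hessian
G = zeros(2);        optf.grad(G, sol.u)
H = zeros(2, 2);     optf.hess(H, sol.u)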

fintzij commented 4 months ago

Excellent, thank you!