SciML / Optimization.jl

Mathematical Optimization in Julia. Local, global, gradient-based and derivative-free. Linear, Quadratic, Convex, Mixed-Integer, and Nonlinear Optimization in one simple, fast, and differentiable interface.
https://docs.sciml.ai/Optimization/stable/
MIT License

Clarify Callback function arguments #613

Closed jrwrigh closed 5 months ago

jrwrigh commented 10 months ago

In the documentation for the solver callback function, it simply states:

The callback function callback is a function which is called after every optimizer step. Its signature is:

callback = (params, loss_val, other_args) -> false

where params and loss_val are the current parameters and loss/objective value in the optimization loop...
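
Concretely, I read that as the pattern below (the Rosenbrock objective here is just a hypothetical stand-in for illustration, not my actual problem):

using Optimization
using OptimizationOptimJL

# Toy objective, purely to illustrate the documented callback signature
rosenbrock(u, p) = (p[1] - u[1])^2 + p[2] * (u[2] - u[1]^2)^2

f = OptimizationFunction(rosenbrock)
prob = OptimizationProblem(f, [0.0, 0.0], [1.0, 100.0])

# Per the quoted docs, `params` should be the current iterate and
# `loss_val` the objective value at that iterate
callback = (params, loss_val) -> begin
    println("loss_val = ", loss_val, " at params = ", params)
    return false  # returning true would halt the optimization
end

sol = solve(prob, NelderMead(), callback = callback)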

This would lead me to believe that loss_val is the objective function value given the params. But in my testing, this is not true. Are the params the parameters for the lowest-found objective value? If so, what is loss_val then?

ChrisRackauckas commented 10 months ago

This would lead me to believe that loss_val is the objective function value given the params. But in my testing, this is not true.

MWE?

jrwrigh commented 10 months ago

MWE:

#!/usr/bin/env julia
using Optimization
using OptimizationOptimJL
using OptimizationBBO

function Objective(θ)
    println(θ)
    θ[1]*π + θ[2]*π^2
end

function callback(θ,l)
    println("loss = ", l, "; Objective = ", Objective(θ), "; θ = ", θ)
    false
end

f = OptimizationFunction((x,p) -> Objective(x))
prob = OptimizationProblem(f, [1.,1.], lb = [0, 0], ub = [5,5])

sol = solve(prob, SAMIN(), maxiters = 100, callback=callback)

Output:

[1.0, 1.0]
[1.0, 1.0]
loss = 13.011197054679151; Objective = 13.011197054679151; θ = [1.0, 1.0]
[2.2784462189095045, 1.0]
[1.0, 1.0]
loss = 17.0275543040149; Objective = 13.011197054679151; θ = [1.0, 1.0]
[1.0, 3.751740746152813]
[1.0, 1.0]
loss = 40.16978963356587; Objective = 13.011197054679151; θ = [1.0, 1.0]
[2.1906161983023527, 1.0]
[1.0, 1.0]
loss = 16.75162815651083; Objective = 13.011197054679151; θ = [1.0, 1.0]
[1.0, 4.579716948495393]
[1.0, 1.0]
loss = 48.34158720420345; Objective = 13.011197054679151; θ = [1.0, 1.0]
[1.4085051716044383, 1.0]
[1.0, 1.0]
loss = 14.294553900745093; Objective = 13.011197054679151; θ = [1.0, 1.0]
[1.0, 4.439272547677455]
[1.0, 1.0]
loss = 46.955456527782374; Objective = 13.011197054679151; θ = [1.0, 1.0]

Replacing SAMIN with BBO_separable_nes gives:

[1.2651700742309395, 1.2463652825430172]
[1.5008910821665165, 2.4346063998687617]
[2.516042583093009, 0.5169370005959889]
[0.7157374976061156, 1.2805306570014245]
[2.554276559242327, 1.754265674067271]
[1.0445884181486234, 0.7457637540986819]
[0.4728388355825608, 1.0881077840912106]
[1.0445884181486234, 0.7457637540986819]
loss = 12.224660386924326; Objective = 10.642064530105971; θ = [1.0445884181486234, 0.7457637540986819]
[0.2389702931322738, 1.3161164470857385]
[1.9943127215630503, 0.1634874215712977]
[0.5507152315097662, 0.02462091914764314]
[0.18497224151186725, 0.8589255834650602]
[0.019544282046384476, 0.2986333892270183]
[1.9386017622158398, 1.3100524382191914]
[0.5507152315097662, 0.02462091914764314]
loss = 19.019996364319486; Objective = 1.9731216575095276; θ = [0.5507152315097662, 0.02462091914764314]
[0.30902757157271116, 0.1862968944410031]
[0.03692333888759047, 2.2424448596829447]
[0.7942026336811401, 0.2592297736797502]
[1.0491501766340732, 0.2508560884087751]
[0.9837239048053563, 0.17551492796809098]
[0.9222591707626571, 0.4219845506232969]
[0.5507152315097662, 0.02462091914764314]
loss = 7.062183213597184; Objective = 1.9731216575095276; θ = [0.5507152315097662, 0.02462091914764314]

So the $\theta$ reported by the callback is not reflected in the loss value, and the loss value (maybe?) corresponds to one of the objective function calls made between callback invocations.

ChrisRackauckas commented 10 months ago

This looks like it may have something to do with Optim's implementation. SAMIN uses a population of points, so it may be giving the parameters of the minimum or some random one from the population. But since it's just what Optim gives, it's more of an issue over there.

Vaibhavdixit02 commented 10 months ago

It becomes clearer what's going on if you remove the printing inside the Objective function, @jrwrigh:

loss = 13.011197054679151; Objective = 13.011197054679151; θ = [1.0, 1.0]
loss = 14.863244115439318; Objective = 13.011197054679151; θ = [1.0, 1.0]
loss = 5.06096236425929; Objective = 5.06096236425929; θ = [1.0, 0.19447281093228486]
loss = 12.08223419234228; Objective = 5.06096236425929; θ = [1.0, 0.19447281093228486]
loss = 12.280986917861902; Objective = 5.06096236425929; θ = [1.0, 0.19447281093228486]
loss = 12.02694966323369; Objective = 5.06096236425929; θ = [1.0, 0.19447281093228486]
loss = 49.87724644467404; Objective = 5.06096236425929; θ = [1.0, 0.19447281093228486]
loss = 11.318289145719827; Objective = 5.06096236425929; θ = [1.0, 0.19447281093228486]
loss = 3.8371618663221203; Objective = 3.8371618663221203; θ = [1.0, 0.07047589593920844]
loss = 5.18637113178798; Objective = 3.8371618663221203; θ = [1.0, 0.07047589593920844]
loss = 14.732543607314721; Objective = 3.8371618663221203; θ = [1.0, 0.07047589593920844]

So you see that the loss changes more frequently than Objective(θ) and is always >= Objective(θ): the θ passed to the callback is the best point found so far, so Objective(θ) is the minimum loss so far, while the loss value comes from the most recent candidate evaluation. If you want to get into the weeds, these are the parts to look at in Optim: https://github.com/JuliaNLSolvers/Optim.jl/blob/master/src/multivariate/solvers/constrained/samin.jl#L99 and https://github.com/JuliaNLSolvers/Optim.jl/blob/master/src/multivariate/solvers/constrained/samin.jl#L163

Something similar is happening in BBO as well; it helps to remember that these methods typically maintain the best minimum found so far while continuing to explore the search space. I could dig up the relevant parts of the code in BBO if you need. Hope this helps.
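
If it's useful, here is a rough, untested sketch (reusing the Objective and prob from the MWE above, with the println inside Objective removed) of a callback that logs both the current trial loss and the best value found so far; the extra objective evaluation per callback is just for bookkeeping:

# Sketch: `l` is the loss of the most recent candidate the solver evaluated,
# while `θ` is the best point found so far, so log both explicitly.
best_so_far = Ref(Inf)
function tracking_callback(θ, l)
    obj_at_θ = Objective(θ)       # objective at the θ passed to the callback
    best_so_far[] = min(best_so_far[], obj_at_θ)
    println("trial loss = ", l,
            "; loss at θ = ", obj_at_θ,
            "; best so far = ", best_so_far[])
    return false                  # never halt early
end

sol = solve(prob, SAMIN(), maxiters = 100, callback = tracking_callback)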

Vaibhavdixit02 commented 5 months ago

The arguments are clarified in the solve docs, but I will move them into a separate sidebar tab, which should save a lot of trouble in general. Closing this though, since it's fixed.