Is it not differentiable or do you just not have the gradient at hand?
Edit: Note that, behind the scenes, constrained optimization (not just box constraints) is being worked on at the moment, though we're a bit constrained developer-time-wise.
I don't have an explicit gradient, but it is differentiable.
Alright, then it shouldn't be a problem. Let me find an example.
julia> using Optim
julia> function exponential(x)
return exp((2.0 - x[1])^2) + exp((3.0 - x[2])^2)
end
exponential (generic function with 1 method)
julia> function exponential_gradient!(storage, x)
storage[1] = -2.0 * (2.0 - x[1]) * exp((2.0 - x[1])^2)
storage[2] = -2.0 * (3.0 - x[2]) * exp((3.0 - x[2])^2)
end
exponential_gradient! (generic function with 1 method)
julia> initial_x = [0.0, 0.0]
2-element Array{Float64,1}:
0.0
0.0
julia> optimize(exponential, exponential_gradient!, initial_x, BFGS())
Results of Optimization Algorithm
* Algorithm: BFGS
* Starting Point: [0.0,0.0]
* Minimizer: [1.9999999999999587,2.999999999999897]
* Minimum: 2.000000e+00
* Iterations: 12
* Convergence: true
* |x - x'| ≤ 1.0e-32: false
|x - x'| = 1.04e-05
* |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
|f(x) - f(x')| = 6.19e-11 |f(x)|
* |g(x)| ≤ 1.0e-08: true
|g(x)| = 2.06e-13
* Stopped by an increasing objective: false
* Reached Maximum Number of Iterations: false
* Objective Calls: 49
* Gradient Calls: 49
julia> lb = fill(-0.1, 2)
2-element Array{Float64,1}:
-0.1
-0.1
julia> ub = fill(1.1, 2)
2-element Array{Float64,1}:
1.1
1.1
julia> od = OnceDifferentiable(exponential, initial_x)
NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1},Val{false}}(exponential, Optim.g!, Optim.fg!, 0.0, [6.93938e-310, 6.9394e-310], [NaN, NaN], [NaN, NaN], [0], [0])
julia> optimize(od, initial_x, lb, ub, Fminbox())
Results of Optimization Algorithm
* Algorithm: Fminbox with Conjugate Gradient
* Starting Point: [0.0,0.0]
* Minimizer: [1.099999999999446,1.099999999999984]
* Minimum: 3.921396e+01
* Iterations: 5
* Convergence: true
* |x - x'| ≤ 1.0e-32: false
|x - x'| = 5.53e-10
* |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: true
|f(x) - f(x')| = 1.13e-10 |f(x)|
* |g(x)| ≤ 1.0e-08: false
|g(x)| = 1.40e+02
* Stopped by an increasing objective: false
* Reached Maximum Number of Iterations: false
* Objective Calls: 283
* Gradient Calls: 213
This uses finite differences; if you want to use ForwardDiff instead:
julia> od = OnceDifferentiable(exponential, initial_x; autodiff = :forward)
NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1},Val{false}}(exponential, Optim.#3, Optim.#4, 0.0, [8.32155e-317, 0.0], [NaN, NaN], [NaN, NaN], [0], [0])
julia> optimize(od, initial_x, lb, ub, Fminbox())
Results of Optimization Algorithm
* Algorithm: Fminbox with Conjugate Gradient
* Starting Point: [0.0,0.0]
* Minimizer: [1.0999999999994408,1.0999999999999839]
* Minimum: 3.921396e+01
* Iterations: 5
* Convergence: true
* |x - x'| ≤ 1.0e-32: false
|x - x'| = 5.53e-10
* |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: true
|f(x) - f(x')| = 1.13e-10 |f(x)|
* |g(x)| ≤ 1.0e-08: false
|g(x)| = 1.40e+02
* Stopped by an increasing objective: false
* Reached Maximum Number of Iterations: false
* Objective Calls: 279
* Gradient Calls: 208
However, let's turn this into a bug report, as it is NOT intended that you cannot just pass the function directly. The following should just have worked, but there is a construction of the OnceDifferentiable that apparently has not been updated to the new format:
julia> optimize(exponential, initial_x, lb, ub, Fminbox())
ERROR: Optimizing an objective `obj` without providing an initial `x` has been deprecated without backwards compatability. Please explicitly provide an `x`: `optimize(obj, x, l, u, method, options)`
Stacktrace:
[1] #optimize#110(::Array{Any,1}, ::Function, ::Function, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Optim.Fminbox{Optim.ConjugateGradient}) at /home/pkm/.julia/v0.6/Optim/src/multivariate/solvers/constrained/fminbox.jl:106
[2] optimize(::Function, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Optim.Fminbox{Optim.ConjugateGradient}) at /home/pkm/.julia/v0.6/Optim/src/multivariate/solvers/constrained/fminbox.jl:106
Thanks a lot! Indeed, I assumed that OnceDifferentiable required the gradient in order to be constructed.
Yeah, the docs are in need of being rewritten.
Also, for non-differentiable functions, check out the brand-new https://github.com/timholy/QuadDIRECT.jl. It started life as https://github.com/timholy/MultilevelCoordinateSearch.jl, but I found MCS tough to implement, and in any event, as I progressed, I began to wonder whether one might be able to do even better. I don't yet know whether QuadDIRECT is better, though; it still needs lots of benchmarks.
Well, non-differentiable global optimizers are exactly what's needed for diffeq parameter estimation. @finmod would you like to give these a try in the Lorenz benchmark?
@ChrisRackauckas Is non-differentiable the base case you would consider because of the integration of jumps in the diffeq models? In the case of ODEs, isn't a loss function for parameter estimation differentiable?
> is non-differentiable the base case you would consider because of the integration of jumps in the diffeq models?
No. For those kinds of models you wouldn't optimize on a single solution, but on a Monte Carlo set of solutions, optimizing things like averages and distributions. For well-behaved jump equations those properties will be well-behaved as well.
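To illustrate the idea (this sketch is not from the thread): the objective would be a statistic over a Monte Carlo set of solutions rather than a single trajectory. Everything here is a hypothetical stand-in, including the scalar SDE model (in place of a jump model), the `target_mean` value, the parameter values, and the solver choice; it assumes DifferentialEquations.jl, Optim, and the stochastic solvers are installed.

```julia
using DifferentialEquations, Optim, Statistics

# Hypothetical scalar model with multiplicative noise, standing in for a jump/stochastic model.
drift(u, p, t) = p[1] * u
diffusion(u, p, t) = p[2] * u
prob = SDEProblem(drift, diffusion, 0.5, (0.0, 1.0), [1.0, 0.2])

target_mean = 1.3   # hypothetical observed average terminal value

# The loss is built on an ensemble statistic (mean terminal value), not on one solution.
function ensemble_loss(p)
    eprob = EnsembleProblem(remake(prob, p = p))
    sols = solve(eprob, SRIW1(); trajectories = 200, saveat = [1.0])
    abs2(mean(sol[end][1] for sol in sols) - target_mean)
end

# Derivative-free optimization; in practice one would fix the noise realizations
# (common random numbers) so the objective is deterministic across evaluations.
res = optimize(ensemble_loss, [0.8, 0.3], NelderMead())
```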
> In the case of ODEs, isn't a loss function for parameter estimation differentiable?
Yes, it's usually differentiable, but the issue is that those derivatives require derivatives of the solution of the ODE with respect to the parameters, which are expensive to compute. The ways to do it are to solve the ODE N times for numerical differentiation/autodiff, or to use adjoint sensitivity analysis:
http://docs.juliadiffeq.org/latest/analysis/sensitivity.html#Adjoint-Sensitivity-Analysis-1
Global optimizers that utilize derivatives don't usually get all that much help from them, since the derivatives are very local information, and since the derivatives are expensive, it can be a better idea to just use a derivative-free method.
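To make that tradeoff concrete, here is a minimal sketch (not from the thread) of derivative-free parameter estimation on the Lorenz system mentioned above. It assumes DifferentialEquations.jl and Optim are installed; the observation grid, synthetic data, and starting guess are made up for illustration.

```julia
using DifferentialEquations, Optim

# Lorenz system; p = (σ, ρ, β) are the parameters to estimate.
function lorenz!(du, u, p, t)
    σ, ρ, β = p
    du[1] = σ * (u[2] - u[1])
    du[2] = u[1] * (ρ - u[3]) - u[2]
    du[3] = u[1] * u[2] - β * u[3]
end

u0 = [1.0, 0.0, 0.0]
tspan = (0.0, 2.0)
p_true = [10.0, 28.0, 8 / 3]
prob = ODEProblem(lorenz!, u0, tspan, p_true)

t_obs = 0.0:0.05:2.0
data = Array(solve(prob, Tsit5(), saveat = t_obs))   # synthetic "measurements"

# L2 loss: re-solve the ODE at candidate parameters and compare against the data.
# No derivative of the solution with respect to p is ever needed.
function loss(p)
    sol = solve(remake(prob, p = p), Tsit5(), saveat = t_obs)
    sol.retcode == :Success || return Inf   # guard against diverged solves
    sum(abs2, Array(sol) .- data)
end

# Derivative-free local optimizer, so the expensive sensitivities are avoided entirely.
res = optimize(loss, [8.0, 25.0, 2.0], NelderMead())
```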
From the docs, the two cases for box minimization are either:
* an OnceDifferentiable wrapper, or
* a provided gradient.
Maybe I just missed the feature, but it would be awesome to have the possibility of box optimization on multi-dimensional functions without an explicit gradient. Thanks!