robertfeldt / BlackBoxOptim.jl

Black-box optimization for Julia

Example: regression via optimization #62

Closed finmod closed 7 years ago

finmod commented 7 years ago

This example produces an error in the syntax of bboptimize:

using BlackBoxOptim

ols_bestfit, ols_error = bboptimize((betas) -> ols_regression_objective(betas', x1, y1),(-10.0, 10.0); dimensions = 4, iterations = 2e4)

MethodError: no method matching bboptimize(::##7#8, ::Tuple{Float64,Float64}; dimensions=4, iterations=20000.0)
Closest candidates are:
  bboptimize(::Any) at C:\Users\Denis\.julia\v0.5\BlackBoxOptim\src\bboptimize.jl:73 got unsupported keyword arguments "dimensions", "iterations"
  bboptimize(::Any, ::Associative{Symbol,Any}; kwargs...) at C:\Users\Denis\.julia\v0.5\BlackBoxOptim\src\bboptimize.jl:73
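For reference, later versions of BlackBoxOptim pass the search space and budget as the keyword arguments SearchRange, NumDimensions and MaxSteps rather than dimensions/iterations. A minimal sketch (ols_objective and the toy data here are illustrative stand-ins, not the example's actual definitions):

```julia
using BlackBoxOptim

# Hypothetical stand-in for ols_regression_objective:
# sum of squared residuals of a linear model with coefficients betas.
ols_objective(betas, x, y) = sum(abs2, y .- x * betas)

x1 = randn(100, 4)               # toy design matrix
y1 = x1 * [1.0, -2.0, 3.0, 0.5]  # toy response

res = bboptimize(betas -> ols_objective(betas, x1, y1);
                 SearchRange = (-10.0, 10.0),
                 NumDimensions = 4,
                 MaxSteps = 20_000)

best_candidate(res)  # estimated coefficient vector
```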

More generally, would you have an example for the parameter estimation of a Lorenz system with BBO implementing DE?

finmod commented 7 years ago

This may be linked to bumping REQUIRE up to Julia v0.5. The betas' syntax is deprecated. Also the syntax for bboptimize is not the same as in the README tutorial. @robertfeldt

robertfeldt commented 7 years ago

Yes, I will take a look at this soon and fix; dev and maintenance has been a bit slow but will pick up. Thanks for your interest.

robertfeldt commented 7 years ago

Sorry this took longer than expected but the first part of this example is now fixed and runs fine on latest master.

finmod commented 7 years ago

Thanks. I was going to ask you how "soon" is soon in Sweden!

robertfeldt commented 7 years ago

Sorry about the delay, just had a lot of other projects the last few months. Back now and will work more on this package. Any ideas, please share.

I have now also fixed the second set of regression examples.

finmod commented 7 years ago

The "regression via optimization" example is important in many respects for the generality of the approach and the clarity of setting up the NL problem. It is now running fine and demonstrates elementary strengths of BlackBoxOptim. Let me suggest two directions of enhancement that are badly needed in Julia:

robertfeldt commented 7 years ago

If you can point to example code for any/some of these problems/classes it would be nice to investigate them. Is any of the code listed here https://en.wikipedia.org/wiki/Lorenz_system relevant for the Lorenz system you refer to?

robertfeldt commented 7 years ago

I implemented the non-linear regression example of a Michaelis-Menten model as used in the README for NLReg.jl: https://github.com/robertfeldt/BlackBoxOptim.jl/blob/master/examples/nonlinear_regression_michaelis_menten.jl

finmod commented 7 years ago

Yes, this is the one. The Python code here is close to the proper formulation in Julia. Here is a formulation in Julia https://github.com/pjpmarques/Julia-Modeling-the-World/blob/master/Lorenz%20Attractor.ipynb

I have formulated it myself as:

using DifferentialEquations, RecursiveArrayTools

g1 = @ode_def_nohes LorenzExample begin
  dx = σ*(y-x)
  dy = x*(ρ-z) - y
  dz = x*y - β*z
end σ=>10.0 ρ=>28.0 β=>(8/3)

u0 = [-11.8; -5.1; 37.5]
tspan = (0.0, 30.0)
prob = ODEProblem(g1, u0, tspan)

sol = solve(prob, Tsit5())
t = collect(linspace(0, 30, 500))
randomized = [(sol(t[i]) + .01randn(3)) for i in 1:length(t)]
data = vecvec_to_mat(randomized)

println(data[1:10, 1:3])

display(data[1:50, 1:3])

robertfeldt commented 7 years ago

Ok, if we can just define some relevant cost/objective function it would be interesting to try it. But if there are better/more specific ways to optimize ODEs, why use a black-box optimization method? ;)

The solution is to find the 3D point vectors so that their finite difference minimizes the given derivatives at the given times (in t)?

finmod commented 7 years ago

For now, the answer is that there is no way to estimate these three parameters by the usual algorithms i.e. the 40 or so algos of NLopt. But then NLopt does not implement the breadth of BlackBoxOptim algorithms. Then, there is this paper: https://www.hindawi.com/journals/ddns/2015/740721/ implementing some algorithms of BlackBoxOptim but it is not in Julia and I cannot assess how far from the known solutions these authors are starting from i.e. how global these AIDE algorithms are. Finally, there is the issue of NLS versus log-likelihood objective function to simplify the search for the solution.

robertfeldt commented 7 years ago

Ok, let me think about a reasonable way to try it and get back to you. It might not be best to search for all the params for all the points in one search since that can get maybe too high-dimensional so I foresee some kind of batching and then some constraints between the batched solutions. Let me check the papers.

robertfeldt commented 7 years ago

But wait, in the paper you link to they do not search for the points, only for three parameters. So this is a low-dimensional problem and shouldn't be a problem. Did I misunderstand something?

finmod commented 7 years ago

Correct. It is a problem of fitting a nonlinear differential equation model to a dataset and therefore finding the parameters providing the best fit. This is the primary ambition. But it is also true that there is uncertainty quantification in the solver used to produce the estimated data points, particularly for sensitive/chaotic IVP.

I hope this is a clarification to understanding the problem to be solved.

robertfeldt commented 7 years ago

Ok, sorry for my relative ignorance about ODEs; it was many years ago I worked with them.

I have not tuned this at all, and I'm sure it can be done much smarter, but here is an example that seems to solve the Lorenz attractor problem quite consistently with a ~4% subsample of the initial time steps from your example code:

https://github.com/robertfeldt/BlackBoxOptim.jl/blob/master/examples/ode_lorentz_attractor.jl
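The general shape of such an objective, sketched here with illustrative names and current DifferentialEquations syntax (not necessarily the code in the linked example file): simulate the system for a candidate parameter vector and return the sum of squared errors against the data.

```julia
using DifferentialEquations, BlackBoxOptim

# Lorenz system with parameters p = [σ, ρ, β].
function lorenz!(du, u, p, t)
    σ, ρ, β = p
    du[1] = σ * (u[2] - u[1])
    du[2] = u[1] * (ρ - u[3]) - u[2]
    du[3] = u[1] * u[2] - β * u[3]
end

# Generate noisy "observed" data from the true parameters.
true_p = [10.0, 28.0, 8 / 3]
u0     = [-11.8, -5.1, 37.5]
ts     = range(0.0, 30.0; length = 300)
sol    = solve(ODEProblem(lorenz!, u0, (0.0, 30.0), true_p), Tsit5(); saveat = ts)
data   = Array(sol) .+ 0.01 .* randn(3, length(ts))

# Least-squares fitness of a candidate parameter vector.
function fitness(p)
    prob = ODEProblem(lorenz!, u0, (0.0, 30.0), p)
    s = solve(prob, Tsit5(); saveat = ts)
    sum(abs2, Array(s) .- data)
end

res = bboptimize(fitness;
                 SearchRange = [(5.0, 15.0), (10.0, 40.0), (2.0, 3.0)],
                 MaxSteps = 8_000)
best_candidate(res)  # estimated [σ, ρ, β]
```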

robertfeldt commented 7 years ago

Actually, when I added another optimization run using the same settings as in the Xiang2015 paper you linked to, it solves this in less than 2 seconds (since they only use 300 time samples).

robertfeldt commented 7 years ago

And it seems to find the parameters even with looser bounds than the ones used in the Xiang2015 paper. I added that run as a third one in the example.

finmod commented 7 years ago

Outstanding! I just set up the ipynb and ran it. The parameter estimates are spot on right away. Let me look at all of this in detail. Loosening the bounds is key to proving the usefulness of BlackBoxOptim. Let me see how far it can go.

finmod commented 7 years ago

Robert, sorry I am French and my attention has been diverted by the election results during the last 24 hours.

The parameter estimates coming out of bboptimize are really strong and unequivocal. I have tested broadening the bounds, and the algorithm zeroes in on the known values equally well. For a first shot at the problem, this is outstanding, and I appreciate the beauty of Julia in setting up the problem almost as on paper.

However, the challenges are to relax a number of assumptions and to further improve computational efficiency now that the accuracy is confirmed. The first consideration is whether the lorentz_fitness function should be a generalized least-squares function or a negative log-likelihood function, as mentioned in the Seber and Wild reference above. The likelihood approach is a monotonic transformation that may save considerable computer time in the initial iterations/evaluations. There are also desirable statistical properties for studying the inverse problem, with links to the Fisher information matrix. The proof of the pudding is in measuring the computational efficiency of each class.

A second consideration would be to add noise to the data and see how well the algorithm estimates the parameters as the signal-to-noise ratio changes. I guess this is a simple modification of calc_state_vectors. The Xiang2015 paper does this after the exact estimation.

A third consideration is that bboptimize is for now stopped by MaxSteps. But it would be good to be able to stop when the fitness reaches single-precision zero, at least initially when one is attacking a new problem. I could not find the right documentation in the examples for setting a fitness-based stopping criterion.

A fourth consideration is relaxing the sample size for estimation. As I understand it, at present there are 100,000 steps, and therefore data points, and the first 4,000 are used for estimation. From the various demos, the usual dataset runs from time 0 to 30 with about 300 data points in between.

A fifth consideration is: which algorithm is used right now, and should others be tried from the available menu in BlackBoxOptim?
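On the first consideration: for i.i.d. Gaussian noise with known variance, the negative log-likelihood is just a monotone transform of the sum of squared residuals, so both objectives share the same minimizer. A one-line sketch with illustrative names:

```julia
# Negative log-likelihood of residual vector r under N(0, s2) noise.
# For fixed s2, minimizing nll over the model parameters is equivalent
# to minimizing sum(abs2, r), the least-squares cost.
nll(r, s2) = 0.5 * length(r) * log(2π * s2) + sum(abs2, r) / (2 * s2)
```

The potential savings would come from the likelihood's statistical interpretation (and the Fisher information it exposes), not from the optimization itself, since the search landscape is the same up to that transform.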

Finally, let me print my output for relaxed bounds: BroadBounds = Tuple{Float64, Float64}[(5, 15), (10, 40), (2, 3)]

Starting optimization with optimizer BlackBoxOptim.DiffEvoOpt{BlackBoxOptim.FitPopulation{Float64},BlackBoxOptim.RadiusLimitedSelector,BlackBoxOptim.AdaptiveDiffEvoRandBin{3},BlackBoxOptim.RandomBound{BlackBoxOptim.RangePerDimSearchSpace}}
0.00 secs, 0 evals, 0 steps
0.50 secs, 114 evals, 65 steps, improv/step: 0.415 (last = 0.4154), fitness=521.247284796
1.00 secs, 315 evals, 221 steps, improv/step: 0.394 (last = 0.3846), fitness=521.247284796
1.50 secs, 555 evals, 450 steps, improv/step: 0.347 (last = 0.3013), fitness=521.247284796
2.01 secs, 797 evals, 691 steps, improv/step: 0.337 (last = 0.3195), fitness=521.247284796
2.51 secs, 1037 evals, 931 steps, improv/step: 0.324 (last = 0.2875), fitness=57.338333111
3.01 secs, 1277 evals, 1171 steps, improv/step: 0.304 (last = 0.2250), fitness=57.338333111
3.51 secs, 1492 evals, 1386 steps, improv/step: 0.294 (last = 0.2419), fitness=50.760080162
4.01 secs, 1728 evals, 1622 steps, improv/step: 0.297 (last = 0.3136), fitness=9.925905074
4.51 secs, 1950 evals, 1844 steps, improv/step: 0.302 (last = 0.3378), fitness=8.139046515
5.01 secs, 2193 evals, 2087 steps, improv/step: 0.296 (last = 0.2469), fitness=2.421857092
5.52 secs, 2432 evals, 2326 steps, improv/step: 0.294 (last = 0.2845), fitness=0.358281671
6.02 secs, 2671 evals, 2565 steps, improv/step: 0.295 (last = 0.2971), fitness=0.122052622
6.52 secs, 2905 evals, 2799 steps, improv/step: 0.292 (last = 0.2607), fitness=0.122052622
7.02 secs, 3140 evals, 3035 steps, improv/step: 0.293 (last = 0.3051), fitness=0.107946897
7.52 secs, 3348 evals, 3243 steps, improv/step: 0.288 (last = 0.2163), fitness=0.023327968
8.02 secs, 3584 evals, 3479 steps, improv/step: 0.289 (last = 0.2966), fitness=0.003160878
8.53 secs, 3801 evals, 3696 steps, improv/step: 0.288 (last = 0.2811), fitness=0.000341386
9.03 secs, 4035 evals, 3930 steps, improv/step: 0.285 (last = 0.2350), fitness=0.000341386
9.53 secs, 4254 evals, 4149 steps, improv/step: 0.283 (last = 0.2466), fitness=0.000341386
10.03 secs, 4487 evals, 4383 steps, improv/step: 0.282 (last = 0.2650), fitness=0.000067280
10.53 secs, 4722 evals, 4618 steps, improv/step: 0.285 (last = 0.3404), fitness=0.000067162
11.03 secs, 4954 evals, 4850 steps, improv/step: 0.286 (last = 0.3060), fitness=0.000013032
11.53 secs, 5189 evals, 5085 steps, improv/step: 0.286 (last = 0.2766), fitness=0.000003930
12.04 secs, 5423 evals, 5319 steps, improv/step: 0.289 (last = 0.3547), fitness=0.000001388
12.54 secs, 5657 evals, 5553 steps, improv/step: 0.288 (last = 0.2778), fitness=0.000000303
13.04 secs, 5889 evals, 5785 steps, improv/step: 0.286 (last = 0.2284), fitness=0.000000258
13.54 secs, 6122 evals, 6018 steps, improv/step: 0.285 (last = 0.2661), fitness=0.000000050
14.04 secs, 6353 evals, 6249 steps, improv/step: 0.285 (last = 0.2814), fitness=0.000000017
14.54 secs, 6584 evals, 6480 steps, improv/step: 0.284 (last = 0.2641), fitness=0.000000015
15.04 secs, 6815 evals, 6711 steps, improv/step: 0.284 (last = 0.2771), fitness=0.000000004
15.54 secs, 7050 evals, 6946 steps, improv/step: 0.283 (last = 0.2596), fitness=0.000000004
16.05 secs, 7247 evals, 7144 steps, improv/step: 0.282 (last = 0.2576), fitness=0.000000001
16.55 secs, 7468 evals, 7365 steps, improv/step: 0.281 (last = 0.2308), fitness=0.000000000
17.05 secs, 7698 evals, 7595 steps, improv/step: 0.282 (last = 0.3043), fitness=0.000000000
17.55 secs, 7928 evals, 7825 steps, improv/step: 0.282 (last = 0.2870), fitness=0.000000000

Optimization stopped after 8001 steps and 17.944999933242798 seconds
Termination reason: Max number of steps (8000) reached
Steps per second = 445.86235886121614
Function evals per second = 451.60211926150424
Improvements/step = 0.2825
Total function evaluations = 8104

Best candidate found: [10.0,28.0,2.66667]

Fitness: 0.000000000

robertfeldt commented 7 years ago

These considerations are a bit too detailed for me currently when I'm not myself doing any ODE work. I'd be happy to help out if you have specific concerns but I was hoping that the code I provided could help you kickstart your own experimentation.

Going for only 300 samples is something I did try, if you check the latest update to the example file. The run producing res2 as a result is using only the 300 samples of the Xiang2015 paper and, of course, efficiency is much higher, solving the problem in less than 2 seconds.

Adding noise would be a matter of changing calc_state_vectors or lorentz_fitness, yes.

Stopping on different tolerance values can be done with parameters other than MaxSteps, but the options are not yet as complete as in other optimization frameworks; I will update this when doing a MathProgBase interface. So please stay tuned for this one.

You can change the algorithm with the Method option to bbsetup or bboptimize. I also tried dxnes and generating_set_search. The former is not a good choice here, and the latter solves the problem but less efficiently than the default DE choice. If I were you, I would stick with the default for now.
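Selecting an optimizer via the Method keyword looks like this (a sketch on a standard toy objective; the default, when Method is omitted, is the adaptive DE optimizer shown in the log above):

```julia
using BlackBoxOptim

# Classic 2-D Rosenbrock test function; minimum 0 at (1, 1).
rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

# Default adaptive differential evolution:
res_de = bboptimize(rosenbrock; SearchRange = (-5.0, 5.0), NumDimensions = 2)

# Alternative methods mentioned above, selected via Method:
res_gss = bboptimize(rosenbrock; SearchRange = (-5.0, 5.0), NumDimensions = 2,
                     Method = :generating_set_search)
res_dx  = bboptimize(rosenbrock; SearchRange = (-5.0, 5.0), NumDimensions = 2,
                     Method = :dxnes)

best_fitness(res_gss)  # compare against best_fitness(res_de)
```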

Thanks and hope this gives some further ideas.

finmod commented 7 years ago

You are correct that I can continue with the investigation of the estimator properties of the ODE problem. In any case, I am so grateful to you for making BlackBoxOptim clearer to use and sorting out the examples.

The latest version of the ode_lorenz_attractor example is perfect for now. I had not seen that it had been updated to answer most of the points I raised. Perhaps a delay in pushing the latest commit.

I am glad to read that you are planning to deal with the MathProgBase interface issue that was raised in December and I will just be happy to watch progress as it unfolds. Anyway, I take your word that the default DE is the best algorithm.

Still, at less than 1.88 seconds for the Xiang2015 short sample, this is excellent, as no NLopt algorithm has so far provided this accuracy on such a challenging test.

Thank you again for making BlackBoxOptim available.