jbrea / BayesianOptimization.jl

Bayesian optimization for Julia

Restarting optimization #7

Closed: platawiec closed this issue 5 years ago

platawiec commented 5 years ago

If I want to restart the optimization, I might do something like the following (assume everything has been set up with the code in the README):

SETUP

using BayesianOptimization, GaussianProcesses, Distributions

# noisy objective to minimize
f(x) = sum((x .- 1).^2) + randn()

# GP surrogate over 2 input dimensions, growable up to 3000 observations
model = ElasticGPE(2,
                   mean = MeanConst(0.),
                   kernel = SEArd([0., 0.], 5.),
                   logNoise = 0.,
                   capacity = 3000)
set_priors!(model.mean, [Normal(1, 2)])

# MAP estimation of the GP hyperparameters every 50 function evaluations
modeloptimizer = MAPGPOptimizer(every = 50, noisebounds = [-4, 3],
                                kernbounds = [[-1, -1, 0], [4, 4, 10]],
                                maxeval = 40)
opt = BOpt(f,
           model,
           UpperConfidenceBound(),        # acquisition function
           modeloptimizer,
           [-5., -5.], [5., 5.],          # lower and upper bounds of the search box
           repetitions = 5,               # evaluate the noisy objective 5 times per input
           maxiterations = 100,
           sense = Min,
           verbosity = Progress)

result = boptimize!(opt)

RESTART OPTIMIZATION

opt = BOpt(f,
           model,
           UpperConfidenceBound(),
           modeloptimizer,                        
           [-5., -5.], [5., 5.],
           lhs_iterations = 0,   # skip the initial Latin hypercube sampling
           maxiterations = 10,
           sense = Min,
           verbosity = Progress)

result = boptimize!(opt)

This currently gives the following error:

MethodError: no method matching append!(::GPE{ElasticArrays.ElasticArray{Float64,2,1},ElasticArrays.ElasticArray{Float64,1,0},MeanConst,SEArd{Float64},ElasticPDMats.ElasticPDMat{Float64,Array{Float64,2}},GaussianProcesses.StationaryARDData{ElasticPDMats.AllElasticArray{Float64,3}}}, ::Array{Any,1}, ::Array{Float64,1})

If I set lhs_iterations = 1 and run for a few cycles, it seems that the optimization forgets the previous optimum:

opt = BOpt(f,
           model,
           UpperConfidenceBound(),
           modeloptimizer,                        
           [-5., -5.], [5., 5.],
           lhs_iterations = 1,   # a single initial Latin hypercube sample
           maxiterations = 10,
           sense = Min,
           verbosity = Progress)

result = boptimize!(opt)

which reports observed_optimum = -0.8571058809745942, compared to the pre-restart optimum of observed_optimum = -2.720197115023963, whose observed_optimizer is still in the model.

This leads to the following questions: is there currently a good way to restart optimizations (maybe I missed something in the source), and, if this is the best way to do so, can the above issues be fixed?

jbrea commented 5 years ago

This clearly needs better documentation; thanks for raising the issue.

To restart the optimization, one does not need to define a new optimization object, i.e.

result = boptimize!(opt) # first time (includes lhs initialization)
result = boptimize!(opt) # restart
result = boptimize!(opt) # restart again

will run Bayesian optimization three times, each time for maxiterations iterations (or until the maximum duration is reached). It should also be possible to set opt.iterations.N = 10 in between restarts.
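
For example, adjusting the budget between restarts would look roughly like this (a minimal sketch, assuming opt.iterations.N is the field mentioned above):

opt.iterations.N = 10       # change the per-call iteration budget
result = boptimize!(opt)    # restart with the new budget; model and history are kept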

platawiec commented 5 years ago

Makes sense, but that raises the question: what if we already have a model? If we build a new BOpt from the model, it seems that setting lhs_iterations = 0, which may be a reasonable thing to do, will not work. Furthermore, the optimization process doesn't take into account the points the model has already seen. If my reading of the progress output is correct, the same applies to the initial lhs sampling: those iterations are also not considered as possible minimizers.
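
For concreteness, the intended warm start would look roughly like the sketch below (the data are hypothetical, and it assumes append!(model, X, y) accepts a matrix of inputs, one column per observation, and a vector of targets, consistent with the method signature in the MethodError above):

# observations from a previous study (hypothetical data)
X = [-1.0  0.5  2.0;
     -0.5  1.5  0.0]                     # 2 x 3 matrix, one column per input
y = [f(X[:, i]) for i in 1:size(X, 2)]

append!(model, X, y)                     # assumed signature; seed the GP with known points

opt = BOpt(f, model, UpperConfidenceBound(), modeloptimizer,
           [-5., -5.], [5., 5.],
           lhs_iterations = 0,           # natural here, but currently does not work (see above)
           maxiterations = 10,
           sense = Min, verbosity = Progress)
result = boptimize!(opt)                 # ideally also considers the seeded points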

jbrea commented 5 years ago

Good point! I'm pretty busy at the moment, but I intend to have a look at this in June. Locally I also experimented with replacing Latin Hypercube Sampling with Sobol sequences, so I may address these issues together.
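
(Purely as an illustration of the idea, not of any planned interface: quasi-random initial points could be drawn with the Sobol.jl package roughly like this.)

using Sobol

# Sobol sequence over the search box [-5, 5]^2
seq = SobolSeq([-5., -5.], [5., 5.])
init_xs = [next!(seq) for _ in 1:5]      # 5 quasi-random initial points
init_ys = [f(x) for x in init_xs]        # evaluate the objective at each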

platawiec commented 5 years ago

That sounds good. I frequently need to perform optimization on models I already have some observations for, so I may have a go at this shortly.

Have you considered using https://github.com/MrUrq/LatinHypercubeSampling.jl? The sampling points there are optimized under a distance metric; it would be interesting to compare it against the current implementation.
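
For reference, drawing an optimized design with that package looks roughly like the following (a sketch based on its documented LHCoptim and scaleLHC functions, used here only for illustration):

using LatinHypercubeSampling

# optimize a 5-point plan in 2 dimensions over 1000 generations
plan, _ = LHCoptim(5, 2, 1000)

# scale the plan to the search box [-5, 5]^2; rows are candidate points
scaled = scaleLHC(plan, [(-5.0, 5.0), (-5.0, 5.0)])
init_xs = [scaled[i, :] for i in 1:size(scaled, 1)]
init_ys = [f(x) for x in init_xs]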