STOR-i / GaussianProcesses.jl

A Julia package for Gaussian Processes
https://stor-i.github.io/GaussianProcesses.jl/latest/

add optional bounds #95

Closed jbrea closed 5 years ago

jbrea commented 5 years ago

This is a proposal to add optional bounds for the hyperparameter optimisation. If you like the idea, I will also adapt GPMC.
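Usage would look roughly like this (a sketch; the keyword names and the bounds format, lower bounds first and then upper bounds, may differ slightly from the final version):

using GaussianProcesses

x = 2π .* rand(20); y = sin.(x) .+ 0.1 .* randn(20)
gp = GP(x, y, MeanZero(), SE(0.0, 0.0))
# Constrain the two log kernel parameters to [-3, 3] and the log noise
# to [-5, 5]; leaving a keyword at `nothing` keeps that block unbounded.
optimize!(gp; kernbounds = [[-3.0, -3.0], [3.0, 3.0]],
              noisebounds = [-5.0, 5.0])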

chris-nemeth commented 5 years ago

Out of curiosity, is there anything like this in the Optim.jl package?

jbrea commented 5 years ago

Well, the code that I pushed runs (at least on my machine) :smile:

The reason why I started thinking about such constraints was that my implementation of Bayesian optimization was pretty unstable with unconstrained hyperparameter optimization. Adding the option for constraints, as in this PR, helped to stabilize it, but hyperparameter optimization was still pretty slow. This is why I switched to NLopt in the BayesianOptimization package, where I now have the following function:

function optimizemodel!(gp::GPBase;
                        domean = true, kern = true, noise = true, lik = true,
                        meanbounds = nothing, kernbounds = nothing,
                        noisebounds = nothing, method = :LD_LBFGS,
                        maxeval = 500)
    # Select which hyperparameter blocks take part in the optimization.
    params_kwargs = GP.get_params_kwargs(gp; domean=domean, kern=kern,
                                         noise=noise, lik=lik)
    # NLopt objective: write the candidate parameters into the GP, recompute
    # the target and its gradient, fill `g` in place, return the target.
    f = (x, g) -> begin
            GP.set_params!(gp, x; params_kwargs...)
            GP.update_target_and_dtarget!(gp; params_kwargs...)
            @. g = gp.dtarget
            gp.target
        end
    # Assemble lower/upper bounds for the active blocks (unbounded by default).
    lb, ub = GP.bounds(gp, noisebounds, meanbounds, kernbounds;
                       domean = domean, kern = kern, noise = noise)
    opt = NLopt.Opt(method, length(lb))
    NLopt.lower_bounds!(opt, lb)
    NLopt.upper_bounds!(opt, ub)
    NLopt.maxeval!(opt, maxeval)
    NLopt.max_objective!(opt, f)
    # Returns (objective value, maximizer, NLopt return code).
    fmax, xmax, ret = NLopt.optimize(opt, GP.get_params(gp; params_kwargs...))
end

This is far more efficient, and I can set the bounds as in this PR.
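For reference, a call with bounds looks roughly like this (the GP. prefix above assumes const GP = GaussianProcesses; the data and bounds here are made up):

using GaussianProcesses, NLopt
const GP = GaussianProcesses

gp = GPE(rand(2, 50), randn(50), MeanZero(), SE(0.0, 0.0), -2.0)
# `nothing` (the default) leaves a block unbounded; bounds are given as
# [lower, upper] vectors over the log parameters of that block.
fmax, xmax, ret = optimizemodel!(gp;
                                 kernbounds = [[-3.0, -3.0], [3.0, 3.0]],
                                 noisebounds = [-5.0, 5.0], maxeval = 200)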

Feel free to close this PR if you prefer to keep the hyperparameter optimization as it was. I am happy with the solution I found for the BayesianOptimization package.

maximerischard commented 5 years ago

I do feel like this is something that should be handled by the optimization package rather than hacked into our package. I've also used NLopt for some of my applications, and I agree it sometimes gives better results (either faster convergence or a better maximum), although sometimes it also fails or crashes with an obscure error code. We've discussed providing an interface to NLopt before, but decided that it would be beyond our purview and would add more complexity to the package.

By the way, you should be able to use the get_optim_target function to obtain the target function to pass to NLopt. Let us know if you think that interface could be improved.
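For reference, NLopt expects a single callback (x, grad) that fills grad in place only when grad is non-empty, so whatever get_optim_target returns needs a small adapter along these lines (a sketch; f and g! are placeholders for the objective and in-place gradient, however they are actually exposed):

# Sketch: bridge an objective `f(x)` and in-place gradient `g!(grad, x)`
# to NLopt's callback convention; `f` and `g!` are placeholders.
nlopt_callback(f, g!) = (x, grad) -> begin
    if length(grad) > 0      # NLopt passes an empty grad vector for
        g!(grad, x)          # derivative-free algorithms
    end
    f(x)
end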

chris-nemeth commented 5 years ago

I agree with @maximerischard that it would be tidier if this functionality were part of the Optim package, as the GaussianProcesses package really just passes the optimization to Optim.jl.

That being said, if box-constrained optimization is useful and not available through Optim.jl, then it might be worth implementing @jbrea's suggestion. @fairbrot - do you have any thoughts on this?

jbrea commented 5 years ago

Box-constrained optimization is available through Optim.jl; sorry for not being clear about this in my last post. I use it here.
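Concretely, something along these lines (a sketch with a placeholder objective and made-up bounds; Fminbox wraps an unconstrained inner optimizer):

using Optim

# `nll` and `nll_grad!` stand in for the negative log marginal likelihood
# and its in-place gradient; lb/ub are box constraints on the log parameters.
nll(x) = sum(abs2, x)                 # placeholder objective
nll_grad!(g, x) = (g .= 2 .* x; g)    # placeholder gradient
lb = [-3.0, -3.0, -5.0]
ub = [3.0, 3.0, 5.0]
res = optimize(nll, nll_grad!, lb, ub, zeros(3), Fminbox(LBFGS()))
Optim.minimizer(res)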

Btw, see #94 for why I thought box constraints could be useful.

Since I constrain the kernel parameters, I didn't see any need for the exception handling as you do it in get_optim_target, @maximerischard. In my experience, the slowdown due to this exception handling is small, however, compared to the cost of using Optim instead of NLopt. But I understand if you don't want to depend on NLopt, and it's really no big deal to reimplement the hyperparameter optimization in BayesianOptimization, where I depend on NLopt for the acquisition anyway.

chris-nemeth commented 5 years ago

I've had a closer look at this PR, and it looks like it would be quite easy to integrate and doesn't change much of the existing code. Seeing as the box-constrained optimization is handled through Optim.jl, I don't think there are any issues with including this functionality in the package. However, it might be a good idea to create a notebook on optimization to highlight the different options available, including box constraints.
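For the notebook, even just contrasting the two call styles would be useful (a sketch; keyword names as in this PR):

optimize!(gp)                                            # unconstrained, as before
optimize!(gp; kernbounds = [[-2.0, -2.0], [2.0, 2.0]])   # box-constrained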

fairbrot commented 5 years ago

I'm also happy for this to be merged (once checks are passing). We should probably illustrate this new functionality in one of the Jupyter notebooks.

jbrea commented 5 years ago

Thanks for the feedback. I added a short comment in Regression.ipynb, but since I hardly ever use notebooks, I am not very efficient at editing them.

The bounds can now also be applied to GPs of type GPMC. From my side, this PR is ready to be merged.