EmuKit / emukit

A Python-based toolbox of various methods in decision making, uncertainty quantification and statistical emulation: multi-fidelity, experimental design, Bayesian optimisation, Bayesian quadrature, etc.
https://emukit.github.io/emukit/
Apache License 2.0

Compatibility issue between CONTEXT variable and parameter space CONSTRAINT #391

Open AlbanMor opened 3 years ago

AlbanMor commented 3 years ago

Hi there!

Python 3.8.8 Emukit 0.4.9

I've come across an issue when calling get_next_points with a context variable on a constrained parameter space. My understanding is that using context variables causes an array of length n-x (for x context variables) to be passed to the constraint function. The function expects an array of length n, so it fails.

Below is an example that reproduces the issue (modified from the "Emukit - Bayesian Optimization with Non-Linear Constraints" tutorial):

```python
FIG_SIZE = (12, 8)

from emukit.test_functions import branin_function
fcn, space = branin_function()

import numpy as np
constraint_radius = 4
constraint_fcn = lambda x: 10 * (-(x[0] - 3)**2 - (x[1] - 7)**2 + constraint_radius ** 2)

optimum = np.array([[-np.pi, 12.275], [np.pi, 2.275], [9.42478, 2.475]])

# evaluate objective on grid to plot
x_1 = np.linspace(-5, 10, 50)
x_2 = np.linspace(0, 15, 51)
x_1_grid, x_2_grid = np.meshgrid(x_1, x_2)
x_all = np.stack([x_1_grid.flatten(), x_2_grid.flatten()], axis=1)
y_all = fcn(x_all)
y_reshape = np.reshape(y_all, x_1_grid.shape)

# evaluate constraint to plot
theta_constraint = np.linspace(0, 2*np.pi)
x_0_constraint = 3 + np.sin(theta_constraint) * constraint_radius
x_1_constraint = 7 + np.cos(theta_constraint) * constraint_radius

import matplotlib.pyplot as plt
plt.figure(figsize=FIG_SIZE)
plt.contourf(x_1, x_2, y_reshape)
plt.title('Branin Function')
plt.plot(x_0_constraint, x_1_constraint, linewidth=3, color='k')
plt.plot(optimum[:, 0], optimum[:, 1], marker='x', color='r', linestyle='')
plt.legend(['Constraint boundary', 'Unconstrained optima']);

import GPy
from emukit.model_wrappers import GPyModelWrapper

x_init = np.array([[0, 7], [1, 9], [6, 8]])
y_init = fcn(x_init)

model = GPy.models.GPRegression(x_init, y_init)
emukit_model = GPyModelWrapper(model)

from emukit.core.acquisition import Acquisition
from emukit.core.constraints import NonlinearInequalityConstraint
from scipy.special import expit # expit is scipy's sigmoid function

constraints = [NonlinearInequalityConstraint(constraint_fcn, 0, np.inf)]
space.constraints = constraints

from emukit.bayesian_optimization.acquisitions import ExpectedImprovement
ei = ExpectedImprovement(model)

from emukit.bayesian_optimization.loops import BayesianOptimizationLoop
from emukit.core.optimization import GradientAcquisitionOptimizer

# Create acquisition optimizer with constraints
acquisition_optimizer = GradientAcquisitionOptimizer(space)

# Make BO loop
bo_loop = BayesianOptimizationLoop(space, emukit_model, ei, acquisition_optimizer=acquisition_optimizer)

# Request the next point with x2 fixed as a context variable (this call triggers the issue)
bo_loop.get_next_points(results=None, context={'x2': 3})
```

BTW: I love this package! Good work :-)

apaleyes commented 3 years ago

Hi @AlbanMor and thanks for a great bug report. I was able to easily reproduce it.

The core of the issue, as you correctly surmised, is that we apply the context to the space and then do the constrained optimization over the context-free space: here. That puts a lot of implicit assumptions on the constraints about the order and usage of variables.
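For illustration, the mismatch can be reproduced without Emukit at all. The following is a minimal sketch in plain Python, assuming the optimizer hands the constraint only the non-context coordinates; the names `full_point` and `context_free_point` are made up for this example and do not come from the library.

```python
# Constraint from the report: written for the full 2-D point [x1, x2].
constraint_radius = 4
constraint_fcn = lambda x: 10 * (-(x[0] - 3) ** 2 - (x[1] - 7) ** 2 + constraint_radius ** 2)

# With the full point, the constraint evaluates normally.
full_point = [3.0, 7.0]
print(constraint_fcn(full_point))  # 160.0

# With x2 fixed as context, only x1 is left in the optimized vector
# (assumption for this sketch), so x[1] no longer exists.
context_free_point = [3.0]
raised = False
try:
    constraint_fcn(context_free_point)
except IndexError as err:
    raised = True
    print("constraint failed:", err)
```

The failure is a plain indexing error inside the user's function, which is why it depends on which variables the constraint touches and on their position in the space.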

To unblock yourself now you can design your constraints to refer to non-context variables only, and design your parameter space so that non-context variables always come before context ones.

But a proper fix isn't clear at all to me at the moment. Ideas welcome

ekalosak commented 3 years ago

Seems like we could just make the documentation more explicit about the assumptions made here? e.g. "we have a canonical ordering of variables in the parameter space" or whatever it shakes out to be.

AlbanMor commented 3 years ago

> To unblock yourself now you can design your constraints to refer to non-context variables only, and design your parameter space so that non-context variables always come before context ones.

Thank you for the quick reply. A more agile approach on my end might be to take the specified context variable into account in my constraint function. How would I go about accessing the context variable in my function?

apaleyes commented 3 years ago

@AlbanMor that's the thing, context variables aren't available inside constraints at the moment. So the best you can do is to have a global variable that you refer to. In the example:

```python
X2_VALUE = 3
...
constraint_fcn = lambda x: 10 * (-(x[0] - 3)**2 - (X2_VALUE - 7)**2 + constraint_radius ** 2)
...
bo_loop.get_next_points(results=None, context={'x2': X2_VALUE})
```

Admittedly, that's far from ideal.
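A slightly tidier variant of the same workaround is to bind the context value in a closure instead of a module-level global. This is only a sketch: `make_constraint_fcn` is a hypothetical helper, not part of Emukit, and it still assumes the constraint receives just `[x1]` once `x2` is fixed.

```python
def make_constraint_fcn(x2_value, constraint_radius=4):
    """Build a constraint over x = [x1] with x2 frozen at x2_value (hypothetical helper)."""
    def constraint_fcn(x):
        # Same expression as in the report, with x[1] replaced by the bound value.
        return 10 * (-(x[0] - 3) ** 2 - (x2_value - 7) ** 2 + constraint_radius ** 2)
    return constraint_fcn


x2_value = 3
constraint_fcn = make_constraint_fcn(x2_value)

# (x1, x2) = (3, 3) lies exactly on the circle of radius 4 around (3, 7).
print(constraint_fcn([3.0]))  # 0.0
# Moving x1 away from 3 leaves the feasible region (negative value).
print(constraint_fcn([4.0]))  # -10.0
```

The same `x2_value` would then have to be passed as `context={'x2': x2_value}`, and the constraint rebuilt whenever the context changes.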

apaleyes commented 3 years ago

@ekalosak that is a possibility, but I'd rather explore ways to solve it first. We just need to find a good way of passing the full variables to constraints instead of the context-free ones. It is probably not too hard, and I think it can be done by the context manager.
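To make the idea concrete, here is one possible shape for it, written in plain Python and independent of Emukit's own context handling. Everything here is an assumption for illustration: `with_context` and its `context_by_index` argument are hypothetical names, and a real fix would need to map parameter names to positions and handle gradients as well.

```python
def with_context(constraint_fcn, n_params, context_by_index):
    """Wrap a full-space constraint so it accepts context-free points.

    context_by_index maps a position in the full vector to its fixed value
    (hypothetical interface for this sketch).
    """
    free_idxs = [i for i in range(n_params) if i not in context_by_index]

    def wrapped(x_free):
        if len(x_free) != len(free_idxs):
            raise ValueError("expected %d free values, got %d" % (len(free_idxs), len(x_free)))
        # Rebuild the full vector: context values first, then the free ones.
        x_full = [None] * n_params
        for i, value in context_by_index.items():
            x_full[i] = value
        for i, value in zip(free_idxs, x_free):
            x_full[i] = value
        return constraint_fcn(x_full)

    return wrapped


# The original two-variable constraint stays untouched.
full_constraint = lambda x: 10 * (-(x[0] - 3) ** 2 - (x[1] - 7) ** 2 + 4 ** 2)

# Fix x2 (position 1) at 3; the wrapper now takes [x1] only.
x2_fixed = with_context(full_constraint, 2, {1: 3})
print(x2_fixed([3.0]) == full_constraint([3.0, 3]))  # True
```

With a wrapper like this the user's constraint always sees variables in their original order, so it would not matter whether context variables come first or last in the parameter space.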