Open oliversheridanmethven opened 7 years ago
Hi Oliver, thanks for pointing this out. Can you provide more details of your example so we can have a look to it?
Here is a MWE that replicates the error (I think the initial points are randomly picked, so perhaps try running it a few times...):
```python
from GPyOpt.methods import BayesianOptimization
import numpy as np


def array_to_float_wrapper(f):
    """
    Ensures functions produce an ndarray. Turns a function which takes
    an array and returns a float into a function which takes a matrix
    and returns a matrix.

    :param f: Function, takes a np.array and returns a float.
    :return: Function, takes and returns an np.ndarray.
    """
    def g(x):
        """
        :param x: ndarray, e.g. x = np.array([[1.2, 2]])
        :return: ndarray of f(i) evaluated row wise.
        """
        return np.array([[f(i)] for i in x])
    return g


def solver(f, bounds, maxfev, initial_design_numdata=2, verbosity=False, n=None):
    """
    The solver for using GPyOpt.

    :param f: Function, takes an n-array and returns a float.
    :param bounds: List, [[lower, upper]] * n, determines problem dimension.
    :param maxfev: Int, max number of function evaluations.
    :param initial_design_numdata: Int, how many points to sample at initialisation, >= 2.
    :param verbosity: Bool, verbose output?
    :param n: Int or None, expected problem dimension (checked against bounds).
    :return: BayesianOptimization, result from run_optimization.
    """
    assert initial_design_numdata >= 2, 'Please specify more initial starting points.'
    if n is not None:
        assert len(bounds) == n, 'The number of bounds does not correspond to the quoted dimension.'
    else:
        assert len(bounds) > 0, 'Please specify a non-empty list of bounds.'
    assert initial_design_numdata <= maxfev, 'The initialisation number of points is larger than the max number of function evaluations.'
    g = array_to_float_wrapper(f)
    y = BayesianOptimization(g, bounds=bounds, initial_design_numdata=initial_design_numdata)
    y.run_optimization(max_iter=maxfev - initial_design_numdata, verbosity=verbosity)
    return y


def osborne_11(x):
    """
    The Osborne-11 function. Minimum 0.040137736 at
    x = np.array([1.30997616, 0.59943034, 0.43155235, 0.63366122, 0.75418014,
                  4.82368638, 0.90429306, 1.36581389, 5.67534122, 2.39868579,
                  4.56887422]).

    :param x: 11-array.
    :return: Float.
    """
    assert len(x) == 11, 'The osborne_11 function requires an input of length 11.'
    assert np.all((x >= 0.0) & (x <= 10.0)), 'The input must be in the range [0, 10].'
    y = np.array(
        [1.366, 1.191, 1.112, 1.013, 0.991, 0.885, 0.831, 0.847, 0.786, 0.725,
         0.746, 0.679, 0.608, 0.655, 0.616, 0.606, 0.602, 0.626, 0.651, 0.724,
         0.649, 0.649, 0.694, 0.644, 0.624, 0.661, 0.612, 0.558, 0.533, 0.495,
         0.5, 0.423, 0.395, 0.375, 0.372, 0.391, 0.396, 0.405, 0.428, 0.429,
         0.523, 0.562, 0.607, 0.653, 0.672, 0.708, 0.633, 0.668, 0.645, 0.632,
         0.591, 0.559, 0.597, 0.625, 0.739, 0.71, 0.729, 0.72, 0.636, 0.581,
         0.428, 0.292, 0.162, 0.098, 0.054])
    M = 65
    t = np.arange(0, M / 10.0, 0.1)
    return np.sum((y
                   - x[0] * np.exp(-t * x[4])
                   - x[1] * np.exp(-x[5] * (t - x[8]) ** 2.0)
                   - x[2] * np.exp(-x[6] * (t - x[9]) ** 2.0)
                   - x[3] * np.exp(-x[7] * (t - x[10]) ** 2.0)) ** 2.0)


bounds = [[0.0, 10.0]] * 11
solver(osborne_11, bounds, 100)
```
The error this gives back is:
```
Traceback (most recent call last):
  File "/home/user/Documents/InFoMM/bayesian_optimisation/venv2/local/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-5f565b71df72>", line 17, in <module>
    solver(osborne_11, bounds, 100, verbosity=False)
  File "<ipython-input-2-f02a78eef8dc>", line 42, in solver
    y.run_optimization(max_iter=maxfev - initial_design_numdata, verbosity=verbosity)
  File "/home/user/Documents/InFoMM/bayesian_optimisation/venv2/local/lib/python2.7/site-packages/GPyOpt/methods/bayesian_optimization.py", line 458, in run_optimization
    super(BayesianOptimization, self).run_optimization(max_iter = max_iter, max_time = max_time, eps = eps, verbosity=verbosity, save_models_parameters = save_models_parameters, report_file = report_file, evaluations_file= evaluations_file, models_file=models_file)
  File "/home/user/Documents/InFoMM/bayesian_optimisation/venv2/local/lib/python2.7/site-packages/GPyOpt/core/bo.py", line 124, in run_optimization
    self.evaluate_objective()
  File "/home/user/Documents/InFoMM/bayesian_optimisation/venv2/local/lib/python2.7/site-packages/GPyOpt/core/bo.py", line 168, in evaluate_objective
    self.Y_new, cost_new = self.objective.evaluate(self.suggested_sample)
  File "/home/user/Documents/InFoMM/bayesian_optimisation/venv2/local/lib/python2.7/site-packages/GPyOpt/core/task/objective.py", line 51, in evaluate
    f_evals, cost_evals = self._eval_func(x)
  File "/home/user/Documents/InFoMM/bayesian_optimisation/venv2/local/lib/python2.7/site-packages/GPyOpt/core/task/objective.py", line 80, in _eval_func
    rlt = self.func(np.atleast_2d(x[i]))
  File "<ipython-input-2-f02a78eef8dc>", line 20, in g
    return np.array([[f(i)] for i in x])
  File "<ipython-input-3-5f565b71df72>", line 8, in osborne_11
    assert np.all((x >= 0.0) & (x <= 10.0)), 'The input must be in the range [0, 10].'
AssertionError: The input must be in the range [0, 10].
```
Clearly this can be worked around somewhat via my original suggestion, but that doesn't fix the underlying problem. My guess is that when the acquisition function suggests a point to sample, that point is not being subjected to the box constraints.
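As a defensive interim workaround (not a fix for the underlying issue), the wrapper from the MWE above can clip each suggested row back into the box before evaluating the objective, so an out-of-bounds suggestion never reaches the assertion. This is a hypothetical variant of `array_to_float_wrapper`, not part of GPyOpt:

```python
import numpy as np

def clipping_wrapper(f, bounds):
    """Hypothetical sketch: wrap a float-valued function so every
    suggested row is clipped back into the box defined by bounds
    before evaluation."""
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])

    def g(x):
        # x is an (m, n) ndarray of suggested points; clip element-wise.
        x = np.clip(np.atleast_2d(x), lo, hi)
        return np.array([[f(row)] for row in x])
    return g
```

Passing `clipping_wrapper(osborne_11, bounds)` to `BayesianOptimization` in place of `array_to_float_wrapper(osborne_11)` avoids the crash, at the cost of silently evaluating a projected point instead of the one the acquisition actually suggested.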
@javiergonzalezh I've noticed that sometimes `indicator_constraints` is passed the (expanded) model x, and other times the optimizer x. This is due to some code in anchor_points_generator.py. I suspect that sometimes constraints and bounds are being calculated on the zipped version, and other times on the unzipped version.
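To illustrate why the zipped/unzipped mismatch matters, here is a hypothetical sketch (the variable layout and constraint are made up, not GPyOpt's actual representation): a constraint written for the compact objective-space columns silently tests the wrong quantity once a categorical variable has been one-hot expanded into the model space.

```python
import numpy as np

# A mixed space with one continuous variable and one 3-level categorical.
# Zipped (objective-space) point: [x0, cat_index], shape (1, 2).
# Unzipped (expanded model-space) point: [x0, c0, c1, c2], shape (1, 4).
def zipped_constraint(x):
    # Constraint written for the 2-column zipped representation.
    return x[:, 0] - x[:, 1] <= 0

zipped_x = np.array([[0.5, 2.0]])
unzipped_x = np.array([[0.5, 0.0, 0.0, 1.0]])

# On the representation it was written for, the constraint holds
# (0.5 - 2.0 <= 0), but on the expanded representation column 1 is
# now a one-hot bit, so the same formula compares x0 against c0
# (0.5 - 0.0 <= 0 is False) and the point is wrongly rejected.
assert zipped_constraint(zipped_x).all()
assert not zipped_constraint(unzipped_x).all()
```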
@javiergonzalezh @oliversheridanmethven @apaleyes Hi, I was/am having a similar issue. Specifically, my initial samples generated by GPyOpt were always within the constraints, but sometimes the suggested point(s) would violate them.
As @sdrobert mentioned, the problem (for me) is that in base.py, `indicator_constraints` is passed (or operates on) the unzipped inputs. However, in the initial design (for initialising the GP model or for generating anchor points), `get_samples_with_constraints` is called from random_design.py and is passed the zipped inputs.
These are the frame stacks from my debugger: constraints.docx. Maybe they will be useful?
To get round this for my purposes, I have edited `acquisition_function` in base.py so that it reads like so:

```python
def acquisition_function(self, x):
    """
    Takes an acquisition and weights it so the domain and cost are taken into account.
    """
    f_acqu = self._compute_acq(x)
    cost_x, _ = self.cost_withGradients(x)
    x_z = x if self.space.model_dimensionality == self.space.objective_dimensionality else self.space.zip_inputs(x)
    return -(f_acqu * self.space.indicator_constraints(x_z)) / cost_x
```
This appears to work for me, but I don't know how it affects all the different potential models, acquisition functions, acquisition optimisers, etc. I'm still just using the standard options: GP, EI, L-BFGS. It seems this is a problem primarily if one has categorical variables AND either they are part of the constraint formulae, or you only have constraints on other variables but some categorical variables appear before them in the space list.
Hope this is helpful!
EDIT: I guess this actually only ensures the starting points for the acquisition optimiser(s) are feasible, not the final optimised points. I imagine that encapsulating lines 63-73 of `optimize` in acquisition_optimizer.py in a `while` loop may give that functionality.
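A generic sketch of that loop idea (the helper names and the single-restart callable are assumptions, not GPyOpt's actual `optimize` signature): keep re-running one acquisition optimisation from fresh anchor points until the returned point passes the feasibility indicator, with a cap on the number of restarts.

```python
def optimize_until_feasible(optimize_once, is_feasible, max_tries=20):
    """Hypothetical sketch of the proposed while loop.

    optimize_once() -> (x, f_acqu), one acquisition-optimiser restart.
    is_feasible(x)  -> bool, e.g. indicator_constraints(x) == 1.
    """
    for _ in range(max_tries):
        x, f_acqu = optimize_once()
        if is_feasible(x):
            return x, f_acqu
    # Give up rather than loop forever on an infeasible region.
    raise RuntimeError('No feasible point found in max_tries restarts.')
```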
When running `BayesianOptimization.run_optimization`, sometimes the suggested point (`X_batch`) can be outside of the feasible space as specified by bounds. This is raised in bo.py. If `f` has been defined to raise an `AssertionError` in such a case, the solver breaks. Wrapping this line in a `try: ... except: break` block will terminate the process, returning the solution found up to that point, but this is only a temporary workaround that doesn't address the issue.
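The temporary workaround described above can also be applied from user code, without editing bo.py, by catching the `AssertionError` around the optimisation call and keeping whatever the optimiser found so far. The helper below is a sketch; it only assumes the object exposes `run_optimization` as in the MWE:

```python
def run_with_bailout(bo, max_iter, verbosity=False):
    """Temporary workaround: if the objective raises an AssertionError
    on an out-of-bounds suggestion, stop early instead of crashing.
    The best point found so far (e.g. bo.x_opt) remains available."""
    try:
        bo.run_optimization(max_iter=max_iter, verbosity=verbosity)
    except AssertionError:
        pass  # keep the partial result rather than propagating the crash
    return bo
```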