EmuKit / emukit

A Python-based toolbox of various methods in decision making, uncertainty quantification and statistical emulation: multi-fidelity, experimental design, Bayesian optimisation, Bayesian quadrature, etc.
https://emukit.github.io/emukit/
Apache License 2.0

Fixed sample of representer points for entropy search used throughout whole optimization #210

Closed henrymoss closed 5 years ago

henrymoss commented 5 years ago

Hi

The provided examples of entropy search and the multi-fidelity entropy search acquisitions currently only sample a single set of representer points, which they then use for the whole optimization.

This set should ideally be re-sampled at the beginning of each BO step (after updating the GP and proposal function), adaptively responding to our observations and focusing computational resources on promising areas.

Using the same set throughout requires a much larger choice of sample size to cover the search space and allow high-precision optimization.

My current workaround is to add a step that does this resampling at the end of each BO step.


def resampler(loop, loop_state):
    # Re-draw the representer points now that the GP and proposal have been updated
    loop.candidate_point_calculator.acquisition.update_parameters()

loop.iteration_end_event.append(resampler)

I think this should be made clearer in the documentation or added automatically within the OuterLoop utility.
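For anyone who wants to see the hook mechanism in isolation, here is a small self-contained sketch of the pattern: a toy loop imitating emukit's `iteration_end_event` callback list. Everything except the `iteration_end_event` name is made up for illustration; this is not emukit's actual API.

```python
import random

random.seed(0)

class ToyLoop:
    """Toy stand-in for an optimization loop with end-of-iteration callbacks."""

    def __init__(self, n_representer_points=5):
        self.n = n_representer_points
        self.representer_points = self._sample()
        self.iteration_end_event = []  # callbacks fired after every iteration

    def _sample(self):
        # Stand-in for sampling representer points from a proposal distribution
        return [random.random() for _ in range(self.n)]

    def update_parameters(self):
        # Re-draw the representer points (what a resampling hook would trigger)
        self.representer_points = self._sample()

    def run(self, n_iterations):
        for _ in range(n_iterations):
            # ... model update and acquisition optimisation would happen here ...
            for callback in self.iteration_end_event:
                callback(self, None)  # (loop, loop_state) signature

# Attach a resampling hook, mirroring the workaround above
loop = ToyLoop()
loop.iteration_end_event.append(lambda l, state: l.update_parameters())
before = list(loop.representer_points)
loop.run(3)
after = loop.representer_points
```

After `run(3)` the representer points have been re-drawn three times, so `after` differs from `before`; without the hook they would stay fixed for the whole run.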

javiergonzalezh commented 5 years ago

Hi Henry,

Good catch! Do you mind doing a pull request with the change? We will review it and merge.

apaleyes commented 5 years ago

Hey @henrymoss ! Would you be able to answer Mark's question here? I'll copy it over:

I believe this problem originates from the fact that update_parameters is not called on the CostAcquisition. Are you using CostAcquisition or acquisition_per_expected_cost in your code for issue #210 ?

henrymoss commented 5 years ago

Sorry for the delay!

I was using the Cost acquisition (as used in https://nbviewer.jupyter.org/github/amzn/emukit/blob/master/notebooks/Emukit-tutorial-multi-fidelity-bayesian-optimization.ipynb). I had just changed the acquisition function.

marpulli commented 5 years ago

If I run the following minimal example, the representer points do change after each iteration. Could you give me a minimal working example where they don't change, so I can debug the problem?

from emukit.test_functions import forrester_function
from emukit.core.loop.user_function import UserFunctionWrapper
from emukit.model_wrappers import GPyModelWrapper
from emukit.bayesian_optimization.acquisitions import EntropySearch
from emukit.bayesian_optimization.loops.cost_sensitive_bayesian_optimization_loop import CostSensitiveBayesianOptimizationLoop
from emukit.core.acquisition import acquisition_per_expected_cost

import GPy
import numpy as np

# Get user function - cost is just constant 1
target_function, space = forrester_function()
user_func = UserFunctionWrapper(lambda x: (target_function(x), np.array([[1]])), extra_output_names=['cost'])

# Initial data points
X_init = np.array([[0.2],[0.6], [0.9]])
Y_init = target_function(X_init)
C_init = np.ones(Y_init.shape)

# Create cost and objective models
gpy_model = GPy.models.GPRegression(X_init, Y_init, GPy.kern.RBF(1, lengthscale=0.08, variance=20), noise_var=1e-10)
cost_model = GPy.models.GPRegression(X_init, C_init)
emukit_model = GPyModelWrapper(gpy_model)
cost_emukit_model = GPyModelWrapper(cost_model)

# Create acquisition per unit cost
es_acquisition = EntropySearch(emukit_model, space)
acquisition_per_cost = acquisition_per_expected_cost(es_acquisition, cost_emukit_model)

# Create loop
bo = CostSensitiveBayesianOptimizationLoop(space, model_objective=emukit_model, acquisition=acquisition_per_cost, 
                                           model_cost=cost_emukit_model)

# print representer points after each iteration
def print_repr(loop, loop_state):
    print(loop.candidate_point_calculator.acquisition.numerator.representer_points)

bo.iteration_end_event.append(print_repr)
bo.run_loop(user_func, 10)

henrymoss commented 5 years ago

Hi

I was following the jupyter notebook construction: https://nbviewer.jupyter.org/github/amzn/emukit/blob/master/notebooks/Emukit-tutorial-multi-fidelity-bayesian-optimization.ipynb

# Imports needed by this snippet (the notebook defines model, parameter_space,
# low_fidelity_cost and high_fidelity_cost earlier)
import numpy as np

from emukit.core.acquisition import Acquisition

class Cost(Acquisition):
    def __init__(self, costs):
        self.costs = costs

    def evaluate(self, x):
        # Look up the per-fidelity cost from the fidelity index in the last column
        fidelity_index = x[:, -1].astype(int)
        x_cost = np.array([self.costs[i] for i in fidelity_index])
        return x_cost[:, None]

    @property
    def has_gradients(self):
        return True

    def evaluate_with_gradients(self, x):
        # Costs are piecewise constant in x, so the gradient is zero
        return self.evaluate(x), np.zeros(x.shape)

cost_acquisition = Cost([low_fidelity_cost, high_fidelity_cost])
acquisition = MultiInformationSourceEntropySearch(model, parameter_space) / cost_acquisition
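As an aside, the division in the last line works because emukit acquisitions overload the `/` operator to build a ratio acquisition. A minimal stand-alone sketch of that composition pattern follows; the class and function names here are illustrative, not emukit's own.

```python
class SimpleAcquisition:
    """Toy acquisition supporting '/' composition via operator overloading."""

    def __init__(self, fn):
        self.fn = fn

    def evaluate(self, x):
        return self.fn(x)

    def __truediv__(self, other):
        # Dividing two acquisitions yields one that evaluates their pointwise ratio
        return SimpleAcquisition(lambda x: self.evaluate(x) / other.evaluate(x))

value = SimpleAcquisition(lambda x: x ** 2)   # stand-in for an information-gain value
cost = SimpleAcquisition(lambda x: 2.0)       # stand-in for a constant evaluation cost
per_cost = value / cost                       # value per unit cost
# per_cost.evaluate(4.0) -> 8.0
```

This is why marpulli's suggestion to check `acquisition.numerator` makes sense: the composed object keeps a reference to the original value acquisition, which is the part holding the representer points.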

marpulli commented 5 years ago

Running the Emukit-tutorial-multi-fidelity-bayesian-optimization notebook I see the representer points change between iterations. I added the following line to the plot_acquisition function: print(loop.candidate_point_calculator.acquisition.numerator.representer_points).

That prints the representer points after each iteration of the optimization; you can see them changing.

Could you let me know if you see different behaviour please?

henrymoss commented 5 years ago

I am unable to reproduce the error now on the simple notebook example. I was trying to do something within a more complicated loop, but I have since decided to approach that task differently and no longer have this problem!

Sorry for the hassle!