pytorch / botorch

Bayesian optimization in PyTorch
https://botorch.org/
MIT License

Risk-averse Multiobjective Optimization Objective Function #2177

Closed · Queimo closed this 7 months ago

Queimo commented 8 months ago

Help wanted

I am trying to do risk-averse optimization of experiments in the laboratory, whose objectives are subject to some stochasticity. My setup allows for repeated experiments, which I want to use to keep the algorithm from sampling very noisy regions. I have set up a simple test function with noisy regions, with 2 inputs and 2 outputs. So roughly speaking: obj_vector = f(X) + eps(X) --> max., while avoiding regions with high eps.
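For concreteness, here is a minimal sketch of what such a repeat-evaluation test function could look like (a hypothetical stand-in for the private problems.k1.K5 used below):

import torch

class NoisyToyProblem:
    # Hypothetical stand-in for problems.k1.K5: d=2 inputs, m=2 objectives,
    # with input-dependent noise eps(X) and n_w noisy replicates per call.
    dim = 2
    num_objectives = 2
    bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.double)

    def __init__(self, sigma=0.7, repeat_eval=3):
        self.sigma = sigma
        self.repeat_eval = repeat_eval

    def evaluate_repeat(self, X):
        # X: (b, d) -> Y: (b, m, n_w)
        f = torch.stack([X.sin().sum(-1), X.cos().sum(-1)], dim=-1)  # (b, m)
        # noise scale grows away from the center -> "noisy regions"
        eps_scale = self.sigma * (X - 0.5).norm(dim=-1, keepdim=True)  # (b, 1)
        noise = eps_scale.unsqueeze(-1) * torch.randn(
            *f.shape, self.repeat_eval, dtype=X.dtype, device=X.device
        )
        return f.unsqueeze(-1) + noise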

I want to do that using qLogNEHVI, as it already shows good performance when I fit my hetGP to Y.mean() and Y.std(). I want it to behave more risk-averse, though. And I suspect the correct/better approach is not to just fit my GP to something like (Y_mean - beta*Y_var) directly, but instead to define an objective that transforms my outputs. My attempt was to combine the principles from the tutorials/papers on robust multi-objective Bayesian optimization under input noise and risk-averse Bayesian optimization with environmental variables. However, I am struggling to adapt both methodologies to my problem, as they target slightly different problems.

I looked into AppendFeatures, but it does not seem to fit: my output is already expanded once it comes out of my test function, and other kinds of input perturbations don't really make sense, I think, because my inputs are modeled as fixed.

Below is a minimal outline of the code. Any help in pointing me in the right direction is greatly appreciated!

Code example

import torch
from botorch.optim.optimize import optimize_acqf
from botorch.acquisition.multi_objective.logei import (
    qLogNoisyExpectedHypervolumeImprovement,
)

import os
from botorch.models.gp_regression import (
    HeteroskedasticSingleTaskGP,
)

from gpytorch.mlls.exact_marginal_log_likelihood import ExactMarginalLogLikelihood
from botorch.utils.transforms import unnormalize
from botorch.fit import fit_gpytorch_mll

SMOKE_TEST = os.environ.get("SMOKE_TEST")
from botorch.sampling.normal import SobolQMCNormalSampler
from ref_point import RefPoint

tkwargs = {
    "dtype": torch.double,
    "device": torch.device("cuda" if torch.cuda.is_available() else "cpu"),
}

NUM_RESTARTS = 10 if not SMOKE_TEST else 2
RAW_SAMPLES = 512 if not SMOKE_TEST else 4
MC_SAMPLES = 128 if not SMOKE_TEST else 16

batch_size = 1
standard_bounds = torch.zeros(2, 2, **tkwargs)
standard_bounds[1] = 1

from problems.k1 import K5

# d=2
# m=2
problem = K5(sigma=0.7, repeat_eval=3)
dim = problem.dim
ref_point_handler = RefPoint(
    "k5", problem.dim, problem.num_objectives, n_init_sample=100, seed=0
)

ref_point = ref_point_handler.get_ref_point(is_botorch=True)

def evaluate_function(X):
    # Shape X: (b, d)
    # Shape Y: (b, m, n_w)
    Y = problem.evaluate_repeat(unnormalize(X, problem.bounds)).view(
        *X.shape[:-1], problem.num_objectives, -1
    )
    return Y

def train_model(train_X, train_Y):
    # define models for objective and constraint
    model = HeteroskedasticSingleTaskGP(
        train_X=train_X,
        train_Y=train_Y,
        train_Yvar=...,  # required by HeteroskedasticSingleTaskGP, e.g. the replicate variance
        input_transform=...,  # AppendFeatures??
        outcome_transform=...,
    )
    mll = ExactMarginalLogLikelihood(model.likelihood, model)
    fit_gpytorch_mll(mll, max_attempts=5)
    return mll, model

sampler = SobolQMCNormalSampler(sample_shape=torch.Size([MC_SAMPLES]))

X_init = torch.rand(10, 2, **tkwargs)
Y_init = evaluate_function(X_init)  # shape (10, 2, 3) (3 repeats)

mll, model = train_model(X_init, Y_init)

objective = ...  # Simple mean-noise trade-off or MVAR or MARS objective

def solve(X, Y):
    acq_func = qLogNoisyExpectedHypervolumeImprovement(
        model=model,
        ref_point=ref_point,
        objective=objective,
        X_baseline=torch.from_numpy(X).to(**tkwargs),
        prune_baseline=True,  # prune baseline points that have estimated zero probability of being Pareto optimal
        sampler=sampler,
    )

    X_cand, acq_value = optimize_acqf(  # returns candidates and acquisition values
        acq_function=acq_func,
        bounds=standard_bounds,
        q=batch_size,
        num_restarts=NUM_RESTARTS,
        raw_samples=RAW_SAMPLES,  # used for initialization heuristic
        options={"batch_limit": batch_size, "maxiter": 2000},
        sequential=True,
    )

    return X_cand, acq_value

# BO Loop
...


sdaulton commented 8 months ago

if I fit my hetGP to Y.mean() and Y.std(). I want it to behave more risk-averse though.

One problem with this approach is that qLogNEHVI will attempt to optimize the mean of your stochastic objectives, ignoring the stochasticity and your desire to be risk-averse.

I suspect the correct/better approach is not to just fit my GP to something like (Y_mean-beta*Y_var) directly.

Yeah, directly modeling Y_mean-beta*Y_std (or Y_var) is one option. qLogNEHVI would then seek to identify the Pareto frontier over the lower bound.
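A minimal sketch of this first option (using X_init, Y_init from your snippet, with replicates in the last dimension of Y and a hypothetical risk-aversion weight beta):

from botorch.fit import fit_gpytorch_mll
from botorch.models.gp_regression import SingleTaskGP
from botorch.models.transforms.outcome import Standardize
from gpytorch.mlls import ExactMarginalLogLikelihood

beta = 1.0  # hypothetical risk-aversion weight

# collapse the replicate dimension into one risk-adjusted value per objective
train_Y = Y_init.mean(dim=-1) - beta * Y_init.std(dim=-1)  # (n, m)

model = SingleTaskGP(
    train_X=X_init,
    train_Y=train_Y,
    outcome_transform=Standardize(m=train_Y.shape[-1]),
)
mll = ExactMarginalLogLikelihood(model.likelihood, model)
fit_gpytorch_mll(mll)
# qLogNEHVI on this model (with no custom objective) then targets the Pareto
# frontier of the lower bound Y_mean - beta * Y_std directly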

Another alternative would be to fit the heteroskedastic GP to Y.mean() and Y.std(). Then in qLogNEHVI set observation_noise=True (this would require changing the code here: https://github.com/pytorch/botorch/blob/6a66c79ea444b992aefdbc8d563c6e19d7657008/botorch/acquisition/multi_objective/logei.py#L456 to pass observation_noise=True), so that we sample from the posterior predictive p(Y|D) rather than the posterior p(f|D). Then apply a GenericMCMultiOutputObjective (https://github.com/pytorch/botorch/blob/6a66c79ea444b992aefdbc8d563c6e19d7657008/botorch/acquisition/multi_objective/objective.py#L57) to those samples that subtracts beta * samples.std(dim=0) from each sample. This method would be advantageous if it is easier to model Y_mean and Y_std with the heteroskedastic GP than to model Y_mean - beta * Y_std.
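A sketch of such an objective (assuming the observation_noise=True change above; beta is again a hypothetical risk-aversion weight):

from botorch.acquisition.multi_objective.objective import (
    GenericMCMultiOutputObjective,
)

beta = 1.0  # hypothetical risk-aversion weight

def risk_averse_obj(samples, X=None):
    # samples: (mc_samples, batch, q, m), drawn from p(Y | D) when
    # observation_noise=True; penalize the spread across the MC samples
    return samples - beta * samples.std(dim=0, keepdim=True)

objective = GenericMCMultiOutputObjective(risk_averse_obj)

This objective can then be passed to qLogNoisyExpectedHypervolumeImprovement as in your solve() function.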