pytorch / botorch

Bayesian optimization in PyTorch
https://botorch.org/
MIT License

[Bug] Unable to apply objectives and constraints on `qHypervolumeKnowledgeGradient` and `qMultiFidelityHypervolumeKnowledgeGradient` #2159

Closed vlad451101 closed 9 months ago

vlad451101 commented 10 months ago

🐛 Bug

I have a constrained 4-input, 5-output multi-objective optimization problem that I have successfully run through a constrained qNEHVI BO loop. There I was able to define the first 4 outputs as the main objectives of the optimization and the last one as a constraint modeled by its own surrogate.

I wanted to try the same optimization using MF-HVKG, both constrained and unconstrained. However, for some reason I can't apply the optimization weights to the first 4 objectives in the acquisition function, nor, in the constrained case, define the last output as a constraint of the BO optimization. Here is a small code snippet showing what I mean, using qNEHVI on my test case:

acq_func = qLogNoisyExpectedHypervolumeImprovement(
    model=model,
    ref_point=ref_points.tolist(),  # use known reference point
    X_baseline=normalize(train_x, bounds),
    sampler=sampler,
    prune_baseline=True,  # prune baseline points that have estimated zero probability of being Pareto optimal
    # define an objective that specifies which outcomes are the objectives
    objective=IdentityMCMultiOutputObjective(outcomes=[0, 1, 2, 3]),
    # specify that the constraint is on the last outcome
    constraints=[lambda Z: Z[..., -1]],
)

Unfortunately, if I set up the objective in the qMultiFidelityHypervolumeKnowledgeGradient acquisition function, I get an IndexError. I assume the same problem occurs with qHypervolumeKnowledgeGradient, but I have not tried it yet.

My question is whether MF-HVKG is designed to support optimization weights on selected objectives and output-based constraints, as qNEHVI is. If so, am I doing something wrong, or is this a bug?

To reproduce

Below is repro code for the unconstrained optimization. The text file training_data.txt contains all the training data.

Code snippet to reproduce

## Libraries
########################################################################################################################
import warnings
import torch
import gpytorch
from botorch.exceptions import BadInitialCandidatesWarning, InputDataWarning
from botorch.utils.transforms import normalize, unnormalize
from botorch.models.transforms.outcome import Standardize
from botorch import fit_gpytorch_mll
from botorch.models.transforms.input import Normalize
from gpytorch.mlls.sum_marginal_log_likelihood import SumMarginalLogLikelihood
from botorch.models.model_list_gp_regression import ModelListGP
from botorch.sampling.normal import SobolQMCNormalSampler
from botorch.models.gp_regression import SingleTaskGP
from botorch.optim.optimize import optimize_acqf
from botorch.acquisition.utils import project_to_target_fidelity
from botorch.models.deterministic import GenericDeterministicModel
from botorch.acquisition.cost_aware import InverseCostWeightedUtility
from botorch.acquisition.fixed_feature import FixedFeatureAcquisitionFunction
from botorch.acquisition.multi_objective.hypervolume_knowledge_gradient import (_get_hv_value_function, qMultiFidelityHypervolumeKnowledgeGradient,)
from botorch.acquisition.multi_objective.objective import IdentityMCMultiOutputObjective
import pandas as pd
from copy import deepcopy
from typing import Callable, Dict
from torch import Tensor
from math import exp
warnings.filterwarnings("ignore", category=BadInitialCandidatesWarning)
warnings.filterwarnings("ignore", category=InputDataWarning)
warnings.filterwarnings("ignore", category=RuntimeWarning)
warnings.filterwarnings("ignore", category=FutureWarning)
########################################################################################################################

## All functions
########################################################################################################################
def cost_func(x):
    """A simple exponential cost function."""
    exp_arg = torch.tensor(4.8, **tkwargs)
    val = torch.exp(exp_arg * x)
    return val

def cost_callable(X: torch.Tensor) -> torch.Tensor:
    r"""Wrapper for the cost function that takes care of shaping
    input and output arrays for interfacing with cost_func.
    This is passed as a callable function to MOMF.

    Args:
        X: A `batch_shape x q x d`-dim Tensor
    Returns:
        Cost `batch_shape x q x m`-dim Tensor of cost generated
        from fidelity dimension using cost_func.
    """

    return cost_func(X[..., -1:])

def inv_transform(u):
    """Inverse-transform sampling from the probability distribution with
    PDF proportional to 1/c(x); u is a uniform(0, 1) random variable."""
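    # Derivation note: pdf(x) ∝ 1/c(x) = exp(-24x/5) on [0, 1], so the CDF is
    # F(x) = (1 - exp(-24x/5)) / (1 - exp(-24/5)); the expression below is
    # the closed-form inverse F^{-1}(u) with exp(24/5) factored out.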
    return (5 / 24 * torch.log(-exp(24 / 5) / (exp(24 / 5) * u - u - exp(24 / 5))))

def getFidelityComponent(train_x: torch.Tensor, n_points: int, tkwargs: dict, fid_samples_n_points=1001) -> torch.Tensor:
    """
    Generates training data with Fidelity dimension sampled
    from a probability distribution that depends on Cost function
    """
    # Array from which fidelity values are sampled
    fid_samples = torch.linspace(0, 1, fid_samples_n_points, **tkwargs)
    # Probability calculated from the Cost function
    prob = 1 / cost_func(fid_samples)
    # Normalizing
    prob = prob / torch.sum(prob)
    # Generating indices to choose fidelity samples
    idx = prob.multinomial(num_samples=n_points, replacement=True)
    train_x[:, -1] = fid_samples[idx]
    # Calls the objective wrapper to generate train_obj
    return train_x

def get_fidelity(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Wrapper around the Objective function to take care of fid_obj stacking"""
    fid = 1 * x[..., -1]  # Getting the fidelity objective values
    fid_out = fid.unsqueeze(-1)
    # Concatenating objective values with fid_objective
    y_out = torch.cat([y, fid_out], -1)
    return y_out

def fid_obj(X: torch.Tensor) -> torch.Tensor:
    """
    A Fidelity Objective that can be thought of as a trust objective.
    Higher Fidelity simulations are rewarded as being more
    trustworthy. Here we consider just a linear fidelity objective.
    """
    fid_obj = 1 * X[..., -1]
    return fid_obj

def changeObjective(tensor, objectives):
    # Copy tensor
    newTensor = deepcopy(tensor)
    # Select columns
    for col,objective in enumerate(objectives):
        # Negate the values if the objective is minimized (BoTorch maximizes by default)
        if objective == 'Minimize':
            # Extract the selected column
            column = newTensor[:, col]
            # Negate the values in the column
            negated_column = -column
            # Update the tensor with the negated column
            newTensor[:, col] = negated_column

    return newTensor

def initialize_model(train_x, train_obj):
    # define models for objective and constraint
    models = []
    for train_y in train_obj.T:
        # Initialize SingleTaskGP model
        model = SingleTaskGP(train_x, train_y.unsqueeze(-1),
                             covar_module=gpytorch.kernels.ScaleKernel(gpytorch.kernels.MaternKernel(nu=2.5, ard_num_dims=len(train_x[0]))),
                             likelihood=gpytorch.likelihoods.GaussianLikelihood(noise_constraint=gpytorch.constraints.Positive()),
                             input_transform=Normalize(d=train_x.shape[-1]),
                             outcome_transform=Standardize(m=1))

        # Append model to list
        models.append(model)

    # Create model list
    model = ModelListGP(*models)
    mll = SumMarginalLogLikelihood(model.likelihood, model)
    return mll, model

def get_current_value(model: SingleTaskGP, ref_point: torch.Tensor, bounds: torch.Tensor, normalized_target_fidelities: Dict[int, float]):
    """Helper to get the hypervolume of the current hypervolume maximizing set."""
    fidelity_dims, fidelity_targets = zip(*normalized_target_fidelities.items())
    non_fidelity_dims = list(set(range(dim_x)) - set(fidelity_dims))
    # value function with the fidelity dimensions fixed to their target values
    curr_val_acqf = FixedFeatureAcquisitionFunction(
        acq_function=_get_hv_value_function(
            model=model,
            ref_point=ref_point,
            # define an objective that specifies which outcomes are the objectives
            objective=IdentityMCMultiOutputObjective(outcomes=[0, 1, 2, 3]),
            sampler=SobolQMCNormalSampler(sample_shape=torch.Size([NUM_INNER_MC_SAMPLES]), resample=False, collapse_batch_dims=True),
            use_posterior_mean=True,
        ),
        d=dim_x,
        columns=fidelity_dims,
        values=fidelity_targets,
    )
    # optimize
    _, current_value = optimize_acqf(
        acq_function=curr_val_acqf,
        bounds=bounds[:, non_fidelity_dims],
        q=NUM_PARETO,
        num_restarts=1,
        raw_samples=1024,
        return_best_only=True,
        options={"nonnegative": True},
    )
    return current_value

def project(X: Tensor) -> Tensor:
    return project_to_target_fidelity(X=X, d=project_d, target_fidelities=normalized_target_fidelities,)

def optimize_HVKG_and_get_obs(model: SingleTaskGP, ref_point: torch.Tensor, standard_bounds: torch.Tensor, BATCH_SIZE: int, cost_call: Callable[[torch.Tensor], torch.Tensor],):
    """Utility to initialize and optimize HVKG."""
    cost_model = GenericDeterministicModel(cost_call)
    cost_aware_utility = InverseCostWeightedUtility(cost_model=cost_model)
    current_value = get_current_value(
        model=model,
        ref_point=ref_point,
        bounds=standard_bounds,
        normalized_target_fidelities=normalized_target_fidelities,
    )
    acq_func = qMultiFidelityHypervolumeKnowledgeGradient(
        model=model,
        ref_point=ref_point,  # use known reference point
        num_fantasies=NUM_FANTASIES,
        num_pareto=NUM_PARETO,
        current_value=current_value,
        cost_aware_utility=cost_aware_utility,
        target_fidelities=normalized_target_fidelities,
        project=project,
        # define an objective that specifies which outcomes are the objectives
        objective=IdentityMCMultiOutputObjective(outcomes=[0, 1, 2, 3]),
        constraints=[lambda Z: Z[..., -1]],
    )
    # Optimization
    candidates, vals = optimize_acqf(
        acq_function=acq_func,
        bounds=standard_bounds,
        q=BATCH_SIZE,
        num_restarts=NUM_RESTARTS,
        raw_samples=RAW_SAMPLES,  # used for initialization heuristic
        options={"batch_limit": 5},
    )

    candidates = unnormalize(candidates.detach(), bounds=bounds)
    # if the AF val is 0, set the fidelity parameter to zero
    if vals.item() == 0.0:
        candidates[:, -1] = 0.0
    # observe new values
    return candidates
########################################################################################################################

## Inputs and others
########################################################################################################################
# Path to training data file
pathResults = r'training_data.txt'

# Selected input variables for Bayesian optimization loop
inputs = ['input1', 'input2', 'input3', 'input4']

# Selected output variables for Bayesian optimization loop
outputs = ['output1', 'output2', 'output3', 'output4', 'output5']

# Dimensions for MO-only optimization
dim_xMO = len(inputs)  # Input dimension for MO-only optimization
dim_yMO = len(outputs)  # Output dimension for MO-only optimization
# Dimensions for MOMF optimization (adds the fidelity dimension)
dim_x = dim_xMO + 1  # Input dimension for MOMF optimization
dim_y = dim_yMO + 1  # Output dimension for MOMF optimization

# Output optimization objectives
objectives = ['Maximize', 'Maximize', 'Maximize', 'Minimize', 'None']

# Kwargs for torch values
tkwargs = {
    "dtype": torch.double,
    "device": torch.device("cuda" if torch.cuda.is_available() else "cpu"),
}

# Selected bounds for variables
boundsMO = torch.tensor([[120, 9, 0.6, 23.333333333],
                            [240, 21, 2.7, 46.666666667]], **tkwargs) # For MO only
bounds = torch.tensor([[120, 9, 0.6, 23.333333333, 0],
                            [240, 21, 2.7, 46.666666667, 1]], **tkwargs) # For MOMF

# Normalized bounds for MOMF optimization
standard_bounds = torch.tensor([[0.0] * dim_x, [1.0] * dim_x], **tkwargs)

# Selected reference points for objectives
ref_point1 = [4.2, 68, 0.43, 0, 0.7] # With fidelity objective (MOMF hypervolume calculation)
ref_point = [4.2, 68, 0.43, 0] # Objectives only

# Bayesian optimization loop variables
INIT_SIZE = 100 # Number of randomly generated parallel calculations for initial training data
BATCH_SIZE = 19 # Number of parallel calculations for Bayesian optimization loop
NUM_RESTARTS = 1 # Number of restarts of the acquisition optimizer
RAW_SAMPLES = 1024 # Number of samples for the acquisition optimizer
MC_SAMPLES = 1024 # Number of samples for the Monte Carlo sampler
NUM_INNER_MC_SAMPLES = 1024
NUM_PARETO = 10
NUM_FANTASIES = 32

# mapping from index to target fidelity (highest fidelity)
target_fidelities = {dim_xMO: 1.0}

normalized_target_fidelities = {}
for idx, fidelity in target_fidelities.items():
    lb = standard_bounds[0, idx].item()
    ub = standard_bounds[1, idx].item()
    normalized_target_fidelities[idx] = (fidelity - lb) / (ub - lb)

project_d = dim_x
########################################################################################################################

## Optimization loop
########################################################################################################################
# Get all data
data = pd.read_csv(pathResults, sep=' ', header=0)
# Generate random fidelity data
train_fid = inv_transform(torch.rand(INIT_SIZE, 1, **tkwargs))
# Combine the normalized input data with the fidelity component
train_x = torch.cat((normalize(torch.tensor(data[inputs].values, **tkwargs), boundsMO), train_fid), dim=1)
# Load train_obj from file - normally here I would evaluate the true function
train_obj = torch.tensor(data[outputs].values, **tkwargs)
# Negate minimized objectives - BoTorch maximizes by default
train_obj_C = changeObjective(train_obj, objectives)

# Initialize GP models on the training data
mll, model = initialize_model(train_x, train_obj_C)

# Fit the model
fit_gpytorch_mll(mll)

# optimize acquisition functions and get new observations
new_x = optimize_HVKG_and_get_obs(
    model=model,
    ref_point=torch.tensor(ref_point, **tkwargs),
    standard_bounds=standard_bounds,
    BATCH_SIZE=BATCH_SIZE,
    cost_call=cost_callable,
)
print(new_x)

Stack trace/error message

Traceback (most recent call last):
  File "C:\Users\Uziv\Desktop\Botorch_algorithms\MC_BO_MOMF\MF_HVKG_test_case.py", line 284, in <module>
    new_x = optimize_HVKG_and_get_obs(
  File "C:\Users\Uziv\Desktop\Botorch_algorithms\MC_BO_MOMF\MF_HVKG_test_case.py", line 185, in optimize_HVKG_and_get_obs
    candidates, vals = optimize_acqf(
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\optim\optimize.py", line 563, in optimize_acqf
    return _optimize_acqf(opt_acqf_inputs)
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\optim\optimize.py", line 584, in _optimize_acqf
    return _optimize_acqf_batch(opt_inputs=opt_inputs)
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\optim\optimize.py", line 274, in _optimize_acqf_batch
    batch_initial_conditions = opt_inputs.get_ic_generator()(
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\optim\initializers.py", line 736, in gen_one_shot_hvkg_initial_conditions
    ics = gen_batch_initial_conditions(
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\optim\initializers.py", line 417, in gen_batch_initial_conditions
    Y_rnd_curr = acq_function(
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\utils\transforms.py", line 259, in decorated
    output = method(acqf, X, *args, **kwargs)
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\acquisition\multi_objective\hypervolume_knowledge_gradient.py", line 449, in forward
    fantasy_model = self.model.fantasize(
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\models\model.py", line 666, in fantasize
    sampler.samplers[i] if isinstance(sampler, ListSampler) else sampler
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\container.py", line 295, in __getitem__
    return self._modules[self._get_abs_string_index(idx)]
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\container.py", line 285, in _get_abs_string_index
    raise IndexError('index {} is out of range'.format(idx))
IndexError: index 4 is out of range

Expected Behavior

I should be able to apply optimization weights to selected output parameters using the objective argument of the qMultiFidelityHypervolumeKnowledgeGradient acquisition function, and additionally define a constraint for the acquisition function.


vlad451101 commented 10 months ago

Just a minor update from my side. Exactly as I expected, the same error occurs with plain HVKG.

Any help or guidance on this matter will be highly appreciated!

Below you can find the repro code for the qHypervolumeKnowledgeGradient case. Training data are listed in the original post.

## Libraries
########################################################################################################################
import warnings
import torch
import gpytorch
from botorch.exceptions import BadInitialCandidatesWarning, InputDataWarning
from botorch.utils.transforms import normalize, unnormalize
from botorch.models.transforms.outcome import Standardize
from gpytorch.mlls.sum_marginal_log_likelihood import SumMarginalLogLikelihood
from botorch.models.model_list_gp_regression import ModelListGP
from botorch.acquisition.multi_objective.objective import IdentityMCMultiOutputObjective
from botorch import fit_gpytorch_mll
from botorch.models.gp_regression import SingleTaskGP
from botorch.acquisition.cost_aware import InverseCostWeightedUtility
from botorch.acquisition.multi_objective.hypervolume_knowledge_gradient import (
    _get_hv_value_function,
    qHypervolumeKnowledgeGradient,
)
from botorch.optim.optimize import optimize_acqf
from botorch.models.cost import FixedCostModel
import pandas as pd
from copy import deepcopy
warnings.filterwarnings("ignore", category=BadInitialCandidatesWarning)
warnings.filterwarnings("ignore", category=InputDataWarning)
warnings.filterwarnings("ignore", category=RuntimeWarning)
warnings.filterwarnings("ignore", category=FutureWarning)
########################################################################################################################

## All functions
########################################################################################################################
def changeObjective(tensor, objectives):
    # Copy tensor
    newTensor = deepcopy(tensor)
    # Select columns
    for col,objective in enumerate(objectives):
        # Negate the values if the objective is minimized (BoTorch maximizes by default)
        if objective == 'Minimize':
            # Extract the selected column
            column = newTensor[:, col]
            # Negate the values in the column
            negated_column = -column
            # Update the tensor with the negated column
            newTensor[:, col] = negated_column

    return newTensor

def initialize_model(train_x_list, train_obj_list, bounds):
    # define models for objective and constraint
    train_x_list = [normalize(train_x, bounds) for train_x in train_x_list]
    # Create models
    models = []
    for x_data, y_data in zip(train_x_list, train_obj_list):
        model = SingleTaskGP(x_data, y_data,
                             covar_module=gpytorch.kernels.ScaleKernel(gpytorch.kernels.MaternKernel(nu=2.5, ard_num_dims=len(x_data[0]))),
                             likelihood=gpytorch.likelihoods.GaussianLikelihood(noise_constraint=gpytorch.constraints.Positive()),
                             outcome_transform=Standardize(m=1))
        # Append models
        models.append(model)
    # Setup model list
    model = ModelListGP(*models)
    mll = SumMarginalLogLikelihood(model.likelihood, model)
    return mll, model

def get_current_value(model, ref_point, bounds,):
    """Helper to get the hypervolume of the current hypervolume
    maximizing set.
    """
    curr_val_acqf = _get_hv_value_function(
        model=model,
        ref_point=ref_point,
        use_posterior_mean=True,
        objective=IdentityMCMultiOutputObjective(outcomes=[0, 1, 2, 3]),
    )
    _, current_value = optimize_acqf(
        acq_function=curr_val_acqf,
        bounds=bounds,
        q=NUM_PARETO,
        num_restarts=20,
        raw_samples=1024,
        return_best_only=True,
        options={"batch_limit": 6},
    )
    return current_value

def optimize_HVKG_and_get_obs_decoupled(model, ref_points):
    """Utility to initialize and optimize HVKG."""
    cost_aware_utility = InverseCostWeightedUtility(cost_model=cost_model)

    current_value = get_current_value(
        model=model,
        ref_point=torch.tensor(ref_points, **tkwargs),
        bounds=standard_bounds,
    )

    acq_func = qHypervolumeKnowledgeGradient(
        model=model,
        ref_point=torch.tensor(ref_points, **tkwargs),  # use known reference point
        num_fantasies=NUM_FANTASIES,
        num_pareto=NUM_PARETO,
        current_value=current_value,
        cost_aware_utility=cost_aware_utility,
        objective=IdentityMCMultiOutputObjective(outcomes=[0, 1, 2, 3]),
    )

    # optimize acquisition functions and get new observations
    objective_vals = []
    objective_candidates = []
    for objective_idx in objective_indices:
        # set evaluation index to only condition on one objective
        # this could be multiple objectives
        X_evaluation_mask = torch.zeros(BATCH_SIZE, len(objective_indices), dtype=torch.bool, device=standard_bounds.device,)
        for i in range(BATCH_SIZE):
            X_evaluation_mask[i, objective_idx] = 1
        acq_func.X_evaluation_mask = X_evaluation_mask
        candidates, vals = optimize_acqf(
            acq_function=acq_func,
            num_restarts=NUM_HVKG_RESTARTS,
            raw_samples=RAW_SAMPLES,
            bounds=standard_bounds,
            q=BATCH_SIZE,
            sequential=False,
            options={"batch_limit": 5},
        )
        objective_vals.append(vals.view(-1))
        objective_candidates.append(candidates)
    best_objective_index = torch.cat(objective_vals, dim=-1).argmax().item()
    eval_objective_indices = [best_objective_index]
    candidates = objective_candidates[best_objective_index]
    vals = objective_vals[best_objective_index]
    # observe new values
    new_x = unnormalize(candidates.detach(), bounds=bounds)
    return new_x, eval_objective_indices
########################################################################################################################

## Inputs and others
########################################################################################################################
# Path to training data file
pathResults = r'training_data.txt'

# Selected input variables for Bayesian optimization loop
inputs = ['input1', 'input2', 'input3', 'input4']

# Selected output variables for Bayesian optimization loop
outputs = ['output1', 'output2', 'output3', 'output4', 'output5']

# define the cost model
objective_costs = {0 : 1.0, 1 : 2.0, 2 : 2.0, 3 : 3.0, 4 : 1.0}
# objective_costs = {0 : 1.0, 1 : 2.0, 2 : 2.0, 3 : 3.0}

# Problem dimensions
dim_x = len(inputs)  # Input dimension
dim_y = len(outputs)  # Output dimension

# Output optimization objectives
objectives = ['Maximize', 'Maximize', 'Maximize', 'Minimize', 'None']

# Kwargs for torch values
tkwargs = {
    "dtype": torch.double,
    "device": torch.device("cuda" if torch.cuda.is_available() else "cpu"),
}

# Selected bounds for variables
bounds = torch.tensor([[120, 9, 0.6, 23.333333333],
                            [240, 21, 2.7, 46.666666667]], **tkwargs) # For MO only

# Normalized unit-cube bounds for acquisition optimization
standard_bounds = torch.tensor([[0.0] * dim_x, [1.0] * dim_x], **tkwargs)

# Selected reference points for objectives
ref_points = [4.2, 68, 0.43, 0]

# Bayesian optimization loop variables
INIT_SIZE = 100 # Number of randomly generated parallel calculations for initial training data
BATCH_SIZE = 19 # Number of parallel calculations for Bayesian optimization loop
NUM_RESTARTS = 10 # Number of restarts of the acquisition optimizer
RAW_SAMPLES = 256 # Number of samples for the acquisition optimizer
MC_SAMPLES = 128 # Number of samples for the Monte Carlo sampler
NUM_PARETO = 10
NUM_FANTASIES = 8
NUM_HVKG_RESTARTS = 1
########################################################################################################################

## Optimization loop
########################################################################################################################
# Define the cost model
objective_indices = list(objective_costs.keys())
objective_costs = {int(k): v for k, v in objective_costs.items()}
objective_costs_t = torch.tensor([objective_costs[k] for k in sorted(objective_costs.keys())], **tkwargs)
cost_model = FixedCostModel(fixed_cost=objective_costs_t)

# Get all data
data = pd.read_csv(pathResults, sep=' ', header=0)
# Get together input data
train_x = torch.tensor(data[inputs].values, **tkwargs)
# Load the output training data
train_obj = torch.tensor(data[outputs].values, **tkwargs)
# Negate minimized objectives - BoTorch maximizes by default
train_obj_C = changeObjective(train_obj, objectives)
# Create lists for input and output data
train_obj_C_list = list(train_obj_C.split(1, dim=-1))
train_x_list = [train_x] * len(train_obj_C_list)
# Calculate total cost of input data
total_cost = 0
cost_hvkg = cost_model(train_x).sum(dim=-1)
total_cost += cost_hvkg.sum().item()

# Initialize GP models on the training data
mll, model = initialize_model(train_x_list, train_obj_C_list, bounds)

# Fit the model
fit_gpytorch_mll(mll)

# generate candidates
new_x_hvkg, eval_objective_indices_hvkg = optimize_HVKG_and_get_obs_decoupled(model, ref_points)

print(new_x_hvkg)
print(eval_objective_indices_hvkg)

Stack trace/error message

Traceback (most recent call last):
  File "C:\Users\Uziv\Desktop\Botorch_algorithms\MC_BO_HVKG\HVKG_test_case.py", line 221, in <module>
    new_x_hvkg, eval_objective_indices_hvkg = optimize_HVKG_and_get_obs_decoupled(model, ref_points)
  File "C:\Users\Uziv\Desktop\Botorch_algorithms\MC_BO_HVKG\HVKG_test_case.py", line 119, in optimize_HVKG_and_get_obs_decoupled
    candidates, vals = optimize_acqf(
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\optim\optimize.py", line 563, in optimize_acqf
    return _optimize_acqf(opt_acqf_inputs)
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\optim\optimize.py", line 584, in _optimize_acqf
    return _optimize_acqf_batch(opt_inputs=opt_inputs)
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\optim\optimize.py", line 274, in _optimize_acqf_batch
    batch_initial_conditions = opt_inputs.get_ic_generator()(
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\optim\initializers.py", line 736, in gen_one_shot_hvkg_initial_conditions
    ics = gen_batch_initial_conditions(
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\optim\initializers.py", line 417, in gen_batch_initial_conditions
    Y_rnd_curr = acq_function(
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\utils\transforms.py", line 259, in decorated
    output = method(acqf, X, *args, **kwargs)
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\acquisition\multi_objective\hypervolume_knowledge_gradient.py", line 231, in forward
    fantasy_model = self.model.fantasize(
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\botorch\models\model.py", line 661, in fantasize
    sampler_i = sampler.samplers[i]
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\container.py", line 295, in __getitem__
    return self._modules[self._get_abs_string_index(idx)]
  File "C:\Users\Uziv\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\container.py", line 285, in _get_abs_string_index
    raise IndexError('index {} is out of range'.format(idx))
IndexError: index 4 is out of range

dme65 commented 9 months ago

@sdaulton can you help take a look at this?

sdaulton commented 9 months ago

Hi @vlad451101, this is an edge case we haven't considered (where the objectives are only a subset of the outcomes). Here is a fix for that: https://github.com/pytorch/botorch/pull/2160. Let me know if you have any issues.

Currently, fantasies are generated for all outcomes, even if only a subset of the outcomes are objectives. Hence, if you just want to optimize a subset of the outcomes, it would be more efficient to fit the model to only that subset; see the sketch below.
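For example (just a sketch reusing the variable names from your repro above; adjust to your setup):

# Fit GPs only on the outcomes that are actually objectives, so that
# fantasies are generated only for those outcomes.
objective_columns = [0, 1, 2, 3]  # outcomes used as objectives
train_obj_subset = train_obj_C[:, objective_columns]
mll, model = initialize_model(train_x, train_obj_subset)
fit_gpytorch_mll(mll)
# with every modeled outcome being an objective, the identity objective
# no longer needs an `outcomes` subset
objective = IdentityMCMultiOutputObjective()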

Regarding constraints, it is worth noting that KG-based methods are typically not used on constrained problems; there is not much work in the literature on KG with constraints.

vlad451101 commented 9 months ago

Hi @sdaulton, thank you very much, the fix is working very well.

The reason I want to use only part of the outputs as optimization targets is that it is much more convenient for me. Normally I work with even larger surrogate models, of which only a part are intended as objectives for the optimization.

I was also thinking of fitting the model with only the selected objectives. However, my original code is quite complex and I wanted to avoid extra unnecessary lines of code. That said, in this case I think it should not be too challenging to implement, roughly as sketched below.
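For reference, this is roughly what I have in mind, based on the variables from my repro above (just a sketch):

# Select only the outputs marked as optimization objectives
objective_columns = [i for i, obj in enumerate(objectives) if obj != 'None']
train_obj_sel = train_obj_C[:, objective_columns]
mll, model = initialize_model(train_x, train_obj_sel)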

As for HVKG, I didn't know that it is typically not used for constrained problems, but thank you for this information.