saitcakmak opened 3 months ago
Actually, in my (very limited) knowledge, isn't this how MOBO is supposed to work? If you look at the BoTorch documentation for MOBO, especially where the model is initialized, you find:
```python
def initialize_model(train_x, train_obj):
    # define models for objective and constraint
    train_x = normalize(train_x, problem.bounds)
    models = []
    for i in range(train_obj.shape[-1]):
        train_y = train_obj[..., i : i + 1]
        train_yvar = torch.full_like(train_y, NOISE_SE[i] ** 2)
        models.append(
            FixedNoiseGP(
                train_x, train_y, train_yvar, outcome_transform=Standardize(m=1)
            )
        )
    model = ModelListGP(*models)
    mll = SumMarginalLogLikelihood(model.likelihood, model)
    return mll, model
```
(In our case we are discussing `SingleTaskGP`s, but this does not change the nature of the problem, I think.) If this tracks, I would think that a `ModelListGP` containing two `SingleTaskGP`s would be the way to do MOBO as well, no?
Edit: This is the case at least for `qNParEGO`, for example, which uses Chebyshev scalarization, as far as I can see.
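To make the Chebyshev point concrete, here is a minimal plain-Python sketch of the augmented Chebyshev scalarization commonly used by qNParEGO-style methods (assuming maximization; the function name, `alpha` default, and exact form are illustrative, not BoTorch's internals verbatim):

```python
# Augmented Chebyshev scalarization: collapse multiple objective values
# into a single number using user-chosen positive weights.
def chebyshev_scalarization(y, weights, alpha=0.05):
    weighted = [w * yi for w, yi in zip(weights, y)]
    # the min term drives the search toward the chosen trade-off;
    # the small weighted-sum term keeps the scalarization strictly increasing
    return min(weighted) + alpha * sum(weighted)

print(chebyshev_scalarization([1.0, 2.0], [0.5, 0.5]))  # ≈ 0.575
```

The key difference from the bug discussed below is that here the weights are an explicit, user-visible input rather than something chosen silently.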
So, the part about constructing a multi-output surrogate model is correct. That should indeed happen. The issue is scalarizing the outputs from the model using an arbitrary sum. We do support `ScalarizedObjective`, which is the intentional way of doing this, where the user can specify the scalarization weights.
`qNParEGO` is also different here, since it is a multi-objective acquisition function that internally uses a scalarization. It is defined to work with multiple objectives, and the behavior is by design.
The issue is doing this silently, using arbitrary weights (well, they're just 1 for maximization and -1 for minimization), with acquisition functions that are not designed for multi-objective optimization.
The same issue happens with the legacy models as well. It is a problem with the way we extract the objective weights from the optimization config in `TorchModelBridge._get_transformed_model_gen_args`. This just converts the optimization config into an array of weights. The model has no idea whether those weights represent a scalarized objective or a multi-objective problem.
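A hedged sketch of the lossy conversion described here — the optimization config is reduced to a bare array of ±1 weights, so downstream code cannot distinguish scalarization intent from true multi-objective intent (`Objective` and `extract_objective_weights` are illustrative stand-ins, not the actual Ax implementations):

```python
from dataclasses import dataclass

@dataclass
class Objective:
    name: str
    minimize: bool

def extract_objective_weights(objectives):
    # +1 for maximization, -1 for minimization -- all other intent is lost
    return [(-1.0 if obj.minimize else 1.0) for obj in objectives]

weights = extract_objective_weights(
    [Objective("accuracy", minimize=False), Objective("latency", minimize=True)]
)
print(weights)  # [1.0, -1.0]
```

Note that the output is indistinguishable from the weights a user might have deliberately supplied for a scalarized objective, which is exactly the ambiguity at issue.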
For legacy single-objective models, these get passed to `_get_acquisition_function -> get_objective_weights_transform`, which treats them as scalarization weights and uses them to construct a `GenericMCObjective` (which hides the original weights in a local function).
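The "hides the weights in a local function" part can be illustrated with a plain-Python analogue of such a closure-based wrapper (names are illustrative, not the BoTorch implementation):

```python
def make_weighted_objective(weights):
    def objective(samples):
        # scalarize each sample as a weighted sum over the outputs
        return [sum(w * y for w, y in zip(weights, row)) for row in samples]
    return objective

obj = make_weighted_objective([1.0, -1.0])
print(obj([[2.0, 0.5]]))  # [1.5]
# the caller only sees a callable; the weight vector now lives inside
# the closure and is no longer inspectable from the outside
```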
For MBM with single-objective acquisition functions, these are passed through `Acquisition.get_botorch_objective_and_transform`, which uses them to construct an objective or a posterior transform, depending on whether there are constraints and on the type of acquisition function being used.
If the generation strategy uses MBM with a single-objective acquisition function on an MOO problem, the outputs are simply summed together in the acquisition function using a `ScalarizedPosteriorTransform`.

Discovered while investigating https://github.com/facebook/Ax/issues/2514
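To see what that summation amounts to, here is a plain-Python analogue of what a scalarized posterior transform with weights `[1., 1.]` does to the model's predicted means (the real transform also handles covariances; this is only a sketch):

```python
def scalarize_means(means, weights):
    # each row holds one candidate's predicted mean per objective;
    # with weights [1., 1.] the objectives are just added together,
    # silently turning an MOO problem into a single-objective one
    return [sum(w * m for w, m in zip(weights, row)) for row in means]

# two candidates, two objectives
means = [[0.25, 1.0], [0.5, 0.5]]
print(scalarize_means(means, [1.0, 1.0]))  # [1.25, 1.0]
```

A candidate that is mediocre on both objectives can thus outscore one that is excellent on one objective, even though neither ordering was ever requested by the user.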
Repro:

- Notebook for Meta employees: N5489742
- Set up the problem using `AxClient`. This runs fine and generates candidates.
- Investigate the arguments to the acquisition function. This will raise an exception; ignore it and check `kwargs`. This is a `ScalarizedPosteriorTransform` with weights `tensor([1., 1.], dtype=torch.float64)`.
- We can check the opt config to verify that this is not an experiment setup issue.
Expected behavior

We can't do MOO using a single-objective acquisition function. We should not be silently scalarizing the outputs; it should raise an informative error.
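One possible shape for that informative error, as a sketch: detect multiple objectives before constructing a single-objective acquisition function and fail loudly instead of silently summing outputs (`UnsupportedError`, the function name, and where such a check would live in the real codebase are all assumptions here):

```python
class UnsupportedError(Exception):
    pass

def check_single_objective(objective_weights, is_moo_acqf):
    # count objectives with a nonzero weight in the extracted weight array
    num_objectives = sum(1 for w in objective_weights if w != 0)
    if num_objectives > 1 and not is_moo_acqf:
        raise UnsupportedError(
            f"Received {num_objectives} objectives, but the acquisition "
            "function only supports a single objective. Use a "
            "multi-objective acquisition function, or specify the "
            "scalarization weights explicitly via a scalarized objective."
        )

try:
    check_single_objective([1.0, -1.0], is_moo_acqf=False)
except UnsupportedError as e:
    print(e)
```

The point is only that the failure should be explicit and actionable, rather than an arbitrary `[1., 1.]` scalarization the user never asked for.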