pytorch / botorch

Bayesian optimization in PyTorch
https://botorch.org/
MIT License

Pairwise comparison with fantasies #733

Open fraiori0 opened 3 years ago

fraiori0 commented 3 years ago

Dear BoTorch developers,

I'm having some issues using certain acquisition functions with the PairwiseGP model. Specifically, the problem arises with acquisition functions that rely on fantasies (namely Knowledge Gradient, or Max-value Entropy Search with q > 1).

Issue description

The error is thrown by the condition_on_observations() method of PairwiseGP, which is called by the fantasize() method of the Model class. It appears when fantasize() passes condition_on_observations() a Y value that does not consist of pairwise comparisons.

We tried to address this by adding to PairwiseGP a method that transforms observations into pairwise comparisons (similar to the generate_comparisons() function used in the tutorial on preference BO). This method takes the output tensor from fantasize() as input and outputs a tensor of the same shape (except for the last dimension) containing the comparisons between the fantasized points. However, other errors related to improper tensor shapes are then raised during the call to set_train_data(), at the end of condition_on_observations().
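
As a rough illustration of the transformation described above, here is a minimal sketch that turns the utility values of the points in a single batch into comparisons. Plain Python lists stand in for tensors, and `values_to_comparisons` is a hypothetical helper, not part of BoTorch. It assumes the convention from the docstring further down, where a pair (i, j) means point i is preferred over point j.

```python
from itertools import combinations


def values_to_comparisons(values):
    """Return [preferred, other] index pairs for every pair of entries.

    `values` holds the utility values of the points in ONE batch, so the
    returned indices always lie in range(len(values)).
    (Hypothetical helper for illustration only.)
    """
    pairs = []
    for i, j in combinations(range(len(values)), 2):
        # Put the index with the larger utility first.
        pairs.append([i, j] if values[i] >= values[j] else [j, i])
    return pairs


fantasy_values = [0.3, 1.2, -0.5]  # made-up utilities for q = 3 points
print(values_to_comparisons(fantasy_values))  # [[1, 0], [0, 2], [1, 2]]
print(values_to_comparisons([0.7]))           # [] (a single point has no pairs)
```

Note that the indices are taken over the points within one batch, and that a batch with a single point yields no comparison at all.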

Do you have any suggestions on the right way to address this issue?

Best, Francesco

Balandat commented 3 years ago

Hi @fraiori0, glad you're interested in using the pairwise GP model.

I am not sure to what extent we have tested the pairwise GP model for fantasizing and using it with KG/MES-like acquisition functions (cc. @ItsMrLin, @danielrjiang, @sdsingh)

That said, the approach of generating the pairwise comparisons and then feeding those into the model should be the right one. Could you share a simple reproducible example that causes the other errors (such as improper shapes)? We can take a look at what would be needed to fix this.

fraiori0 commented 3 years ago

Dear @Balandat, thanks for your comment.

Here is the code for a reproducible example: both the modified methods and a script that triggers the error (part of the code comes from the tutorial). The error is a RuntimeError caused by an out-of-bounds index, raised when new_model.set_train_data() calls sub_D.scatter_(1, comp_view[i, :, [0]], 1).
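
To make the failure mode concrete, here is a plain-Python sketch of what a scatter of this kind does, under the assumption that the matrix being filled has one row per comparison and one column per data point, with +1 for the preferred point and -1 for the other one. `build_comparison_matrix` is a hypothetical stand-in, not the library's implementation.

```python
def build_comparison_matrix(comparisons, n_points):
    """Build an m x n_points matrix from m [preferred, other] index pairs.

    Row k gets +1 in the column of the preferred point and -1 in the
    column of the other point. (Hypothetical helper for illustration.)
    """
    matrix = [[0] * n_points for _ in comparisons]
    for row, (preferred, other) in zip(matrix, comparisons):
        row[preferred] = 1   # IndexError if preferred >= n_points
        row[other] = -1      # IndexError if other >= n_points
    return matrix


print(build_comparison_matrix([[1, 0], [2, 1]], n_points=3))
# [[-1, 1, 0], [0, -1, 1]]

try:
    build_comparison_matrix([[5, 0]], n_points=3)
except IndexError as err:
    print("out-of-bounds comparison index:", err)
```

Any comparison index that is not smaller than the number of data points has no column to write to, which is the list-based analogue of the out-of-bounds error reported here.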

Modified methods to be inserted in the PairwiseGP class, contained in the pairwise_gp.py file of the library.

    def generate_comparisons(self, y):
        """  Create pairwise comparisons with noise """
        # Input y has shape [(fantasies,) raw_samples, points per batch, dim(utility(x))]
        # Generate all possible pairs of elements in y (over the raw_samples dimension)
        all_pairs = np.array(list(combinations(range(y.shape[-3]), 2)))
        # Select N=raw_samples combinations, use same pairs for each fantasy
        comp_pairs = all_pairs[np.random.choice(range(len(all_pairs)), y.shape[-3], replace=False)]
        y_comp = torch.from_numpy(comp_pairs).expand(*y.shape[:-3], y.shape[-2], *comp_pairs.shape).transpose(-2,-3)
        # Switch pairs indexes if needed
        c0 = y[..., comp_pairs[...,0], :, :]
        c1 = y[..., comp_pairs[...,0], :, :]
        reverse_comp = (c0 < c1).numpy()
        y_comp[reverse_comp[...,0]] = torch.flip(y_comp[reverse_comp[...,0]], (-1,))
        y_comp = torch.tensor(y_comp).long()
        return y_comp

    def condition_on_observations(self, X: Tensor, Y: Tensor, **kwargs: Any) -> Model:
        r"""Condition the model on new observations.

        Note that unlike other BoTorch models, PairwiseGP requires Y to be
        pairwise comparisons

        Args:
            X: A `batch_shape x n x d` dimension tensor X
            Y: A tensor of size `batch_shape x m x 2`. (i, j) means
                f_i is preferred over f_j

        Returns:
            A (deepcopied) `Model` object of the same type, representing the
            original model conditioned on the new observations `(X, Y)`.
        """
        new_model = deepcopy(self)

        if self._has_no_data():
            # If the model previously has no data, set X and Y as the data directly
            new_model.set_train_data(X, Y, update_model=True)
        else:
            # Can only condition on pairwise comparisons instead of the directly
            # observed values. Raise a RuntimeError if Y is not a tensor presenting
            # pairwise comparisons
            if Y.shape[-1] == 1:
                # If the model has 1-dim output, generate pairwise comparisons
                Y = self.generate_comparisons(Y)
            if Y.dtype in (float32, float64) or Y.shape[-1] != 2:
                raise RuntimeError(
                    "Conditioning on non-pairwise comparison observations."
                )
            Y_new_batch_shape = Y.shape[:-2]
            new_datapoints = self.datapoints.expand(
                Y_new_batch_shape + self.datapoints.shape[-2:]
            )
            new_comparisons = self.comparisons.expand(
                Y_new_batch_shape + self.comparisons.shape[-2:]
            )

            # Reshape X since Y may have additional batch dim. from fantasy models
            X = X.expand(Y_new_batch_shape + X.shape[-2:])

            new_datapoints = torch.cat((new_datapoints, X.to(new_datapoints)), dim=-2)

            shifted_comp = Y.to(new_comparisons) + self.n
            new_comparisons = torch.cat((new_comparisons, shifted_comp), dim=-2)

            # TODO: be smart about how we can update covar matrix here
            new_model.set_train_data(new_datapoints, new_comparisons, update_model=True)
        return new_model

Python script that tries to use pairwise_gp with the KG acquisition function

import numpy as np
import torch
from botorch.test_functions import Shekel
from itertools import combinations

from botorch.models.pairwise_gp import PairwiseGP, PairwiseLaplaceMarginalLogLikelihood
from botorch.fit import fit_gpytorch_model

from botorch.acquisition.knowledge_gradient import qKnowledgeGradient
from botorch.acquisition import PosteriorMean
from botorch.optim import optimize_acqf

def init_and_fit_model(X, comp, state_dict=None):
    """ Model fitting helper function """
    model = PairwiseGP(X, comp)
    mll = PairwiseLaplaceMarginalLogLikelihood(model)
    fit_gpytorch_model(mll)
    return mll, model

def generate_data(n, xmin, xmax, noise=0.1, dim=4):
    """ Generate data X and y """
    utility = Shekel(m=10, noise_std=noise)
    X = xmin + (xmax-xmin)*torch.rand(n, dim, dtype=torch.float64)
    y = utility(X)
    return X, y

def generate_comparisons(y, n_comp, replace=False, use_all=False):
    """  Create pairwise comparisons with noise """
    # generate all possible pairs of elements in y
    all_pairs = np.array(list(combinations(range(y.shape[0]), 2)))
    # randomly select n_comp pairs from all_pairs, use every possible pairs if use_all=True
    if use_all:
        comp_pairs=all_pairs
    else:
        comp_pairs = all_pairs[np.random.choice(range(len(all_pairs)), n_comp, replace=replace)]
    c0 = y[comp_pairs[:, 0]]
    c1 = y[comp_pairs[:, 1]]
    reverse_comp = (c0 < c1).numpy()
    comp_pairs[reverse_comp, :] = np.flip(comp_pairs[reverse_comp, :], 1)
    comp_pairs = torch.tensor(comp_pairs).long()
    return comp_pairs

noise = 0.05
xmin = torch.tensor([0.0, 0.0, 0.0, 0.0])
xmax = torch.tensor([10.0, 10.0, 10.0, 10.0])
bounds = torch.stack([xmin, xmax])
# Utility function with observation noise (noise_std=noise)
utility = Shekel(m=10, noise_std=noise)

# Create initial data
# NOTE: `q` is not defined in the original post; 10 initial points is an assumption
q = 10
init_X, init_y = generate_data(q, xmin=xmin, xmax=xmax, noise=noise)
comparisons = generate_comparisons(init_y, n_comp=0, use_all=True)

# Fit model on initial data
data = (init_X, comparisons)
_, model = init_and_fit_model(init_X, comparisons)

acq_func = qKnowledgeGradient(
    model=model,
    num_fantasies=64
)
# Optimize and get new observation --> THIS PART RAISES THE ERROR
next_X, acq_value = optimize_acqf(
    acq_function=acq_func, 
    bounds=bounds,
    q=1,
    num_restarts=4,
    raw_samples=128,
)

Balandat commented 3 years ago

Hi @fraiori0, sorry this fell off the radar somehow. I am not particularly familiar with this part of the code, but it looks like the comparisons that your code generates have some index issues. Specifically, when I run your code, the comparisons that show up in set_train_data have indices up to 137, which is much larger than the number of data points. This is what leads to the error you're seeing. Note that fantasize adds another batch dimension, which may mess things up here.
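
A quick way to catch this kind of problem early is to check the shifted indices before they reach set_train_data. The sketch below uses plain Python lists and a hypothetical helper, `shift_and_check`; the numbers in the failing call (10 existing points, a local index of 127) are illustrative values chosen so that the shifted index equals the 137 mentioned above, not values confirmed from the actual run.

```python
def shift_and_check(local_comparisons, n_existing, n_new):
    """Shift batch-local comparison indices past the existing data points
    and verify that every index refers to an actual data point.

    (Hypothetical helper for illustration, not part of BoTorch.)
    """
    shifted = [[i + n_existing, j + n_existing] for i, j in local_comparisons]
    n_total = n_existing + n_new
    bad = [pair for pair in shifted if not all(0 <= k < n_total for k in pair)]
    if bad:
        raise ValueError(
            f"comparison indices {bad} out of range for {n_total} data points"
        )
    return shifted


# Indices over the 3 new points of one batch: valid after shifting.
print(shift_and_check([[1, 0], [0, 2]], n_existing=10, n_new=3))
# [[11, 10], [10, 12]]

# An index taken over a different, larger dimension: rejected.
try:
    shift_and_check([[127, 5]], n_existing=10, n_new=1)
except ValueError as err:
    print(err)
```

The first call passes because the local indices range over the new points only; the second fails because the pair was indexed over something other than the points being added.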