cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch

Computer model calibration with GP emulator implementation #1686

Open kejzlarv opened 3 years ago

kejzlarv commented 3 years ago

Hi all! I am relatively new to the PyTorch and GPyTorch universe, and I am trying to implement the original computer model calibration framework by Kennedy and O'Hagan (2001): https://rss.onlinelibrary.wiley.com/doi/epdf/10.1111/1467-9868.00294.

In a nutshell, I have the following model for the data

$$y_i = \eta(x_i, \theta) + \delta(x_i) + \epsilon_i, \qquad \epsilon_i \sim \mathcal{N}(0, \sigma^2),$$

where $\theta$ is an unknown (calibration) parameter common to all $i$, and I consider GP priors for the unknown functions $\eta$ (computer model) and $\delta$ (discrepancy function).

Besides the observations $y = (y_1, \dots, y_n)$, the dataset also includes a set of model evaluations $z_j = \eta(x^*_j, t_j)$ over a grid of known inputs $(x^*_j, t_j)$. Overall, the complete dataset under this framework is $d = (y, z)$, with the marginal likelihood

$$d \mid \theta, \phi \sim \mathcal{N}\big(m(\theta), K(\theta)\big),$$

where $m(\theta)$ and $K(\theta)$ are the mean vector and covariance matrix given by the GPs, and $\phi$ are the hyperparameters of the kernel and mean functions.
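Spelled out for concreteness (assuming noise-free model runs $z$ for simplicity), the joint mean and covariance have the usual KOH block structure

$$m(\theta) = \begin{pmatrix} m_\eta(x, \theta) + m_\delta(x) \\ m_\eta(x^*, t) \end{pmatrix}, \qquad K(\theta) = \begin{pmatrix} K_\eta\big((x,\theta),(x,\theta)\big) + K_\delta(x,x) + \sigma^2 I & K_\eta\big((x,\theta),(x^*,t)\big) \\ K_\eta\big((x^*,t),(x,\theta)\big) & K_\eta\big((x^*,t),(x^*,t)\big) \end{pmatrix},$$

where $K_\eta$ and $K_\delta$ are the kernels of the two GP priors.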

I have a couple of questions:

  1. How do I jointly specify a GP model for the z's and the y's?
  2. How do I include the unknown parameter $\theta$ as part of the input?
  3. Is GPyTorch the way to go (i.e., flexible enough to implement this)? Or will I be more successful trying to implement this with PyTorch directly?

I would very much appreciate any advice or opinion! Thanks!

Vojta

wjmaddox commented 3 years ago

Hi, a model like this can definitely be implemented in GPyTorch by modifying an exact GP and making the kernel additive, with each component restricted to specific input dimensions. Something like the following snippet would probably work.

Here, I'm using the additivity of Gaussian distributions to write the model as a single GP with an additive kernel:

```python
import gpytorch

# We will use the simplest form of GP model, exact inference
# (train_x and train_y are assumed to be defined elsewhere)
class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        # A constant mean does not depend on the inputs, so a single module
        # covers the sum of the two GP components
        self.mean_module = gpytorch.means.ConstantMean()
        # Discrepancy delta(x): depends on the time dimension only
        self.covar_module_time = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(active_dims=[0])
        )
        # Emulator eta(x, theta): depends on both input dimensions
        self.covar_module_both = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(ard_num_dims=2)
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module_time(x) + self.covar_module_both(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

# initialize likelihood and model
# (the first input dimension is assumed to be time and the second theta)
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)
```
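For completeness, training then follows the standard GPyTorch exact-GP recipe (nothing here is specific to the calibration setup; train_x is assumed to be an n x 2 tensor whose columns are the time input and theta/t, with train_y the stacked observations):

```python
import torch

# standard exact-GP training loop: maximize the marginal log likelihood
model.train()
likelihood.train()

optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

for i in range(200):
    optimizer.zero_grad()
    output = model(train_x)
    loss = -mll(output, train_y)
    loss.backward()
    optimizer.step()
```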

You may also want to look at the BoTorch package, which has pre-built multi-fidelity models along the lines of the one you describe. For example, this one.

As far as I can tell, you could also implement models of this sort as a form of deep GP; see the deep GP example in the docs.

You could probably optimize over theta jointly by registering it as a parameter when you initialize the model, but I don't know how you would handle predictions at test time (would theta be fixed to its learned value then?).
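Here is a minimal sketch of that idea (the class name and theta attribute are my own, not a GPyTorch API; for simplicity it fits the GP to the field observations alone, so every row gets the same learned theta appended; handling the joint (y, z) dataset would additionally require masking which rows receive theta versus their known simulator inputs t):

```python
import torch
import gpytorch

class CalibrationGP(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        # train_x holds only the observable input (an n x 1 column of times);
        # theta is registered as a learnable parameter rather than observed
        super().__init__(train_x, train_y, likelihood)
        self.register_parameter("theta", torch.nn.Parameter(torch.zeros(1)))
        self.mean_module = gpytorch.means.ConstantMean()
        # discrepancy delta(x): time dimension only
        self.covar_module_time = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(active_dims=[0])
        )
        # emulator eta(x, theta): both input dimensions
        self.covar_module_both = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(ard_num_dims=2)
        )

    def forward(self, x):
        # append the current value of theta as an extra input column
        theta_col = self.theta.expand(x.shape[0], 1)
        x_aug = torch.cat([x, theta_col], dim=-1)
        mean_x = self.mean_module(x_aug)
        covar_x = self.covar_module_time(x_aug) + self.covar_module_both(x_aug)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
```

Because theta lives in model.parameters(), the marginal-likelihood training loop above optimizes it alongside the kernel hyperparameters, and at test time forward simply appends whatever value was learned.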