cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch

Is it possible to build hierarchical GP regression models using GPyTorch? #996

Open grishabhg opened 4 years ago

grishabhg commented 4 years ago

Hi,

I was wondering whether there is a straightforward way of building a hierarchical GP regression model as shown here? This already exists in GPy (see here and here).

If not, could someone suggest a way to do it?

Thanks in advance.

rhaps0dy commented 4 years ago

Hi! It's currently not possible, but it shouldn't be too hard to do. Looking at the notebook you linked, the sticking point is the Hierarchical kernel, which does not exist in gpytorch.

You would need to reimplement GPy's Hierarchical kernel for gpytorch. It is originally implemented here: https://github.com/SheffieldML/GPy/blob/devel/GPy/kern/src/independent_outputs.py#L129 . PyTorch computes gradients for you, so you only need the __init__, K and Kdiag functions.

Then you should do something like:

class HierarchicalKernel:
    pass

kern_upper = GPy.kern.Matern32(input_dim=1, variance=1.0, lengthscale=2.0, active_dims=[0], name='upper')
kern_lower = GPy.kern.Matern32(input_dim=1, variance=0.1, lengthscale=4.0, active_dims=[0], name='lower')
k_hierarchy = HierarchicalKernel(kernels=[kern_upper, kern_lower])

but substituting gpytorch.kernels for GPy.kern, and making the appropriate API changes.
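For example, a rough gpytorch counterpart of the construction above might look like the following (a sketch only: HierarchicalKernel is still the class that would need to be written; gpytorch's Matern-3/2 is MaternKernel(nu=1.5), and the variance/lengthscale hyperparameters are set after construction, via ScaleKernel and the lengthscale setter, rather than passed to the constructor):

import gpytorch

# hypothetical gpytorch counterpart of the GPy construction above
kern_upper = gpytorch.kernels.ScaleKernel(gpytorch.kernels.MaternKernel(nu=1.5, active_dims=[0]))
kern_lower = gpytorch.kernels.ScaleKernel(gpytorch.kernels.MaternKernel(nu=1.5, active_dims=[0]))
kern_upper.outputscale, kern_upper.base_kernel.lengthscale = 1.0, 2.0
kern_lower.outputscale, kern_lower.base_kernel.lengthscale = 0.1, 4.0
k_hierarchy = HierarchicalKernel(kernels=[kern_upper, kern_lower])  # the kernel to be implemented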

If you do get it working, maybe the main devs would like to add it to gpytorch!

gpleiss commented 4 years ago

@grishabhg Sorry for the slow reply. Mathematically, it seems like this kernel can be constructed in matrix form using Kronecker products:

k\bigl([f_1^1, \ldots, f_1^n, \ldots, f_r^1, \ldots, f_r^n]\bigr) = K_{\mathrm{lower}} \otimes \mathbf{1}_{r \times r} + K_{\mathrm{upper}} \otimes I_r

where \mathbf{1}_{r \times r} is the all-ones matrix of size r x r, I_r is the identity matrix of size r x r, and K_{\mathrm{lower}}, K_{\mathrm{upper}} are the Gram matrices of kern_lower and kern_upper on the inputs.
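Spelling that out for r = 2 replicates, the task-side factors are

\mathbf{1}_{2 \times 2} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},

so the kern_lower ⊗ ones term contributes to every pair of replicates (a shared component), while the kern_upper ⊗ eye term contributes only within each replicate (a replicate-specific component).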

This can be implemented pretty easily using GPyTorch's lazy tensors:

import torch
from gpytorch.kernels import Kernel
from gpytorch.lazy import KroneckerProductLazyTensor, SumLazyTensor

class HierarchicalKernel(Kernel):
    # ...
    def forward(self, time, task):
        # kern_lower ⊗ ones: component shared across all tasks;
        # kern_upper ⊗ eye: independent per-task component
        return SumLazyTensor(
            KroneckerProductLazyTensor(self.kern_lower(time), torch.ones(self.num_tasks, self.num_tasks)),
            KroneckerProductLazyTensor(self.kern_upper(time), torch.eye(self.num_tasks)),
        )

Then in the model forward function, return a MultitaskMultivariateNormal similar to the multitask examples.
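For concreteness, a minimal sketch of such a model (hypothetical: it assumes the HierarchicalKernel above, with an assumed constructor taking kern_lower, kern_upper and num_tasks, and a forward that takes just the inputs x):

import gpytorch

class HierarchicalGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, num_tasks):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.MultitaskMean(
            gpytorch.means.ConstantMean(), num_tasks=num_tasks
        )
        # The hierarchical kernel already produces the full (n * num_tasks) covariance,
        # so it is used directly rather than wrapped in a MultitaskKernel. It would also
        # need to override Kernel.num_outputs_per_input to return num_tasks (as
        # MultitaskKernel does), so that gpytorch's shape checks expect the larger matrix.
        self.covar_module = HierarchicalKernel(
            kern_lower=gpytorch.kernels.RBFKernel(),
            kern_upper=gpytorch.kernels.RBFKernel(),
            num_tasks=num_tasks,
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)

likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=6)
model = HierarchicalGPModel(train_x, train_y, likelihood, num_tasks=6)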

CecileLeSueur commented 2 years ago

Hi,

Like @grishabhg, I'm also interested in building the hierarchical GP regression model presented in the same paper. I'm not very familiar with GPyTorch, so I would be very grateful for your help!

I have 6 replicates with 10 observations each: train_x.shape => torch.Size([10]), train_y.shape => torch.Size([10, 6]).

Following the last post of @gpleiss and the Multitask GP Regression notebook, I wrote the model below. I'm now trying to optimise the hyperparameters with the Adam optimiser, but I get the following error:

RuntimeError: The expected shape of the kernel was torch.Size([60, 60]), but got torch.Size([360, 360]). This is likely a bug in GPyTorch.

Here is the code:

import torch
import gpytorch

class HierarchicalGPModel(gpytorch.kernels.Kernel):

    # We will register the parameter when initializing the kernel
    def __init__(self, NTask, **kwargs):
        super(HierarchicalGPModel, self).__init__()
        self.kern_lower = gpytorch.kernels.RBFKernel()
        self.kern_upper = gpytorch.kernels.RBFKernel()
        self.NTask = NTask

    # this is the kernel function
    def forward(self, time, task):
        # import pdb; pdb.set_trace()
        return gpytorch.lazy.SumLazyTensor(
            gpytorch.lazy.KroneckerProductLazyTensor(self.kern_lower(task), torch.ones(self.NTask, self.NTask)),
            gpytorch.lazy.KroneckerProductLazyTensor(self.kern_upper(task), torch.eye(self.NTask))
        )


class MultitaskGPModel(gpytorch.models.ExactGP):

    def __init__(self, train_x, train_y, likelihood, NTask):
        super(MultitaskGPModel, self).__init__(train_x, train_y, likelihood)

        self.mean_module = gpytorch.means.MultitaskMean(
            gpytorch.means.ConstantMean(), num_tasks=NTask
        )

        self.covar_module = gpytorch.kernels.MultitaskKernel(
            HierarchicalGPModel(NTask=NTask), num_tasks=NTask, rank=1
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)


likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=6)
model = MultitaskGPModel(train_x, train_y, likelihood, 6)

# Find optimal model hyperparameters
model.train()
likelihood.train()
training_iterations = 50

# Use the Adam optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)  # Includes GaussianLikelihood parameters

# "Loss" for GPs - the marginal log likelihood
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

for i in range(training_iterations):
    optimizer.zero_grad()
    output = model(train_x.float())
    loss = -mll(output, train_y.float())
    loss.backward()
    print('Iter %d/%d - Loss: %.3f' % (i + 1, training_iterations, loss.item()))
    optimizer.step()

A HUGE thank you for your help!

rlloretb commented 1 year ago

Any updates on this? How can I help?

gpleiss commented 1 year ago

@rlloretb if you want to construct an example notebook, that would be much appreciated.

CecileLeSueur commented 1 year ago

Hi @rlloretb, thank you for the comment! I'm currently using the multitask framework as suggested earlier by @gpleiss which I finally got to work. However, I doubt that my implementation is the most efficient, especially regarding the predictions of the different levels of the Hierarchical model. The example notebook would be very nice, and I'm happy to help to construct it (if you are interested of course, e.g. by discussing/sharing what I've been doing so far) :) Thank you!