cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch

Sparse MultiTask Prediction #1063

Open Evan1578 opened 4 years ago

Evan1578 commented 4 years ago

Hi, I have been trying to implement a sparse gpytorch model with multitask prediction. I am unsure if this is supported by gpytorch (the sparse gpytorch guide did not seem to mention multitask prediction capabilities). Is this kind of model supported? If so, how would I go about implementing it? I have tried constructing a base multitask covariance module and then feeding that into the Inducing Point Kernel, but I am getting errors pretty far down the call stack with this approach.
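For concreteness, the kind of construction I tried looks roughly like this (a simplified sketch with illustrative dimensions and kernel choices, not my actual code):

```python
import torch
import gpytorch

class SparseMultitaskGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, num_tasks=28):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.MultitaskMean(
            gpytorch.means.ConstantMean(), num_tasks=num_tasks
        )
        # Base multitask covariance module ...
        base_covar = gpytorch.kernels.MultitaskKernel(
            gpytorch.kernels.RBFKernel(), num_tasks=num_tasks, rank=1
        )
        # ... fed into the sparse (SGPR) inducing point kernel
        self.covar_module = gpytorch.kernels.InducingPointKernel(
            base_covar, inducing_points=train_x[:50].clone(), likelihood=likelihood
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)
```

This is the model whose loss computation produces the errors further down the call stack.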

gpleiss commented 4 years ago

As far as I know, this should be supported in theory. I'm pretty sure it works with GridInterpolationKernel, and I'm not sure why it wouldn't work with InducingPointKernel.

For now you can try the variational multitask example. Can you post the stack trace (and ideally also an example notebook to reproduce the errors)?
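For reference, the variational multitask setup looks roughly like this (a sketch based on the SVGP multitask tutorial; the class names, in particular IndependentMultitaskVariationalStrategy, assume a recent GPyTorch release):

```python
import torch
import gpytorch

class MultitaskSVGP(gpytorch.models.ApproximateGP):
    def __init__(self, inducing_points, num_tasks):
        # inducing_points: (num_tasks, num_inducing, input_dim) -- one latent GP per task
        variational_distribution = gpytorch.variational.CholeskyVariationalDistribution(
            inducing_points.size(-2), batch_shape=torch.Size([num_tasks])
        )
        variational_strategy = gpytorch.variational.IndependentMultitaskVariationalStrategy(
            gpytorch.variational.VariationalStrategy(
                self, inducing_points, variational_distribution, learn_inducing_locations=True
            ),
            num_tasks=num_tasks,
        )
        super().__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean(batch_shape=torch.Size([num_tasks]))
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(batch_shape=torch.Size([num_tasks])),
            batch_shape=torch.Size([num_tasks]),
        )

    def forward(self, x):
        # The strategy turns this batch of independent GPs into a MultitaskMultivariateNormal
        return gpytorch.distributions.MultivariateNormal(self.mean_module(x), self.covar_module(x))

# Usage sketch (train_x assumed (N, D), train_y assumed (N, num_tasks)):
# model = MultitaskSVGP(torch.randn(num_tasks, 32, D), num_tasks)
# likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=num_tasks)
# mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=train_y.size(0))
```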

Evan1578 commented 4 years ago

Thanks for the variational multitask tip, I am new to Gaussian processes and did not think of that. The stack trace is:

```
Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.1\helpers\pydev\pydevd.py", line 1741, in <module>
    main()
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.1\helpers\pydev\pydevd.py", line 1735, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.1\helpers\pydev\pydevd.py", line 1135, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.1\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/Evan Crafts/PycharmProjects/Gaussian_Process/adversarial-navigation/GoalDefender/others/code_evan/testscript.py", line 4, in <module>
    model.trainmodels('sparse')
  File "C:\Users\Evan Crafts\PycharmProjects\Gaussian_Process\adversarial-navigation\GoalDefender\MotionPredictor\GaussianProcessMotionPredictor\__init__.py", line 201, in trainmodels
    train()
  File "C:\Users\Evan Crafts\PycharmProjects\Gaussian_Process\adversarial-navigation\GoalDefender\MotionPredictor\GaussianProcessMotionPredictor\__init__.py", line 194, in train
    loss = -mll(output, y_train)
  File "C:\Users\Evan Crafts\PycharmProjects\Gaussian_Process\venv\lib\site-packages\gpytorch\module.py", line 24, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "C:\Users\Evan Crafts\PycharmProjects\Gaussian_Process\venv\lib\site-packages\gpytorch\mlls\exact_marginal_log_likelihood.py", line 50, in forward
    output = self.likelihood(function_dist, *params)
  File "C:\Users\Evan Crafts\PycharmProjects\Gaussian_Process\venv\lib\site-packages\gpytorch\likelihoods\likelihood.py", line 64, in __call__
    return self.marginal(input, *args, **kwargs)
  File "C:\Users\Evan Crafts\PycharmProjects\Gaussian_Process\venv\lib\site-packages\gpytorch\likelihoods\gaussian_likelihood.py", line 76, in marginal
    full_covar = covar + noise_covar
  File "C:\Users\Evan Crafts\PycharmProjects\Gaussian_Process\venv\lib\site-packages\gpytorch\lazy\lazy_tensor.py", line 1626, in __add__
    return AddedDiagLazyTensor(self, other)
  File "C:\Users\Evan Crafts\PycharmProjects\Gaussian_Process\venv\lib\site-packages\gpytorch\lazy\added_diag_lazy_tensor.py", line 27, in __init__
    broadcasting._mul_broadcast_shape(lazy_tensors[0].shape, lazy_tensors[1].shape)
  File "C:\Users\Evan Crafts\PycharmProjects\Gaussian_Process\venv\lib\site-packages\gpytorch\utils\broadcasting.py", line 20, in _mul_broadcast_shape
    raise RuntimeError("Shapes are not broadcastable for mul operation")
RuntimeError: Shapes are not broadcastable for mul operation
```

Evan1578 commented 4 years ago

The error seems to be a problem with computing the loss. The output of the model is a multivariate normal object with torch.Size([250, 28]), and the corresponding y label is a 250 × 28 tensor, also torch.Size([250, 28]).

gpleiss commented 4 years ago

Can you also include a code example (or, preferably, a Jupyter notebook) that produces this error?

xanderladd commented 4 years ago

I am able to reproduce this if I set up a model with a multitask mean but not a multitask kernel. The error is in computing the loss, but more specifically it seems that the covariance noise shape does not match the covariance shape. I am a bit new to this... is a multitask kernel required when using a multitask mean? Could this be fixed by overriding the covariance noise, similar to how the multitask kernel overrides the base kernel?

Here is the model __init__. Basically everything else is the same as https://docs.gpytorch.ai/en/v1.1.1/examples/03_Multitask_Exact_GPs/Multitask_GP_Regression.html, but I'd be happy to provide more info if needed.

[Screenshot: model __init__ (Screen Shot 2020-07-31 at 4.46.55 PM)]

If I uncomment the multitask covariance, it works.
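In case the screenshot does not render, a minimal reconstruction of the kind of __init__ I mean (illustrative, not my exact code; dimensions and kernel choice are placeholders):

```python
import torch
import gpytorch

class MultitaskGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, num_tasks=4):
        super().__init__(train_x, train_y, likelihood)
        # Multitask mean: produces an (n, num_tasks) mean
        self.mean_module = gpytorch.means.MultitaskMean(
            gpytorch.means.ConstantMean(), num_tasks=num_tasks
        )
        # Single-task kernel: produces an (n, n) covariance, so the multitask
        # noise covariance cannot broadcast against it -> the RuntimeError above
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
        # Uncommenting this (and removing the line above) makes it work:
        # self.covar_module = gpytorch.kernels.MultitaskKernel(
        #     gpytorch.kernels.RBFKernel(), num_tasks=num_tasks, rank=1
        # )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)
```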

gpleiss commented 4 years ago

Yes, the multitask kernel is required. The MultitaskMean and MultitaskKernel objects make sure that everything is shaped appropriately for multiple tasks.
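To make the shape mismatch concrete, here is a small check (the input dimension is arbitrary; n and the task count match the sizes reported above):

```python
import torch
import gpytorch

n, num_tasks = 250, 28
x = torch.randn(n, 3)  # 3 input dimensions, chosen arbitrarily for this check

single_task_kernel = gpytorch.kernels.RBFKernel()
multitask_kernel = gpytorch.kernels.MultitaskKernel(
    gpytorch.kernels.RBFKernel(), num_tasks=num_tasks, rank=1
)

print(single_task_kernel(x).shape)  # torch.Size([250, 250])
print(multitask_kernel(x).shape)    # torch.Size([7000, 7000]) == (n * num_tasks) x (n * num_tasks)
# MultitaskGaussianLikelihood adds an (n * num_tasks) x (n * num_tasks) noise term,
# which only broadcasts against the multitask covariance, not the single-task one.
```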