Closed Rashfu closed 9 months ago
These two are essentially the same model. In the Batch GP Regression case, the model structure is just a single-output model with an additional batch dimension that corresponds to the different functions you model. In the Batch Independent Multioutput GP, the model has explicit (trailing) output dimensions; these are realized by the `MultitaskMultivariateNormal` that is used in the model (hence you need the `MultitaskGaussianLikelihood`). The main thing to note here is that the resulting covariance across the outputs is block-diagonal, i.e. the correlations between outputs are not modeled. For that you need to use an actual Multitask GP.
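To make the distinction concrete, here is a sketch of the two covariance structures. The symbols ($f_t$ for the function behind output $t$, $k_t$ for its kernel, $B$ for a task covariance matrix) are notation introduced here, and the second line assumes the Multitask GP uses the common intrinsic-coregionalization form, which may not match every multitask kernel:

```latex
% Batch independent multioutput GP: each output t has its own kernel k_t,
% and different outputs are uncorrelated (block-diagonal across outputs).
\operatorname{Cov}\big(f_s(x),\, f_t(x')\big) = \delta_{st}\, k_t(x, x')

% Multitask GP (assumed coregionalization form): a shared kernel k is scaled
% by a positive semi-definite task covariance B, so outputs can be correlated.
\operatorname{Cov}\big(f_s(x),\, f_t(x')\big) = B_{st}\, k(x, x')
```

With $B$ diagonal the second form has no cross-output correlation either, which is why the batch models are described as treating outcomes independently.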
I understand that the resulting covariance across the outputs is block-diagonal. But I visualized the covariance of output, and it doesn't appear to be a block diagonal matrix. There are still non-zero values in other positions.
The code is below, with `train_x` of shape `[300, 2]` and `train_y` of shape `[300, 3]`:
import torch
import gpytorch

class BatchIndependentMultitaskGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        # One mean and one kernel per output, via a batch dimension of size 3
        self.mean_module = gpytorch.means.ConstantMean(batch_shape=torch.Size([3]))
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(batch_shape=torch.Size([3])),
            batch_shape=torch.Size([3])
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        batch_mvn = gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
        # Convert the batch of 3 independent MVNs into a single multitask MVN
        multitask_mvn = gpytorch.distributions.MultitaskMultivariateNormal.from_batch_mvn(batch_mvn)
        return multitask_mvn

likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=3)
model = BatchIndependentMultitaskGPModel(train_x, train_y, likelihood)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

model.train()
likelihood.train()
training_iterations = 100
for i in range(training_iterations):
    optimizer.zero_grad()
    output = model(train_x)
    loss = -mll(output, train_y)
    loss.backward()
    optimizer.step()  # apply the parameter update
The covariance of `output = model(train_x)` looks like this:
[image: visualization of the covariance matrix of `output`]
> But I visualized the covariance of output, and it doesn't appear to be a block diagonal matrix. There are still non-zero values in other positions.
The `MultitaskMultivariateNormal` has an `interleaved` setting: https://github.com/cornellius-gp/gpytorch/blob/981edd83a671e8dca9d67c91b188702354884a34/gpytorch/distributions/multitask_multivariate_normal.py#L27-L29
So you'll have to reshape the covariance matrix to see the block-diagonal structure w.r.t. the outputs. See here: https://github.com/cornellius-gp/gpytorch/blob/981edd83a671e8dca9d67c91b188702354884a34/gpytorch/distributions/multitask_multivariate_normal.py#L62-L68
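The effect of the interleaved ordering can be illustrated without GPyTorch. The following is a minimal pure-Python sketch, assuming that in the interleaved layout the row/column index is `point * num_tasks + task`; the helper names (`rbf`, `is_block_diagonal`), the toy inputs, and the lengthscales are all made up for illustration:

```python
import math

def rbf(xs, lengthscale):
    """Dense RBF kernel matrix for a list of scalar inputs."""
    return [[math.exp(-0.5 * ((a - b) / lengthscale) ** 2) for b in xs] for a in xs]

n, t = 4, 3                      # 4 data points, 3 independent outputs
xs = [0.0, 0.5, 1.0, 1.5]        # toy inputs
kernels = [rbf(xs, ls) for ls in (0.5, 1.0, 2.0)]  # one kernel matrix per output

# Interleaved layout (assumed): index = point * t + task.
# Entries linking two different tasks stay zero because outputs are independent.
size = n * t
interleaved = [[0.0] * size for _ in range(size)]
for k in range(t):
    for i in range(n):
        for j in range(n):
            interleaved[i * t + k][j * t + k] = kernels[k][i][j]

# Task-major layout: index = task * n + point.
# perm[r] is the interleaved index of task-major position r.
perm = [i * t + k for k in range(t) for i in range(n)]
task_major = [[interleaved[r][c] for c in perm] for r in perm]

def is_block_diagonal(mat, block):
    """True if every entry outside the block-by-block diagonal blocks is zero."""
    return all(
        mat[r][c] == 0.0
        for r in range(len(mat))
        for c in range(len(mat))
        if r // block != c // block
    )

print(is_block_diagonal(interleaved, n))  # interleaved layout hides the structure
print(is_block_diagonal(task_major, n))   # reordered layout exposes one block per output
```

The same reordering on a dense tensor `cov` would be an index permutation such as `cov[perm][:, perm]`; whether that matches the exact layout GPyTorch produces should be checked against the linked source lines.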
🌹 Thank you for your clarification, which resolved all my confusion. I will close this issue now!
What is the difference between Batch GP Regression and Batch Independent Multioutput GP?
Assuming the shape of the input x is `[400, 2]`, and the shape of the output y is `[400, 3]`, I can either directly use Batch Independent Multioutput GP or I can duplicate the input x to have a shape of `[3, 400, 2]`, and adjust the output y to have a shape of `[3, 400]`, then utilize Batch GP Regression. The data for both approaches can be exactly the same, with only minor differences in the code.

In Batch Independent Multioutput GP, it says that _Unlike in the Multitask GP Example, this do not model correlations between outcomes, but treats outcomes independently_. But the code still uses the `MultitaskMVN` and `MultitaskGaussianLikelihood`. While in Batch GP Regression, it says that we do NOT account for any correlations between the different functions being modeled. I am confused about the `correlations` mentioned in the two tutorials. Are these two `correlations` the same? I'm not clear about the difference between multioutput and multitask.

Thanks in advance!
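The two data layouts described in the question can be sketched as follows. This is only an illustration with random stand-in data; the variable names `batch_x` and `batch_y` are hypothetical, and whether a non-copying `expand` is acceptable to a given model is an assumption (use `repeat` if a real copy is needed):

```python
import torch

n, d, num_outputs = 400, 2, 3
train_x = torch.randn(n, d)             # stand-in inputs, shape [400, 2]
train_y = torch.randn(n, num_outputs)   # stand-in targets, shape [400, 3]

# Batch Independent Multioutput GP layout: use train_x and train_y as they are.

# Batch GP Regression layout: one batch entry per modeled function.
batch_x = train_x.unsqueeze(0).expand(num_outputs, -1, -1)  # [3, 400, 2], a view of train_x
batch_y = train_y.transpose(-1, -2).contiguous()            # [3, 400]

print(batch_x.shape, batch_y.shape)
```

Both layouts hold the same numbers; only the position of the output dimension differs (leading batch dimension versus trailing task dimension).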