cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch
MIT License
3.54k stars 557 forks

Questions about covariance matrix in Multitask GP Regression #1963

Open chemyibinjiang opened 2 years ago

chemyibinjiang commented 2 years ago

Thank you for this package and its detailed documentation. It is really helpful, flexible, and easy to use! I was reading the Multitask GP Regression example in the documentation (https://docs.gpytorch.ai/en/stable/examples/03_Multitask_Exact_GPs/Multitask_GP_Regression.html#Introduction). At the end, the results on the test data only show the predicted mean and the lower/upper confidence bounds. The mean and covariance matrix of the multivariate distribution that the predictions are drawn from carry more information, so I inspected them with the following commands:

print(predictions.covariance_matrix)
print(predictions.mean)

While the shape of predictions.mean is [51, 2], the shape of predictions.covariance_matrix is [102, 102]. My question is: how is the covariance matrix defined in Multitask GP Regression? Is it defined with respect to predictions.mean.flatten(), i.e., are the predictions drawn from a multivariate distribution with mean predictions.mean.flatten() and covariance matrix predictions.covariance_matrix, and then reshaped to (-1, number of tasks)?

wjmaddox commented 2 years ago

Basically yes, although it depends on whether the multi-task MVN is _interleaved or not.

See here for an explanation. In short, if the covariance is interleaved, then for each data point all of the inter-task covariances are stored together (block diagonal w.r.t. inter-task covariance), while if it is not interleaved, then for each task all of the inter-data-point covariances are stored together (block diagonal w.r.t. inter-data covariance).
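To make the two layouts concrete, here is a minimal pure-Python sketch of how a (data point, task) pair could map to a row/column of the flattened covariance. The helper names flat_index and flatten_mean are hypothetical and are not part of GPyTorch; the sketch only illustrates the ordering described above, assuming the mean has shape [num_points, num_tasks].

```python
def flat_index(point, task, num_points, num_tasks, interleaved=True):
    """Row/column in the (num_points * num_tasks) covariance for a (point, task) pair.

    Hypothetical helper for illustration only; not a GPyTorch function.
    """
    if interleaved:
        # Tasks of one data point are adjacent: [x0_t0, x0_t1, x1_t0, x1_t1, ...]
        return point * num_tasks + task
    # Data points of one task are adjacent: [x0_t0, x1_t0, ..., x0_t1, x1_t1, ...]
    return task * num_points + point


def flatten_mean(mean, interleaved=True):
    """Flatten a [num_points][num_tasks] nested list into the covariance ordering."""
    num_points, num_tasks = len(mean), len(mean[0])
    flat = [None] * (num_points * num_tasks)
    for i in range(num_points):
        for j in range(num_tasks):
            flat[flat_index(i, j, num_points, num_tasks, interleaved)] = mean[i][j]
    return flat


mean = [[10, 20], [11, 21], [12, 22]]  # toy values: 3 data points, 2 tasks
interleaved_order = flatten_mean(mean, interleaved=True)
blocked_order = flatten_mean(mean, interleaved=False)
print(interleaved_order)  # [10, 20, 11, 21, 12, 22]
print(blocked_order)      # [10, 11, 12, 20, 21, 22]
```

Under this sketch, the interleaved ordering matches a row-major flatten of the mean (what mean.flatten() would give), while the non-interleaved ordering matches flattening the transposed mean. For the [51, 2] case in the question, the covariance between task a at point i and task b at point j would then sit at covariance_matrix[flat_index(i, a, 51, 2), flat_index(j, b, 51, 2)] in the interleaved layout.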

Hope this helps.