After a bit more digging, I am starting to believe that this cannot work with the existing code base. Consider this snippet from `variational_strategy.py`:
```python
# Covariance terms
num_induc = inducing_points.size(-2)
test_mean = full_output.mean[..., num_induc:]
induc_induc_covar = full_covar[..., :num_induc, :num_induc].add_jitter(self.jitter_val)
induc_data_covar = full_covar[..., :num_induc, num_induc:].to_dense()
data_data_covar = full_covar[..., num_induc:, num_induc:]
```
This works for the example in the docs since there is only one task, i.e., the shape of the `y` tensor is `(N,)` with `N` being the number of training points. Thus, the covariance matrix is of shape `(N+M, N+M)` with `M` being the number of inducing points, which makes the slicing above meaningful. However, if the number of tasks `T` is greater than 1, the shape of the covariance matrix is `(N*T, N*T)` in my other examples and `((N+M)*T, (N+M)*T)` here. Please correct me if I am wrong about this. Consequently, one would need to slice `full_covar` differently, i.e., remove the `num_induc` inducing entries for every task.
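To double-check the shape claim, here is a minimal sketch (the values of `N`, `M`, and `T` are arbitrary; I am just inspecting what `MultitaskKernel` produces):

```python
import torch
import gpytorch

N, M, T = 5, 3, 2  # number of data points, inducing points, and tasks
kernel = gpytorch.kernels.MultitaskKernel(
    gpytorch.kernels.RBFKernel(), num_tasks=T, rank=1
)
x = torch.randn(N + M, 1)  # data and inducing points stacked along dim 0
print(kernel(x).shape)  # torch.Size([16, 16]), i.e., ((N+M)*T, (N+M)*T)
```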
I am not sure how to proceed now. Should I unsqueeze to 3 dimensions and "misuse" the batch dim as the task dim? But that would remove the correlations between the tasks, right?
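As far as I can tell, the SVGP multitask tutorial uses exactly this batching, but recovers the inter-task correlations by mixing the batched latent GPs through an `LMCVariationalStrategy`. Here is a sketch of that pattern (`num_latents`, `num_tasks`, and all shapes are arbitrary choices of mine):

```python
import torch
import gpytorch
from gpytorch.models import ApproximateGP
from gpytorch.variational import (
    CholeskyVariationalDistribution, LMCVariationalStrategy, VariationalStrategy
)

num_latents, num_tasks = 3, 2


class LMCMultitaskGPModel(ApproximateGP):
    def __init__(self, inducing_points):  # shape (num_latents, M, d)
        variational_distribution = CholeskyVariationalDistribution(
            inducing_points.size(-2), batch_shape=torch.Size([num_latents])
        )
        variational_strategy = LMCVariationalStrategy(
            VariationalStrategy(
                self, inducing_points, variational_distribution,
                learn_inducing_locations=True,
            ),
            num_tasks=num_tasks, num_latents=num_latents, latent_dim=-1,
        )
        super().__init__(variational_strategy)
        # Batched mean and kernel: one latent GP per batch entry
        self.mean_module = gpytorch.means.ConstantMean(
            batch_shape=torch.Size([num_latents])
        )
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(batch_shape=torch.Size([num_latents])),
            batch_shape=torch.Size([num_latents]),
        )

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )


model = LMCMultitaskGPModel(torch.rand(num_latents, 16, 1))
print(model(torch.randn(20, 1)).mean.shape)  # torch.Size([20, 2])
```

Since the LMC coefficients mix the independent latent GPs into the task outputs, the tasks stay correlated even though the latents live in the batch dimension.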
Help is very much appreciated. @gpleiss @Balandat
Update: I have no idea how I overlooked the `IndependentMultitaskVariationalStrategy` class. I updated the associated part of the code snippet above to:
```python
import torch
import gpytorch
from gpytorch.models import ApproximateGP
from gpytorch.variational import (
    CholeskyVariationalDistribution,
    IndependentMultitaskVariationalStrategy,
    VariationalStrategy,
)


class ApproximateMultitaskGPModel(ApproximateGP):
    def __init__(self, inducing_points):
        assert inducing_points.ndim == 2  # shape (M, d), shared across tasks
        variational_distribution = CholeskyVariationalDistribution(inducing_points.size(0))
        base_variational_strategy = VariationalStrategy(
            self, inducing_points, variational_distribution, learn_inducing_locations=True
        )
        variational_strategy = IndependentMultitaskVariationalStrategy(
            base_variational_strategy, num_tasks=2
        )
        super().__init__(variational_strategy)
        self.mean_module = gpytorch.means.MultitaskMean(
            gpytorch.means.ConstantMean(), num_tasks=2
        )
        self.covar_module = gpytorch.kernels.MultitaskKernel(
            gpytorch.kernels.RBFKernel(), num_tasks=2, rank=1
        )

    def forward(self, x):
        # not part of the update; same forward as in the multitask regression
        # example, reproduced here only so the class is self-contained
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)
```
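For reference, this is roughly how I call it (the driver below is mine and only meant to reproduce the setup; `M`, `N`, and the data are arbitrary):

```python
inducing_points = torch.randn(16, 1)  # M = 16 inducing points, 1D inputs
model = ApproximateMultitaskGPModel(inducing_points)
likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=2)

train_x = torch.randn(32, 1)  # N = 32 training inputs
train_y = torch.randn(32, 2)  # multitask targets of shape (N, T)

output = model(train_x)  # shape mismatch at variational_strategy.py, line 216
```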
Unfortunately, I still get an error at line 216 of `variational_strategy.py`. Now, however, we first run through this line of the `IndependentMultitaskVariationalStrategy` class. So to me it looks like the multitask wrapping here has essentially no effect.
Furthermore, I am not sure about `inducing_values`, which is a tensor of shape `(M,)` at the time of the crash. Shouldn't it have two dimensions, e.g., `(M, T)` with `T` being the number of tasks?
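One can see where the `(M,)` comes from by inspecting the variational parameters directly (continuing from the class above; attribute names taken from the gpytorch source, so treat this as a sketch):

```python
model = ApproximateMultitaskGPModel(torch.randn(16, 1))
base = model.variational_strategy.base_variational_strategy
print(base._variational_distribution.variational_mean.shape)
# torch.Size([16]): no task dimension, because the CholeskyVariationalDistribution
# above was constructed without batch_shape=torch.Size([num_tasks])
```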
Any ideas?
📚 Documentation/Examples AND/OR 🐛 Bug
Is there documentation missing? I could not find out how to make multitask GPs with uncertain inputs work. I started by combining the GPyTorch examples on multitask GP regression and GP regression with uncertain inputs; see the code below. I am not exactly sure what the root cause of my error is. The crash occurs due to a shape mismatch at line 216 of `variational_strategy.py`.
Is documentation wrong? No.
Is there a feature that needs some example code? No.
Think you know how to fix the docs? (If so, we'd love a pull request from you!) No, but I think if we fixed the code I put below, we could enrich the docs with another example.