cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch
MIT License
3.57k stars 562 forks source link

GridInterpolationKernel wrapped over product kernels with active dims #357

Closed sumitsk closed 6 years ago

sumitsk commented 6 years ago

This issue can be best explained with the following code snippets:

This works fine:

cov1 = gpytorch.kernels.RBFKernel(active_dims=(0,))
cov2 = gpytorch.kernels.RBFKernel(active_dims=(1,))
x = torch.rand(10,2)
cov = cov1 * cov2 
res1 = cov(x).evaluate()

This does not:

grid_cov = gpytorch.kernels.GridInterpolationKernel(cov, grid_size=100, num_dims=2)
res2 = grid_cov(x).evaluate()

Is it not possible to wrap GridInterpolationKernel over such a custom-designed covariance module?

gpleiss commented 6 years ago

Short answer: I don't think this should error out, but you should probably be using a ProductStructureKernel instead, which will be much more efficient/faster.

Can you return the error that you're seeing?

Long answer: correct me if I'm wrong, but the kernel you're constructing has product structure (i.e. k(x, x') = k_1(x_1, x_1') k_2(x_2, x_2')). Rather than creating a 2D grid around k(x, x'), it's much more efficient to create two 1D grids around k_1(x_1, x_1') and k_2(x_2, x_2'). 1D grids are inherently much more efficient.

Two ways you could accomplish this:

Slow way: Wrap cov1 and cov2 in GridInterpolationKernels, and then multiply those kernels together.

Fast way: Use gpytorch.kernels.ProductStructureKernel. This batches the computation of cov1 and cov2, and then computes the product for you. See the SKIP example notebook for an example of how to use this.

jacobrgardner commented 6 years ago

The reason it errors out is because GridKernel calls the base kernel with batch_dims=(0, 2) because it wants d sub kernels that it can Kronecker product together. A ProductKernel with active_dims like this destroys the d dimension.

@gpleiss is correct that ProductStructureKernel is probably the way to go for your actual use case, but the actual bug to come out of this is probably that ProductKernel and AdditiveKernel should handle batch_dims correctly.

jacobrgardner commented 6 years ago

Actually, I'm not sure it'll be possible to wrap a ProductKernel with active_dims in a GridInterpolationKernel. I think wrapping the GridInterpolationKernels themselves is the way to go, and I'll think about this.