Closed: sumitsk closed this issue 5 years ago
So the two kernels definitely shouldn't need to construct `inducing_points`, which was the full set of inducing points on the grid (e.g., if you had a 10x10x10 grid, `inducing_points` would be a 1000x3 tensor). After #345 gets merged, `GridKernel` will only depend on `base_kernel` and `grid`. Whoops!
`grid` is a `grid_size x num_dims` (e.g., 10x3) tensor containing the one-dimensional grid for each dimension -- that is, `grid[:, 0]` contains all of the grid points in the first dimension.
For an example of how this is created, see `GridInterpolationKernel._create_grid`. You could even get a `grid` by creating a `GridInterpolationKernel` with the grid size and bounds you want and using the grid it creates.
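As a rough sketch of what such a grid looks like (the sizes and bounds below are made up for illustration), you can build the `grid_size x num_dims` tensor yourself by stacking one `linspace` per dimension:

```python
import numpy as np

# Hypothetical settings: same size and bounds in every dimension,
# matching the cube assumption discussed above.
grid_size = 10
grid_bounds = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]  # one (lo, hi) per dimension

# One 1D grid per dimension, stacked column-wise into grid_size x num_dims.
grid = np.stack(
    [np.linspace(lo, hi, grid_size) for lo, hi in grid_bounds], axis=1
)
print(grid.shape)  # (10, 3)
# grid[:, 0] holds all grid points along the first dimension.
```

This is only the shape convention; `GridInterpolationKernel._create_grid` is the authoritative implementation.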
What if the grid is not a cube, say its size is 20x50x70? In that case, we cannot use a `GridKernel` because it assumes the same size along every dimension. One could create a 1D grid in each dimension and specify a size and bounds for each of them, but then they cannot be wrapped in a `ProductStructureKernel`. So, is it possible to do better than just using a `ProductKernel`?
A multidimensional grid kernel can be written as the Kronecker product of the one-dimensional grid kernels.
One way to accomplish this would be to wrap the same 1D base kernel in three different `GridKernel`s, one for each dimension. Then, in your model's forward method, you can call all three grid kernels, create a `KroneckerProductLazyTensor` from the three `ToeplitzLazyTensor`s you get back, and return that as the covariance matrix.
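The algebra behind this can be checked with a small NumPy sketch (the `rbf` helper and its lengthscale are hypothetical, standing in for any kernel that factors across dimensions): on a Cartesian-product grid with the last dimension varying fastest, the full kernel matrix equals the Kronecker product of the per-dimension kernel matrices.

```python
import itertools
import numpy as np

def rbf(a, b, lengthscale=0.7):
    # 1D RBF kernel matrix between point sets a and b (hypothetical lengthscale).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

# Three 1D grids of different sizes -- the non-cube case from the question.
g1, g2, g3 = np.linspace(0, 1, 4), np.linspace(0, 1, 5), np.linspace(0, 1, 6)

# Per-dimension kernel matrices (each one is Toeplitz on its regular grid).
K1, K2, K3 = rbf(g1, g1), rbf(g2, g2), rbf(g3, g3)

# Full kernel on the Cartesian-product grid, last dimension varying fastest.
X = np.array(list(itertools.product(g1, g2, g3)))  # shape (4*5*6, 3)
sqdist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_full = np.exp(-0.5 * sqdist / 0.7 ** 2)

# The multidimensional RBF factors across dimensions, so the full 120x120
# matrix is exactly the Kronecker product of the three small matrices.
K_kron = np.kron(np.kron(K1, K2), K3)
print(np.allclose(K_full, K_kron))  # True
```

This is why storing only the three 1D matrices (and a `KroneckerProductLazyTensor` over them) is so much cheaper than materializing the full matrix.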
I'll refactor `GridKernel` to also support non-cube grids for convenience soon.
Thanks @jacobrgardner! This is very helpful. It will be really awesome if `GridKernel` can support different lengths along each dimension.
Hey @jacobrgardner. If my test data is also a grid, is it possible to achieve some speedup instead of computing a `train_size x test_size` kernel matrix? This huge matrix creates memory issues.
I'm not actually sure it is possible to do better if the test grid is different from the train grid, since the matrix may not necessarily be Toeplitz, which is where the space savings for the train x train matrix come from.
I'll think about whether there's structure there.
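The Toeplitz structure being referred to is easy to verify with a NumPy sketch (a unit-lengthscale RBF is just an example of a stationary kernel): on an evenly spaced grid, `K[i, j]` depends only on `i - j`, so a single column determines the whole matrix.

```python
import numpy as np

# A stationary 1D kernel evaluated on an evenly spaced grid is Toeplitz.
n = 50
grid = np.linspace(0.0, 5.0, n)
K = np.exp(-0.5 * (grid[:, None] - grid[None, :]) ** 2)  # RBF, unit lengthscale

# Check the Toeplitz property: every diagonal is constant.
is_toeplitz = all(np.allclose(np.diag(K, k), K[0, k]) for k in range(n))
print(is_toeplitz)  # True

# Storage: n values (the first column) instead of the full n x n matrix.
first_col = K[:, 0]
```

A train x test matrix between two different grids has no such constant-diagonal structure in general, which is the difficulty raised above.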
One idea would be to use the exact grid matrix for the train-train block but some other scalable GP approximation for the test-train block. Basically, in `GridKernel` on that PR branch, where we call the base kernel if `x1 != x2`, you could instead call the base kernel wrapped in a `GridInterpolationKernel` or an `InducingPointKernel`.
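As a rough NumPy illustration of that hybrid idea (a Nystrom-style inducing-point approximation stands in for `InducingPointKernel`; the grids, lengthscale, and `rbf` helper are all made up for the sketch):

```python
import numpy as np

def rbf(a, b, ls=0.5):
    # 1D RBF kernel matrix (hypothetical lengthscale).
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

rng = np.random.default_rng(0)
train = np.linspace(0.0, 1.0, 40)      # train inputs on a regular grid
test = rng.uniform(0.0, 1.0, 15)       # arbitrary off-grid test inputs
inducing = np.linspace(0.0, 1.0, 25)   # inducing points (here, also a grid)

# Train-train block: exact, and Toeplitz because train lies on a regular grid.
K_tt = rbf(train, train)

# Test-train block: Nystrom approximation K_xu K_uu^{-1} K_ut
# instead of evaluating the exact test x train kernel.
K_uu = rbf(inducing, inducing) + 1e-8 * np.eye(len(inducing))
K_xu = rbf(test, inducing)
K_ut = rbf(inducing, train)
K_xt_approx = K_xu @ np.linalg.solve(K_uu, K_ut)

err = np.abs(K_xt_approx - rbf(test, train)).max()
print(K_xt_approx.shape)  # (15, 40)
```

With inducing points spaced densely relative to the lengthscale, `err` is small, so the cross block never has to be formed exactly.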
I am studying a case where my data lies perfectly on a 3D grid (for example, a cubic lattice) and I want to use a spectral mixture kernel for GP regression. I want to wrap it in a `GridKernel` to speed up computation. What does the `inducing_points` argument do, and is it the same as the `grid` argument when there is one data point per grid cell, in other words, when there are no empty cells in the grid dataset?