cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch
MIT License

Adding kernels in a GP Regression #319

Closed sadatnfs closed 5 years ago

sadatnfs commented 5 years ago

Hello,

I was wondering if it's possible to add (or multiply, which I guess the Kronecker product does for you) different kernels across different dimensions of my training data.

This is what I have tried to do (code below), but the fitting gives me a dimensionality error. I'm wondering if anyone else has implemented this already?

Basically, I am ultimately trying to build a model of the type Y = b1*X1 + b2*X2 + GP across 3 separate dimensions.

And so, ultimately I plan to have linear kernels for fitting b1 and b2, and then a Kronecker product of 3 kernels for the GP part, in order to fit the 3 dimensions of data I'll have.

Any help is appreciated. Thanks!

import math
import torch
import gpytorch

# Build a grid of training inputs in [0, 1]; dimensions 1 and 2 are identical here
n = 100
train_x = torch.zeros(pow(n, 2), 3)
for i in range(n):
    for j in range(n):
        train_x[i * n + j][0] = float(i) / (n - 1)
        train_x[i * n + j][1] = float(j) / (n - 1)
        train_x[i * n + j][2] = float(j) / (n - 1)

# Noisy sinusoid over the sum of the three input dimensions
train_y = torch.sin((train_x[:, 0] + train_x[:, 1] + train_x[:, 2]) * (2 * math.pi)) \
    + torch.randn_like(train_x[:, 0]).mul(0.01)

class GPRegressionModel(gpytorch.models.ExactGP):

...

        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.GridInterpolationKernel(
            gpytorch.kernels.ScaleKernel(
                gpytorch.kernels.RBFKernel(active_dims=[0, 1]) +
                gpytorch.kernels.RBFKernel(active_dims=[2])
            ),
            grid_size=grid_size, num_dims=3,
        )

...

RuntimeError: KroneckerProductLazyTensor expects the same batch sizes for component tensors. Got sizes: [torch.Size([1, 21, 21]), torch.Size([1, 21, 21]), torch.Size([0, 21, 21])]
jacobrgardner commented 5 years ago

If you want fully additive structure (e.g., your function fully additively decomposes over the dimensions with one component per dimension), we recommend using AdditiveStructureKernel(GridInterpolationKernel(..., num_dims=1)). This basically tells us to decompose X additively and send each dimension on to a 1D GridInterpolationKernel.
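As a hedged sketch (assuming a 3-dimensional input as in the example above, and a grid_size variable already in scope), that fully additive construction would look something like:

self.covar_module = gpytorch.kernels.AdditiveStructureKernel(
    gpytorch.kernels.GridInterpolationKernel(
        gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel()),
        grid_size=grid_size, num_dims=1,
    ),
    num_dims=3,  # decompose the 3 input dimensions additively, one 1D SKI kernel per dimension
)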

In your example attempt to build a kernel, it looks like you want a 2D kernel over dimensions 0 and 1 and a 1D kernel over dimension 2. To accomplish this, you have roughly the right idea, but you'll basically just want to add the GridInterpolationKernels themselves, not the inner RBF kernels. Something like:

covar_module_one = GridInterpolationKernel(
    ScaleKernel(RBFKernel()), grid_size=grid_size, active_dims=[0, 1], num_dims=2
)
covar_module_two = GridInterpolationKernel(
    ScaleKernel(RBFKernel()), grid_size=grid_size, active_dims=[2], num_dims=1
)
covar_module = covar_module_one + covar_module_two
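For context, here is a minimal sketch of how those summed modules might slot into the ExactGP model from the original post (the class skeleton, likelihood argument, and grid_size default are illustrative assumptions, not part of the original snippet):

class GPRegressionModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, grid_size=30):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        covar_module_one = gpytorch.kernels.GridInterpolationKernel(
            gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel()),
            grid_size=grid_size, active_dims=[0, 1], num_dims=2,
        )
        covar_module_two = gpytorch.kernels.GridInterpolationKernel(
            gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel()),
            grid_size=grid_size, active_dims=[2], num_dims=1,
        )
        # Kernel.__add__ returns an AdditiveKernel, which evaluates each
        # component on its active dimensions and sums the results
        self.covar_module = covar_module_one + covar_module_two

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)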
sadatnfs commented 5 years ago

Oh, that works beautifully, thanks @jacobrgardner!

One last question: if I were to have a 3D problem with three dimensions of GP (let's say location, age, and time), and 3 different types of kernels on them, is there a way to make a Kronecker GP out of those 3 independent GPs? In your example (which works), you've added the two independent kernels, so I was thinking of a more expensive problem where I'd want a full Kronecker product over all the combinations instead of a linear addition.

jacobrgardner commented 5 years ago

Hi @sadatnfs if what you want really is kronecker structure, then you'll want to make KroneckerProductLazyTensors out of the outputs of the individual covariance modules. Something like:

self.covar_one = GridInterpolationKernel(..., active_dims=[0])
self.covar_two = GridInterpolationKernel(..., active_dims=[1])
self.covar_three = GridInterpolationKernel(..., active_dims=[2])  # dims are 0-indexed, so the third dimension is 2

Then in forward:

covar_x = KroneckerProductLazyTensor(KroneckerProductLazyTensor(self.covar_one(x), self.covar_two(x)), self.covar_three(x))

Be aware that this will result in a (lazily represented) n^3 x n^3 kernel matrix, so you'll need (for example) n^3 labels to go along with it. Is this what you want? Or did you mean Hadamard product structure, like with ProductKernel or ProductStructureKernel?
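As a hedged sketch of what the forward pass might look like under this construction (the zero mean over the product grid is an assumption, added only to keep the shapes consistent, since each component covariance is n x n while their Kronecker product is n^3 x n^3):

from gpytorch.lazy import KroneckerProductLazyTensor

def forward(self, x):
    # Each component kernel sees only its own active dimension of x,
    # yielding three n x n (lazily represented) covariance factors
    covar_x = KroneckerProductLazyTensor(
        KroneckerProductLazyTensor(self.covar_one(x), self.covar_two(x)),
        self.covar_three(x),
    )
    # The Kronecker product is n^3 x n^3, so the mean (and train_y) must
    # have n^3 entries; a zero mean over the product grid is assumed here
    mean_x = torch.zeros(covar_x.size(-1), dtype=x.dtype, device=x.device)
    return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)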

sadatnfs commented 5 years ago

Hmm, I think I actually would want the Kronecker product of the three dimensions, because I'd like to be able to correlate all 3 of my dimensions. That definitely sounds like it'll get super expensive; I've only been able to implement that efficiently in an R package that uses GMRFs and sparse matrices to cut down on computation time, so I'm pretty interested in seeing how far I can push gpytorch to get equivalent results! What you've shown is more than enough for me to start digging!! Thanks!

chadrs2 commented 6 months ago

If you use the above code snippet that you provided @jacobrgardner, how would I use the standard inducing forward function to get my full Kuu, given that it isn't a function of the AdditiveKernel?

covar_module_one = GridInterpolationKernel(
    ScaleKernel(RBFKernel()), grid_size=grid_size, active_dims=[0, 1], num_dims=2
)
covar_module_two = GridInterpolationKernel(
    ScaleKernel(RBFKernel()), grid_size=grid_size, active_dims=[2], num_dims=1
)
covar_module = covar_module_one + covar_module_two

chadrs2 commented 6 months ago

This is what I currently have:

kernel_t = gpytorch.kernels.GridInterpolationKernel(
    gpytorch.kernels.SpectralMixtureKernel(num_mixtures=3, ard_num_dims=1),
    grid_size=[args.num_temporal_inducing],
    grid_bounds=torch.tensor([[0 - 0.1, 1.1]]),
    num_dims=1,
    active_dims=[0],
)
kernel_xy = gpytorch.kernels.GridInterpolationKernel(
    gpytorch.kernels.SpectralMixtureKernel(num_mixtures=3, ard_num_dims=2),
    grid_size=[args.num_spatial_inducing, args.num_spatial_inducing],
    grid_bounds=torch.tensor([
        [0 - 0.1, 1.1],
        [0 - 0.1, 1.1],
    ]),
    num_dims=2,
    active_dims=[1, 2],
)
kernel = kernel_t + kernel_xy

And I want to easily extract the Kuu from this combined kernel for my 3-dimensional input.
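One possible direction, offered as a sketch rather than a confirmed API: each GridInterpolationKernel keeps its own grid, so a dense Kuu per component can be assembled by evaluating that component's base kernel on its grid points. This assumes the kernel's grid attribute and the gpytorch.utils.grid.create_data_from_grid helper behave as in recent gpytorch versions:

from gpytorch.utils.grid import create_data_from_grid

# Assumption: kernel_t.grid holds the per-dimension grid tensors, and
# create_data_from_grid flattens them into a [num_grid_points, dim] tensor
grid_t = create_data_from_grid(kernel_t.grid)
Kuu_t = kernel_t.base_kernel(grid_t).evaluate()     # temporal inducing covariance

grid_xy = create_data_from_grid(kernel_xy.grid)
Kuu_xy = kernel_xy.base_kernel(grid_xy).evaluate()  # spatial inducing covariance

Since the two components live on different grids, the summed kernel has no single inducing covariance, so it may be most natural to handle the two Kuu matrices separately.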