cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch
MIT License
3.53k stars, 556 forks

Scalable SM Kernel [Feature Request] #878

Open wjmaddox opened 4 years ago

wjmaddox commented 4 years ago

🚀 Feature Request

This is somewhere between a bug report and a feature request. I am attempting to use spectral mixture kernels on reasonably sized data (100 x 100 grids), but it runs out of memory (OOMs) at test time. The problem is possibly caching related.

The feature request would probably be a KeOps implementation for SM kernels, or a more faithful implementation of Kronecker-based inference for GPs.

Motivation

An explicit matrix is formed in the kernel's forward pass here, which is where the memory issue occurs.
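To make the memory concern concrete, here is a minimal sketch of a dense spectral mixture kernel in plain PyTorch. It is not GPyTorch's implementation: the function name `dense_sm_kernel` and its arguments are hypothetical, and the parameterization (scales treated as frequency-domain standard deviations) is an assumption. The point is the shape of the intermediate: broadcasting pairwise differences against per-mixture parameters yields a tensor with `num_mixtures * n * m * d` elements before anything is reduced.

```python
import math
import torch


def dense_sm_kernel(x1, x2, weights, means, scales):
    """Dense spectral mixture kernel (illustrative sketch, not GPyTorch's code).

    x1: (n, d), x2: (m, d), weights: (k,), means and scales: (k, d).
    Returns the (n, m) kernel matrix and the shape of the largest intermediate.
    """
    # Pairwise differences, with a leading axis to broadcast over mixtures.
    tau = (x1[:, None, :] - x2[None, :, :])[None]  # (1, n, m, d)
    # Both terms materialize a full (k, n, m, d) tensor.
    exp_term = torch.exp(-2 * math.pi ** 2 * (tau * scales[:, None, None, :]) ** 2)
    cos_term = torch.cos(2 * math.pi * tau * means[:, None, None, :])
    # Product over input dimensions, then weighted sum over mixtures.
    per_mixture = (exp_term * cos_term).prod(dim=-1)  # (k, n, m)
    kernel = (weights[:, None, None] * per_mixture).sum(dim=0)  # (n, m)
    return kernel, tuple(exp_term.shape)


torch.manual_seed(0)
k, n, m, d = 3, 5, 4, 2
x1, x2 = torch.rand(n, d), torch.rand(m, d)
weights, means, scales = torch.rand(k), torch.rand(k, d), torch.rand(k, d)

kernel, intermediate_shape = dense_sm_kernel(x1, x2, weights, means, scales)
print(kernel.shape, intermediate_shape)
```

The output is only `(n, m)`, but the intermediates are `num_mixtures * d` times larger, which is the kind of allocation a matrix-free (KeOps-style) evaluation would avoid.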

Pitch

I (or @g-benton) am willing to open a PR, but it may take a while. We just want a comparison to #872.

Minimal Working Example


import torch 
import gpytorch
import math

if torch.cuda.is_available():
    torch.set_default_tensor_type(torch.cuda.FloatTensor)

# create training grid
grid_bounds = [(0, 1), (0, 1)]
grid_size = 70
grid = torch.zeros(grid_size, len(grid_bounds))
for i in range(len(grid_bounds)):
    grid_diff = float(grid_bounds[i][1] - grid_bounds[i][0]) / (grid_size - 2)
    grid[:, i] = torch.linspace(grid_bounds[i][0] - grid_diff, grid_bounds[i][1] + grid_diff, grid_size)

train_x = gpytorch.utils.grid.create_data_from_grid(grid)
train_y = torch.sin((train_x[:, 0] + train_x[:, 1]) * (2 * math.pi)) + torch.randn_like(train_x[:, 0]).mul(0.01)

# setup model
class GridSM(gpytorch.models.ExactGP):
    def __init__(self, grid, train_x, train_y, likelihood):
        super(GridSM, self).__init__(train_x, train_y, likelihood)
        num_dims = train_x.size(-1)
        self.mean_module = gpytorch.means.ConstantMean()
        self.base = gpytorch.kernels.SpectralMixtureKernel(num_mixtures=20, ard_num_dims=2)
        self.base.initialize_from_data(train_x, train_y)
        self.covar_module = gpytorch.kernels.GridKernel(self.base, grid=grid)
        #self.covar_module = self.base

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = GridSM(grid, train_x, train_y, likelihood)
test_x = model(train_x)

# setup testing grid
grid_bounds = [(1, 2), (0, 1)]
grid_size = 50
test_grid = torch.zeros(grid_size, len(grid_bounds))
for i in range(len(grid_bounds)):
    grid_diff = float(grid_bounds[i][1] - grid_bounds[i][0]) / (grid_size - 2)
    test_grid[:, i] = torch.linspace(grid_bounds[i][0] - grid_diff, grid_bounds[i][1] + grid_diff, grid_size)

test_x = gpytorch.utils.grid.create_data_from_grid(test_grid)

# now evaluate
model.eval()
with gpytorch.settings.fast_pred_var(True), gpytorch.settings.skip_posterior_variances(True):
    predictive_dist = model(test_x)

gpleiss commented 4 years ago

Honestly, a KeOps implementation would probably be the way to go.

ginward commented 3 years ago

Is the spectral mixture kernel more memory-intensive? I get an out-of-memory warning with about 1000 data points and 54 features, whereas an RBF kernel works just fine.
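A back-of-the-envelope estimate suggests why this can happen. The sketch below assumes that a dense spectral mixture evaluation materializes a float32 intermediate with one entry per (mixture, row, column, feature), whereas an RBF kernel needs roughly the n x n output. The function names are hypothetical, and the number of mixtures is not stated above, so 4 is used purely for illustration.

```python
def sm_intermediate_bytes(num_mixtures, n, m, num_dims, bytes_per_element=4):
    """Bytes for one (num_mixtures, n, m, num_dims) tensor (assumed layout)."""
    return num_mixtures * n * m * num_dims * bytes_per_element


def dense_matrix_bytes(n, m, bytes_per_element=4):
    """Bytes for a plain (n, m) kernel matrix, roughly what an RBF kernel needs."""
    return n * m * bytes_per_element


n, num_dims = 1000, 54
num_mixtures = 4  # hypothetical; not stated in the thread

sm_bytes = sm_intermediate_bytes(num_mixtures, n, n, num_dims)
rbf_bytes = dense_matrix_bytes(n, n)
print(f"SM intermediate: {sm_bytes / 1e6:.0f} MB")  # 864 MB
print(f"RBF matrix:      {rbf_bytes / 1e6:.0f} MB")  # 4 MB
print(f"ratio:           {sm_bytes // rbf_bytes}x")  # 216x
```

Under these assumptions each such intermediate is `num_mixtures * num_dims` times larger than the RBF matrix, and a forward pass may hold several of them at once, plus autograd buffers during training.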

wjmaddox commented 3 years ago

Can you explain in a bit more detail?

It doesn't look like the proximal issue of an explicit matrix being formed in the forward pass has been resolved (it is still here). However, the above code now runs on my GPU because of the improvements made to KroneckerProductLazyTensor over the past year or so (I should probably have closed the issue before).
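For context on why Kronecker structure helps on grids: a matrix-vector product with a Kronecker product can be computed from the small factors without ever forming the full matrix. The sketch below shows that identity in plain PyTorch. It is not how KroneckerProductLazyTensor is implemented internally, and `kron_matvec` is a hypothetical helper.

```python
import torch


def kron_matvec(a, b, v):
    """Compute (a ⊗ b) @ v without forming the Kronecker product.

    a: (p, p), b: (q, q), v: (p * q,). Uses the row-major identity
    (a ⊗ b) vec(V) = vec(a V bᵀ) with V = v.reshape(p, q).
    """
    p, q = a.shape[0], b.shape[0]
    return (a @ v.reshape(p, q) @ b.T).reshape(-1)


torch.manual_seed(0)
a, b = torch.rand(3, 3), torch.rand(4, 4)
v = torch.rand(12)

fast = kron_matvec(a, b, v)
dense = torch.kron(a, b) @ v  # forms the full (12, 12) matrix, for checking only
print(torch.allclose(fast, dense, atol=1e-5))
```

For a 70 x 70 grid like the one in the example above, and assuming one factor per input dimension, the factors would be 70 x 70 while the full matrix would be 4900 x 4900.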

anjawa commented 2 years ago

Has a PR already been opened? If help is needed, I could help out.