cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch
MIT License

vanilla GP for +3500 points [Docs] #1184

Open RodrigoAVargasHdz opened 4 years ago

RodrigoAVargasHdz commented 4 years ago

📚 Documentation/Examples

Hi, I am trying to train a plain vanilla GP with an RBF kernel on approximately 5000 points with 6 features. My code is based on the online tutorial.

import torch
import gpytorch

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, d):
        super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        # RBF kernel with one lengthscale per input dimension (ARD)
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel(ard_num_dims=d))

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

# training
loss0 = 1E6
training_iter = 2501
for j in range(5):
    # initialize likelihood and model
    likelihood = gpytorch.likelihoods.GaussianLikelihood()
    model = ExactGPModel(train_x, train_y, likelihood, d)

    # Find optimal model hyperparameters
    model.train()
    likelihood.train()

    # Use the adam optimizer
    optimizer = torch.optim.Adam([
        {'params': model.parameters()},  # Includes GaussianLikelihood parameters
    ], lr=0.1)

    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1500, gamma=0.5)

    # "Loss" for GPs - the marginal log likelihood
    mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

    for i in range(training_iter):
        optimizer.zero_grad()
        output = model(train_x)
        loss = -mll(output, train_y)
        loss.backward()
        optimizer.step()
        scheduler.step()
        if (i % 25) == 0:
            # Print the current loss rather than the constant loss0
            print('(%i - %i), Loss: %.8f' % (j, i, loss.item()))

During training, I get the following error:

  File "main_GP_ymax.py", line 146, in exact_GP_rbf
    loss = -mll(output, train_y)
  File "/cptg/u4/rvargas/.local/lib/python3.7/site-packages/gpytorch/module.py", line 28, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/cptg/u4/rvargas/.local/lib/python3.7/site-packages/gpytorch/mlls/exact_marginal_log_likelihood.py", line 51, in forward
    res = output.log_prob(target)
  File "/cptg/u4/rvargas/.local/lib/python3.7/site-packages/gpytorch/distributions/multivariate_normal.py", line 135, in log_prob
    inv_quad, logdet = covar.inv_quad_logdet(inv_quad_rhs=diff.unsqueeze(-1), logdet=True)
  File "/cptg/u4/rvargas/.local/lib/python3.7/site-packages/gpytorch/lazy/lazy_tensor.py", line 1051, in inv_quad_logdet
    args,
  File "/cptg/u4/rvargas/.local/lib/python3.7/site-packages/gpytorch/functions/_inv_quad_log_det.py", line 161, in forward
    solves, t_mat = lazy_tsr._solve(rhs, preconditioner, num_tridiag=num_random_probes)
  File "/cptg/u4/rvargas/.local/lib/python3.7/site-packages/gpytorch/lazy/lazy_tensor.py", line 650, in _solve
    preconditioner=preconditioner,
  File "/cptg/u4/rvargas/.local/lib/python3.7/site-packages/gpytorch/utils/linear_cg.py", line 175, in linear_cg
    residual = rhs - matmul_closure(initial_guess)
  File "/cptg/u4/rvargas/.local/lib/python3.7/site-packages/gpytorch/lazy/added_diag_lazy_tensor.py", line 53, in _matmul
    return torch.addcmul(self._lazy_tensor._matmul(rhs), self._diag_tensor._diag.unsqueeze(-1), rhs)
RuntimeError: expected dtype Double but got dtype Float

However, if I reduce the number of points to approximately 2000, my code seems to work fine. What do I need to change in order to train a vanilla GP with a "large" number of points?

Thanks!

jacobrgardner commented 4 years ago

Just hazarding a guess here, but are you converting your train_x and train_y from NumPy? If so, you should know that torch and numpy have different default dtypes: numpy defaults to float64 and torch defaults to float32. You'll need to either convert your data to float32 (e.g., train_x = train_x.float() and train_y = train_y.float()) or (less likely) your model to double (model.double()).
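
In case it helps, here is a minimal sketch of both options. It assumes the data starts out as NumPy arrays; the array names, the random stand-in data, and the use of torch.nn.Linear as a placeholder for the GP model are all hypothetical and not taken from the original script.

```python
import numpy as np
import torch

# Hypothetical stand-in data (not from the issue): 100 points with 6 features.
rng = np.random.default_rng(0)
x_np = rng.standard_normal((100, 6))
y_np = np.sin(x_np.sum(axis=1))

# torch.from_numpy preserves the NumPy dtype, so these tensors are float64.
x_double = torch.from_numpy(x_np)
y_double = torch.from_numpy(y_np)
print(x_double.dtype, torch.get_default_dtype())

# Option 1: cast the data to float32 so it matches newly created parameters.
train_x = x_double.float()
train_y = y_double.float()

# Option 2: keep float64 data and cast the module instead.
# torch.nn.Linear is only a placeholder for the GP model here.
placeholder_model = torch.nn.Linear(6, 1).double()
print(train_x.dtype, placeholder_model.weight.dtype)
```

Whichever option you pick, the point is that the training data and the model parameters end up with the same dtype before the model and likelihood are used together.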