cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch
MIT License

[QUESTION] How can I use gpytorch to achieve the same effect as GPy? #988

Open Junnian opened 4 years ago

Junnian commented 4 years ago

question

I want to use GPR for interpolation. There are two Python packages I can use, gpytorch and GPy. GPy optimizes its hyperparameters automatically, so I can get a good interpolated image with it, but it is slow. I want to use gpytorch to do this faster; however, I can't find good parameters manually. Is this normal? What should I do?

example:

data

the training data: [contour image]

the true image: [image]

the GPy interpolation result: [image]

the gpytorch interpolation result: [image]

code

# GPy version: hyperparameters are optimized automatically
import GPy

kernel_SE = GPy.kern.RBF(ARD=True, input_dim=2, variance=1, lengthscale=1)
model = GPy.models.GPRegression(train_x, train_y, kernel_SE)
model.optimize()  # automatic hyperparameter optimization
GPy_interpolation_result, _ = model.predict(test_x)
'''
kernel learned:  {'input_dim': 2, 'active_dims': [0, 1], 'name': 'rbf', 'useGPU': 0, 'variance': [236.47126432664737], 'lengthscale': [5.1380968669556495, 5.992828608263263], 'ARD': True, 'class': 'GPy.kern.RBF', 'inv_l': False}
model inference took 59.3389151096344 s
'''

# gpytorch version: hyperparameters are optimized by gradient descent
import torch
import gpytorch

class GridGPRegressionModel1(gpytorch.models.ExactGP):
    def __init__(self, grid, train_x, train_y, likelihood):
        super(GridGPRegressionModel1, self).__init__(train_x, train_y, likelihood)
        num_dims = train_x.size(-1)
        self.mean_module = gpytorch.means.ConstantMean()
        # GridKernel wrapping ScaleKernel(RBFKernel); note the keyword is ard_num_dims
        self.covar_module = gpytorch.kernels.GridKernel(
            gpytorch.kernels.ScaleKernel(
                gpytorch.kernels.RBFKernel(ard_num_dims=num_dims, active_dims=[0, 1])
            ),
            grid=grid)

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

import time

def train(likelihood, model, train_x, train_y, epochs):
    model.train()
    likelihood.train()

    # Use the Adam optimizer
    optimizer = torch.optim.Adam([
        {'params': model.parameters()},
    ], lr=0.1)

    # "Loss" for GPs: the negative marginal log likelihood
    mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
    s = time.time()
    for epoch in range(epochs):
        optimizer.zero_grad()
        output = model(train_x)
        loss = -mll(output, train_y)
        loss.backward()
        print('Iter %d/%d - Loss: %.3f - lengthscale: %.4f - learn_rate: %.4f' %
              (epoch + 1, epochs, loss.item(),
               model.covar_module.base_kernel.base_kernel.lengthscale[0][0].item(),
               optimizer.param_groups[0]['lr']))
        optimizer.step()
    e = time.time()
    print('training took %.1f s' % (e - s))
    return likelihood, model

train_x, train_y = train_x.cuda(), train_y.cuda()  # keep data on the same device as the model
likelihood = gpytorch.likelihoods.GaussianLikelihood().cuda()
model = GridGPRegressionModel1(grid, train_x, train_y, likelihood).cuda()

likelihood, model = train(likelihood, model, train_x, train_y, epochs=100)

# Switch to evaluation mode before predicting
model.eval()
likelihood.eval()
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    gpytorch_interpolation_result = likelihood(model(test_x.cuda()))

'''
Iter 1/100 - Loss: 234.869 - lengthscale: 0.6931 - learn_rate: 0.1000
Iter 2/100 - Loss: 193.810 - lengthscale: 0.7444 - learn_rate: 0.1000
Iter 3/100 - Loss: 159.671 - lengthscale: 0.7978 - learn_rate: 0.1000
Iter 4/100 - Loss: 131.618 - lengthscale: 0.8530 - learn_rate: 0.1000
Iter 5/100 - Loss: 108.776 - lengthscale: 0.9096 - learn_rate: 0.1000
Iter 6/100 - Loss: 90.293 - lengthscale: 0.9671 - learn_rate: 0.1000
Iter 7/100 - Loss: 75.402 - lengthscale: 1.0251 - learn_rate: 0.1000
Iter 8/100 - Loss: 63.422 - lengthscale: 1.0831 - learn_rate: 0.1000
Iter 9/100 - Loss: 53.782 - lengthscale: 1.1407 - learn_rate: 0.1000
.......
Iter 90/100 - Loss: 5.189 - lengthscale: 2.5188 - learn_rate: 0.1000
Iter 91/100 - Loss: 5.167 - lengthscale: 2.5235 - learn_rate: 0.1000
Iter 92/100 - Loss: 5.147 - lengthscale: 2.5283 - learn_rate: 0.1000
Iter 93/100 - Loss: 5.125 - lengthscale: 2.5329 - learn_rate: 0.1000
Iter 94/100 - Loss: 5.105 - lengthscale: 2.5376 - learn_rate: 0.1000
Iter 95/100 - Loss: 5.084 - lengthscale: 2.5422 - learn_rate: 0.1000
Iter 96/100 - Loss: 5.063 - lengthscale: 2.5468 - learn_rate: 0.1000
Iter 97/100 - Loss: 5.044 - lengthscale: 2.5514 - learn_rate: 0.1000
Iter 98/100 - Loss: 5.030 - lengthscale: 2.5559 - learn_rate: 0.1000
Iter 99/100 - Loss: 5.008 - lengthscale: 2.5604 - learn_rate: 0.1000
Iter 100/100 - Loss: 4.980 - lengthscale: 2.5649 - learn_rate: 0.1000
'''
jacobrgardner commented 4 years ago

This would be pretty hard to look into without you uploading the actual sample data, so it's hard to do more than guess. However, a few things:

  1. Try wrapping the GridKernel in the ScaleKernel, rather than the other way around.
  2. It doesn't look like you are normalizing your data, which is usually a problem for gpytorch. Try either normalizing your data, or adding a small BatchNorm1d "feature extractor" to your model.
  3. To rule out numerics issues with the grid kernel, try using a tighter CG tolerance. Add a with gpytorch.settings.cg_tolerance(0.0001): around your likelihood,model = train(...) call, and a with gpytorch.settings.eval_cg_tolerance(0.0001): to your prediction settings. A sketch combining these suggestions follows below.
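
For concreteness, here is a minimal sketch of what suggestions 1 and 3 might look like applied to the code above. The names grid, train_x, train_y, test_x, and the train(...) function come from the original post; the class name ScaledGridGPModel is hypothetical, and 0.0001 is just the tolerance value suggested above, not a tuned choice:

import torch
import gpytorch

class ScaledGridGPModel(gpytorch.models.ExactGP):
    def __init__(self, grid, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        # Suggestion 1: ScaleKernel on the outside, GridKernel on the inside
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.GridKernel(
                gpytorch.kernels.RBFKernel(ard_num_dims=2),
                grid=grid,
            )
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

likelihood = gpytorch.likelihoods.GaussianLikelihood().cuda()
model = ScaledGridGPModel(grid, train_x, train_y, likelihood).cuda()

# Suggestion 3: tighter CG tolerance during training ...
with gpytorch.settings.cg_tolerance(0.0001):
    likelihood, model = train(likelihood, model, train_x, train_y, epochs=100)

# ... and during prediction
model.eval()
likelihood.eval()
with torch.no_grad(), gpytorch.settings.eval_cg_tolerance(0.0001), \
        gpytorch.settings.fast_pred_var():
    pred = likelihood(model(test_x.cuda()))

Note that the lengthscale access path in the training loop (model.covar_module.base_kernel.base_kernel.lengthscale) still works after the swap, since there is still exactly one kernel between the outermost module and the RBFKernel.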
Junnian commented 4 years ago


Thanks for your reply. Please allow me to show my sample data.

train_data is an image; I extract its coordinates to make train_x, and the corresponding pixel values are train_y. For example:

train_data is

array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])

so train_x is train_data's indices:

array([[0, 0],
       [0, 1],
       [0, 2],
       [1, 0],
       [1, 1],
       [1, 2],
       [2, 0],
       [2, 1],
       [2, 2]])

and train_y is train_data's values (a sketch of this construction follows below):

array([1, 2, 3, 1, 2, 3, 1, 2, 3])
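
For reference, a small sketch of how that construction might look in numpy, assuming the 3x3 train_data above (np.indices and the tensor conversion are my additions, not the poster's exact code):

import numpy as np
import torch

train_data = np.array([[1, 2, 3],
                       [1, 2, 3],
                       [1, 2, 3]])

# Each (row, col) index becomes one 2-d training input ...
rows, cols = np.indices(train_data.shape)
train_x = torch.tensor(np.stack([rows.ravel(), cols.ravel()], axis=-1),
                       dtype=torch.float32)
# ... and the pixel value at that index is the training target
train_y = torch.tensor(train_data.ravel(), dtype=torch.float32)

print(train_x)  # [[0, 0], [0, 1], ..., [2, 2]]
print(train_y)  # [1., 2., 3., 1., 2., 3., 1., 2., 3.]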

Thanks again. I really didn't normalize the data; I will try it.
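
A minimal sketch of that normalization step, continuing from the train_x / train_y tensors above (standardizing both inputs and targets with training-set statistics; the _norm names are mine):

import torch

# Standardize with statistics computed on the training set only
x_mean, x_std = train_x.mean(dim=0), train_x.std(dim=0)
y_mean, y_std = train_y.mean(), train_y.std()

train_x_norm = (train_x - x_mean) / x_std
train_y_norm = (train_y - y_mean) / y_std
test_x_norm = (test_x - x_mean) / x_std  # use the *training* statistics here too

# After training on (train_x_norm, train_y_norm), undo the target scaling:
# pred = likelihood(model(test_x_norm))
# pred_mean = pred.mean * y_std + y_mean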