Open RichGraham opened 5 years ago
Since upgrading GPy from 1.4 to 1.9 I'm getting much less effective hyperparameter optimisation. This often gives interpolation errors that are roughly a factor of 10 larger in the newer versions of GPy. I've put a minimal working example below, including the output from the two GPy versions. I'd be very grateful for any help.
========Code========

```python
import numpy as np
import GPy

def f(x):
    return 0.1*np.sqrt(x) + 1

def RMSE(model, grid, ygrid):
    mu, V = model.predict(grid)
    Sqerr = np.power(ygrid - mu, 2)
    MSE = np.sum(Sqerr)
    rmse = np.sqrt(MSE/ygrid.size)
    print('RMSE = ' + str(rmse))
    print('Log likelihood is ' + str(model.log_likelihood()))
    return rmse
```
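(As a quick sanity check independent of GPy, the RMSE helper above reduces to the usual vectorized one-liner; `rmse` is just an illustrative name here, not part of the example:)

```python
import numpy as np

def rmse(y_true, y_pred):
    # sqrt of the mean squared residual -- the same quantity RMSE() prints
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# residuals (0, 0, 2) -> mean square 4/3 -> rmse = 2/sqrt(3)
print(rmse(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 5.0])))
```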
Load in some test data
```python
testingSize = 1000
Xgrid = np.zeros(testingSize)
ygrid = np.zeros(testingSize)
for i in range(testingSize):
    Xgrid[i] = 1.0*i/testingSize
    ygrid[i] = f(Xgrid[i])
```
Load in some training data
```python
trainingSize = 100
X = np.zeros(trainingSize)
y = np.zeros(trainingSize)
for i in range(trainingSize):
    X[i] = 1.0*i/trainingSize
    y[i] = f(X[i])
y = y.reshape(len(y), 1)
X = X.reshape(len(X), 1)
```
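(Incidentally, the element-by-element loop above can be vectorized; an equivalent numpy sketch for the training set, producing the same column-shaped arrays:)

```python
import numpy as np

trainingSize = 100
# same grid as the loop: X[i] = i / trainingSize, already in column shape
X = (np.arange(trainingSize) / trainingSize).reshape(-1, 1)
y = 0.1 * np.sqrt(X) + 1  # f applied elementwise
```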
Train the GP and compute the root mean square error
```python
knew = GPy.kern.ExpQuad(1, ARD=True)
mnew = GPy.models.GPRegression(X, y, knew)
mnew.optimize_restarts(10)
print(mnew)
RMSE(mnew, X, y)
```
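(One way to localize where the two versions diverge, suggested here as a sketch rather than part of the report, is to recompute the training-set fit by hand from the fitted hyperparameters using the standard GP posterior mean, mu = K (K + noise·I)⁻¹ y, in plain numpy. The kernel values below are close to the 1.4 fit reported in the output; the two noise values are illustrative jitter levels, not the fitted ones:)

```python
import numpy as np

def expquad(A, B, variance, lengthscale):
    # ExpQuad / squared-exponential kernel matrix between column vectors A and B
    return variance * np.exp(-0.5 * (A - B.T) ** 2 / lengthscale ** 2)

def training_rmse(Xtr, ytr, variance, lengthscale, noise):
    # Standard GP regression posterior mean at the training inputs:
    #   mu = K (K + noise * I)^{-1} y
    K = expquad(Xtr, Xtr, variance, lengthscale)
    mu = K @ np.linalg.solve(K + noise * np.eye(len(Xtr)), ytr)
    return float(np.sqrt(np.mean((ytr - mu) ** 2)))

# Same training set as in the issue
Xtr = (np.arange(100) / 100.0).reshape(-1, 1)
ytr = 0.1 * np.sqrt(Xtr) + 1

# ExpQuad hyperparameters close to the 1.4 fit; noise values are illustrative
r_small = training_rmse(Xtr, ytr, 1.617, 0.0542, 1e-8)
r_large = training_rmse(Xtr, ytr, 1.617, 0.0542, 1e-4)
print(r_small, r_large)  # smaller noise -> tighter interpolation of the training data
```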
========Output========

GPy 1.4.0 with Python 2.7.10

```
Optimization restart 1/10, f = -608.740335311
Optimization restart 2/10, f = -608.742055054
Optimization restart 3/10, f = -608.689427591
Optimization restart 4/10, f = -608.741007936
Optimization restart 5/10, f = -608.741447642
Optimization restart 6/10, f = -608.742522759
Optimization restart 7/10, f = -608.741474543
Optimization restart 8/10, f = -608.741280041
Optimization restart 9/10, f = -608.741647001
Optimization restart 10/10, f = -608.740959838

Name : GP regression
Log-likelihood : 608.742522759
Number of Parameters : 3
Parameters:
  GP_regression.           |             Value  |  Constraint  |  Prior  |  Tied to
  ExpQuad.variance         |     1.61738570306  |         +ve  |         |
  ExpQuad.lengthscale      |   0.0541824078745  |         +ve  |         |
  Gaussian_noise.variance  |  9.4190186883e-11  |         +ve  |         |

RMSE = 7.05145634575e-06
Log likelihood is 608.742522759
```
GPy 1.9.6 with Python 2 or 3

```
Optimization restart 1/10, f = -600.667148107
Optimization restart 2/10, f = -600.667188436
Optimization restart 3/10, f = -600.667194655
Optimization restart 4/10, f = -600.66719339
Optimization restart 5/10, f = -600.667195027
Optimization restart 6/10, f = -600.667197124
Optimization restart 7/10, f = -600.667195882
Optimization restart 8/10, f = -600.667195138
Optimization restart 9/10, f = -600.667193562
Optimization restart 10/10, f = -600.667196754

Name : GP regression
Objective : -600.667197124
Number of Parameters : 3
Number of Optimization Parameters : 3
Updates : True
Parameters:
  GP_regression.           |                     value  |  constraints  |  priors
  ExpQuad.variance         |        1.1504980935370597  |          +ve  |
  ExpQuad.lengthscale      |       0.09527430198490645  |          +ve  |
  Gaussian_noise.variance  |  2.8363303666613326e-112   |          +ve  |

RMSE = 8.006162483973131e-05
Log likelihood is 600.6671971244359
```