Open RichGraham opened 5 years ago
Since upgrading GPy from 1.4 to 1.9 I'm getting much less effective hyperparameter optimisation. This often gives interpolation errors that are roughly a factor of 10 larger in the newer versions of GPy. I've put a minimal working example below, including the output from the two GPy versions. I'd be very grateful for any help.
========Code========

```python
import numpy as np
import GPy

def f(x):
    return 0.1*np.sqrt(x) + 1

def RMSE(model, grid, ygrid):
    mu, V = model.predict(grid)
    Sqerr = np.power(ygrid - mu, 2)
    MSE = np.sum(Sqerr)
    rmse = np.sqrt(MSE/ygrid.size)
    print('RMSE = ' + str(rmse))
    print('Log likelihood is ' + str(model.log_likelihood()))
    return rmse
```
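(As a quick sanity check independent of GPy, the RMSE helper above reduces to the usual vectorized one-liner; `rmse` is just an illustrative name here, not part of the example:)

```python
import numpy as np

def rmse(y_true, y_pred):
    # sqrt of the mean squared residual -- the same quantity RMSE() prints
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# residuals (0, 0, 2) -> mean square 4/3 -> rmse = 2/sqrt(3)
print(rmse(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 5.0])))
```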
Load in some test data
```python
testingSize = 1000
Xgrid = np.zeros(testingSize)
ygrid = np.zeros(testingSize)
for i in range(testingSize):
    Xgrid[i] = 1.0*i/testingSize
    ygrid[i] = f(Xgrid[i])
```
Load in some training data
```python
trainingSize = 100
X = np.zeros(trainingSize)
y = np.zeros(trainingSize)
for i in range(trainingSize):
    X[i] = 1.0*i/trainingSize
    y[i] = f(X[i])
y = y.reshape(len(y), 1)
X = X.reshape(len(X), 1)
```
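(Incidentally, the element-by-element loop above can be vectorized; an equivalent numpy sketch for the training set, producing the same column-shaped arrays:)

```python
import numpy as np

trainingSize = 100
# same grid as the loop: X[i] = i / trainingSize, already in column shape
X = (np.arange(trainingSize) / trainingSize).reshape(-1, 1)
y = 0.1 * np.sqrt(X) + 1  # f applied elementwise
```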
Train the GP and compute the root mean square error
```python
knew = GPy.kern.ExpQuad(1, ARD=True)
mnew = GPy.models.GPRegression(X, y, knew)
mnew.optimize_restarts(10)
print(mnew)
RMSE(mnew, X, y)
```
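(One way to localize where the two versions diverge, suggested here as a sketch rather than part of the report, is to recompute the training-set fit by hand from the fitted hyperparameters using the standard GP posterior mean, mu = K (K + noise·I)⁻¹ y, in plain numpy. The kernel values below are close to the 1.4 fit reported in the output; the two noise values are illustrative jitter levels, not the fitted ones:)

```python
import numpy as np

def expquad(A, B, variance, lengthscale):
    # ExpQuad / squared-exponential kernel matrix between column vectors A and B
    return variance * np.exp(-0.5 * (A - B.T) ** 2 / lengthscale ** 2)

def training_rmse(Xtr, ytr, variance, lengthscale, noise):
    # Standard GP regression posterior mean at the training inputs:
    #   mu = K (K + noise * I)^{-1} y
    K = expquad(Xtr, Xtr, variance, lengthscale)
    mu = K @ np.linalg.solve(K + noise * np.eye(len(Xtr)), ytr)
    return float(np.sqrt(np.mean((ytr - mu) ** 2)))

# Same training set as in the issue
Xtr = (np.arange(100) / 100.0).reshape(-1, 1)
ytr = 0.1 * np.sqrt(Xtr) + 1

# ExpQuad hyperparameters close to the 1.4 fit; noise values are illustrative
r_small = training_rmse(Xtr, ytr, 1.617, 0.0542, 1e-8)
r_large = training_rmse(Xtr, ytr, 1.617, 0.0542, 1e-4)
print(r_small, r_large)  # smaller noise -> tighter interpolation of the training data
```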
========Output========

GPy 1.4.0 with Python 2.7.10

```
Optimization restart 1/10, f = -608.740335311
Optimization restart 2/10, f = -608.742055054
Optimization restart 3/10, f = -608.689427591
Optimization restart 4/10, f = -608.741007936
Optimization restart 5/10, f = -608.741447642
Optimization restart 6/10, f = -608.742522759
Optimization restart 7/10, f = -608.741474543
Optimization restart 8/10, f = -608.741280041
Optimization restart 9/10, f = -608.741647001
Optimization restart 10/10, f = -608.740959838

Name : GP regression
Log-likelihood : 608.742522759
Number of Parameters : 3
Parameters:
  GP_regression.           |             Value  |  Constraint  |  Prior  |  Tied to
  ExpQuad.variance         |     1.61738570306  |         +ve  |         |
  ExpQuad.lengthscale      |   0.0541824078745  |         +ve  |         |
  Gaussian_noise.variance  |  9.4190186883e-11  |         +ve  |         |

RMSE = 7.05145634575e-06
Log likelihood is 608.742522759
```
GPy 1.9.6 with Python 2 or 3

```
Optimization restart 1/10, f = -600.667148107
Optimization restart 2/10, f = -600.667188436
Optimization restart 3/10, f = -600.667194655
Optimization restart 4/10, f = -600.66719339
Optimization restart 5/10, f = -600.667195027
Optimization restart 6/10, f = -600.667197124
Optimization restart 7/10, f = -600.667195882
Optimization restart 8/10, f = -600.667195138
Optimization restart 9/10, f = -600.667193562
Optimization restart 10/10, f = -600.667196754

Name : GP regression
Objective : -600.667197124
Number of Parameters : 3
Number of Optimization Parameters : 3
Updates : True
Parameters:
  GP_regression.           |                     value  |  constraints  |  priors
  ExpQuad.variance         |        1.1504980935370597  |          +ve  |
  ExpQuad.lengthscale      |       0.09527430198490645  |          +ve  |
  Gaussian_noise.variance  |  2.8363303666613326e-112   |          +ve  |

RMSE = 8.006162483973131e-05
Log likelihood is 600.6671971244359
```