[Open] Gengfu-He opened this issue 1 year ago
kernel_train = kernel_fn(train_xs, train_xs, 'nngp')
kernel_cov = kernel_fn(train_xs, test_xs, 'nngp')
Kff_inv = np.linalg.inv(kernel_train + noise_scale * noise_scale * np.mean(np.trace(kernel_train)) * np.eye(len(train_xs)))
mean_predict_analytical = kernel_cov.T.dot(Kff_inv).dot(train_ys)

I have recently found that the analytical results above do not agree well with the predictions of nt.predict.gp_inference, used as follows:

predict_fn = nt.predict.gp_inference(kernel_train, train_ys, diag_reg=noise_scale * noise_scale)
k_test_test = kernel_fn(test_xs, None, 'nngp')
mean_predict_nngp, covariance = predict_fn('nngp', kernel_cov.T, k_test_test)

What is the problem? I am not sure whether the trace_axes argument has some influence here.
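For concreteness, here is a minimal, self-contained sketch of the comparison being described. The toy data, the stax network, and the value of noise_scale are placeholder assumptions standing in for the original setup, not the actual experiment:

import jax.numpy as np
from jax import random
import neural_tangents as nt
from neural_tangents import stax

# Placeholder data and network (assumptions, not the original setup).
train_xs = random.normal(random.PRNGKey(0), (10, 3))
test_xs = random.normal(random.PRNGKey(1), (5, 3))
train_ys = random.normal(random.PRNGKey(2), (10, 1))
noise_scale = 1e-2

_, _, kernel_fn = stax.serial(stax.Dense(512), stax.Relu(), stax.Dense(1))

kernel_train = kernel_fn(train_xs, train_xs, 'nngp')
kernel_cov = kernel_fn(train_xs, test_xs, 'nngp')

# Analytical posterior mean, as in the snippet above.
reg = noise_scale * noise_scale * np.mean(np.trace(kernel_train))
Kff_inv = np.linalg.inv(kernel_train + reg * np.eye(len(train_xs)))
mean_predict_analytical = kernel_cov.T.dot(Kff_inv).dot(train_ys)

# Library prediction, using the same call as in the question.
predict_fn = nt.predict.gp_inference(kernel_train, train_ys, diag_reg=noise_scale * noise_scale)
k_test_test = kernel_fn(test_xs, None, 'nngp')
mean_predict_nngp, covariance = predict_fn('nngp', kernel_cov.T, k_test_test)

# Report the largest elementwise gap between the two posterior means.
print(np.max(np.abs(mean_predict_analytical - mean_predict_nngp)))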
Mathematically I think we're doing what you wrote, but we implement it with Cholesky factorization, so instead of

mean_predict_analytical = kernel_cov.T.dot(Kff_inv).dot(train_ys)

we do something like

import jax.scipy as sp

c = sp.linalg.cho_factor(kernel_train + noise_scale * noise_scale * np.mean(np.trace(kernel_train)) * np.eye(len(train_xs)))
Kff_inv_dot_train_ys = sp.linalg.cho_solve(c, train_ys)
mean_predict_analytical = kernel_cov.T.dot(Kff_inv_dot_train_ys)

This could give slightly different results from np.linalg.inv, but is faster. Could this explain the difference, or do you get a huge discrepancy?
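To illustrate the point about the solver, here is a small self-contained check; the random SPD matrix stands in for the regularized train kernel (an assumption, not the original data) and shows that cho_factor/cho_solve and an explicit inverse agree up to floating-point error:

import jax.numpy as np
import jax.scipy as sp
from jax import random

# Random SPD matrix standing in for the regularized train kernel (assumption).
A = random.normal(random.PRNGKey(0), (10, 10))
K = A @ A.T + 10.0 * np.eye(10)
b = random.normal(random.PRNGKey(1), (10, 1))

# Explicit inverse, as in the analytical snippet.
x_inv = np.linalg.inv(K).dot(b)

# Cholesky factor-and-solve; cho_factor returns the (factor, lower) pair
# that cho_solve expects as its first argument.
c = sp.linalg.cho_factor(K)
x_chol = sp.linalg.cho_solve(c, b)

# In float32 the gap should be on the order of 1e-5 or smaller,
# far from a huge discrepancy.
print(np.max(np.abs(x_inv - x_chol)))

The Cholesky route is also cheaper: one factorization plus two triangular solves, instead of forming a full inverse and multiplying by it.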