Open stevenstetzler opened 2 years ago
I don't think this is a bug in GPyTorch. It also seems like the issue is that your variance is collapsing to zero (minimum eigenvalue is similar to maximum eigenvalue). This would cause the numerical issues that you're seeing.
🐛 Bug
Using Deep Kernel Learning with the InducingPointKernel produces a non-positive-definite covariance matrix on test data (not the training data). The issue appears after the DNN weights and GP hyperparameters are trained on the training data. The matrix eventually becomes non-symmetric, and then at least one of the eigenvalues becomes negative. The asymmetry and the negative eigenvalues are larger in magnitude than I would expect from numerical error at double precision (1e-7 to 1e-5, while the jitter added to make matrices PD during training is ~1e-8). The likelihood learns a noise term of ~1e-5, which is added to the diagonal of the test data covariance matrix. We additionally include a diagonal term for the observational noise, which is ~1e-5 and which we expected to fix any numerical instability when constructing the covariance matrix.

Does this still seem to be due to numerical instabilities, or is there some property of the model or data that could cause an asymmetric/non-PD covariance matrix? Can we not expect to achieve a predictive variance smaller than 1e-5? Are there settings to tweak that could improve stability during prediction?
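For anyone trying to quantify the symptoms described above, here is a minimal sketch of how the asymmetry and the extreme eigenvalues of a dense covariance matrix could be measured. It uses NumPy on a toy matrix rather than the actual GPyTorch output; the function name covariance_diagnostics and the toy data are made up for illustration and are not part of the linked script.

```python
import numpy as np


def covariance_diagnostics(K):
    """Measure how far a dense matrix is from being symmetric and PD.

    Hypothetical helper for illustration; K is assumed to be a square
    array, e.g. a test covariance converted to a dense NumPy array.
    """
    K = np.asarray(K, dtype=np.float64)
    # Largest absolute difference between K and its transpose.
    max_asymmetry = float(np.max(np.abs(K - K.T)))
    # eigvalsh assumes a symmetric input, so symmetrize first.
    K_sym = 0.5 * (K + K.T)
    eigvals = np.linalg.eigvalsh(K_sym)  # sorted in ascending order
    return {
        "max_asymmetry": max_asymmetry,
        "min_eig": float(eigvals[0]),
        "max_eig": float(eigvals[-1]),
    }


# Toy data: a rank-3 matrix plus a 1e-5 diagonal "noise" term.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
K_good = A @ A.T + 1e-5 * np.eye(5)

# Perturb a single off-diagonal entry to mimic the reported asymmetry.
K_bad = K_good.copy()
K_bad[0, 1] += 1e-6

good_report = covariance_diagnostics(K_good)
bad_report = covariance_diagnostics(K_bad)
print(good_report)
print(bad_report)
```

Comparing max_asymmetry and min_eig against the learned noise level (~1e-5 here) shows directly whether the noise term is large enough to absorb the error.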
To reproduce
Code and data are available at: https://epyc.astro.washington.edu/~stevengs/gp/. Download the .pkl and .py files to the same directory and run python dkl_sgpr_example.py.
Here is the model:
Stack trace/error message
Here I print out the training log, the model parameters, and the covariance matrix constructed for the test data, both at the beginning and after the covariance matrix becomes asymmetric and has negative eigenvalues.
This is also the contents of dkl_sgpr_example.log.
Expected Behavior
I expect the covariance matrix produced to be positive-definite within the numerical precision available.
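One way to make "positive-definite within the numerical precision available" concrete is to compare the smallest eigenvalue against a tolerance scaled by the matrix size, the machine epsilon, and the largest eigenvalue magnitude. The sketch below does this in NumPy. The tolerance rule is an assumption (a common heuristic, similar in spirit to the default used for numerical rank), not something GPyTorch defines, and is_pd_within_precision is a hypothetical name.

```python
import numpy as np


def is_pd_within_precision(K):
    """Check whether K is PD up to double-precision rounding error.

    Assumed heuristic: a negative eigenvalue is tolerated only if its
    magnitude is below n * eps * (largest eigenvalue magnitude).
    Returns (ok, min_eig, tol).
    """
    K = np.asarray(K, dtype=np.float64)
    K_sym = 0.5 * (K + K.T)  # eigvalsh expects a symmetric matrix
    eigvals = np.linalg.eigvalsh(K_sym)  # ascending order
    scale = max(abs(eigvals[0]), abs(eigvals[-1]))
    tol = K.shape[0] * np.finfo(np.float64).eps * scale
    return bool(eigvals[0] > -tol), float(eigvals[0]), float(tol)


# A small positive eigenvalue (1e-5) passes.
ok_small, _, _ = is_pd_within_precision(np.diag([1.0, 1e-5]))

# A negative eigenvalue of -1e-6 is far outside rounding error and fails.
ok_negative, min_eig, tol = is_pd_within_precision(np.diag([1.0, -1e-6]))

# A negative eigenvalue of -1e-17 is indistinguishable from zero and passes.
ok_rounding, _, _ = is_pd_within_precision(np.diag([1.0, -1e-17]))

print(ok_small, ok_negative, ok_rounding)
```

Under this rule, negative eigenvalues in the 1e-7 to 1e-5 range for a matrix with entries of order one would not count as rounding error at double precision, which is the concern raised in this report.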
System information
Please complete the following information:
python --version: Python 3.9.12
gpytorch.__version__: 1.8.0
torch.__version__: 1.12.0+cu102
cat /etc/os-release | grep "PRETTY_NAME": PRETTY_NAME="CentOS Linux 7 (Core)"
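To compare environments when trying to reproduce this, the same version information can be gathered in one step. This is a small sketch using only the Python standard library; collect_versions is a hypothetical helper, and packages that are not installed are reported as such instead of raising.

```python
import platform
from importlib import metadata


def collect_versions(packages=("gpytorch", "torch")):
    """Return the Python version and installed versions of the given packages."""
    info = {"python": platform.python_version()}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages rather than failing.
            info[name] = "not installed"
    return info


versions = collect_versions()
for name, version in versions.items():
    print(f"{name}: {version}")
```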