NIEHS / PrestoGP

Penalized Regression on Spatiotemporal Outcomes using Gaussian Processes a.k.a. PrestoGP
https://niehs.github.io/PrestoGP/

Duplicated locs can cause numeric issues #47

Closed: ericbair-sciome closed this issue 4 months ago

ericbair-sciome commented 5 months ago

During some recent testing, I noticed that the GP likelihood optimization procedure seems to run into problems when there are duplicate locs. Duplicates are likely to be a common occurrence in multivariate models, since two outcomes are often measured at the same location. I kept getting warnings that the matrices were numerically singular, and usually the procedure would eventually crash.

The obvious solution seems to be to add a small amount of noise to the locs to ensure that there are no duplicates. I tried that and it seems to fix the issue. We are already doing this when picking the nearest neighbors (and GPvecchia does the same thing in that case). I am inclined to add the noise at the beginning of vecchia_Mspecify and replace the original locs with the "noised up" version for the rest of the model fitting procedure. Let me know if anyone sees a problem with that approach.
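For illustration only, here is a minimal sketch of the idea in plain Python. It is not the PrestoGP code, which is written in R. The helper names `jitter_locs` and `exp_cov_det`, the noise scale of `1e-6`, the uniform noise distribution, and the exponential covariance function are all assumptions made for this example. It shows that two identical locations give a covariance matrix with zero determinant, and that a small perturbation of the locs makes the determinant positive.

```python
import math
import random


def jitter_locs(locs, scale=1e-6, seed=0):
    """Return a copy of locs with small uniform noise added to every coordinate.

    Hypothetical helper: the scale and the noise distribution are assumptions.
    """
    rng = random.Random(seed)
    return [tuple(x + rng.uniform(-scale, scale) for x in loc) for loc in locs]


def exp_cov_det(a, b, range_=1.0):
    """Determinant of the 2x2 exponential covariance matrix for points a and b.

    The matrix is [[1, c], [c, 1]] with c = exp(-distance / range_),
    so its determinant is 1 - c**2.
    """
    c = math.exp(-math.dist(a, b) / range_)
    return 1.0 - c * c


# Two outcomes measured at the same location produce a duplicated loc.
locs = [(0.0, 0.0), (1.0, 2.0), (1.0, 2.0)]
noisy = jitter_locs(locs)

print(len(set(locs)), len(set(noisy)))            # number of distinct locs before and after
print(exp_cov_det(locs[1], locs[2]))              # duplicated pair: singular matrix
print(exp_cov_det(noisy[1], noisy[2]) > 0.0)      # jittered pair: no longer exactly singular
```

The noise scale is a trade-off: it has to be large enough to separate the duplicates in floating point, but small relative to the spacing between distinct locs so that the fitted model is essentially unchanged.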

kyle-messier commented 5 months ago

@ericbair-sciome Indeed, there are likely to be duplicate x,y locs with multivariate data. This seems like a simple and easy fix to me. Thanks!

ericbair-sciome commented 4 months ago

This is fixed in the latest release.