NIEHS / PrestoGP

Penalized Regression on Spatiotemporal Outcomes using Gaussian Processes a.k.a. PrestoGP
https://niehs.github.io/PrestoGP/

sparseNN produces incorrect nearest neighbors in some cases #38

Closed ericbair-sciome closed 5 months ago

ericbair-sciome commented 7 months ago

I noticed that the nearest neighbors produced by sparseNN were different from the nearest neighbors produced by GPvecchia in some cases. I wrote a new function of my own to find nearest neighbors. It gave the same output as GPvecchia. That suggests that the nearest neighbors produced by sparseNN are sometimes wrong.
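For readers who want to reproduce this kind of check, here is a minimal brute-force reference, sketched in Python with NumPy. It is not the function described above, and it is not the sparseNN or GPvecchia code. It assumes the usual Vecchia setting, in which each point's neighbors are drawn only from points earlier in the ordering. The name `ordered_nn_brute_force`, the self-index-first layout, and the `-1` padding are all illustrative choices.

```python
import numpy as np


def ordered_nn_brute_force(locs, m):
    """Return an (n, m + 1) integer array of ordered nearest neighbors.

    Row i holds i itself, followed by the indices of its (up to) m nearest
    points among 0..i-1. Slots with no available neighbor are filled with -1.
    This is an O(n^2) reference meant for testing, not for large data sets.
    """
    locs = np.asarray(locs, dtype=float)
    n = locs.shape[0]
    nn = np.full((n, m + 1), -1, dtype=int)
    for i in range(n):
        # Euclidean distance from point i to itself and to all earlier points.
        d = np.linalg.norm(locs[: i + 1] - locs[i], axis=1)
        # Force the point itself into the first slot, even if an earlier
        # point is an exact duplicate at distance zero.
        d[i] = -1.0
        # A stable sort breaks distance ties in favor of the lower index,
        # so the output is deterministic without adding noise.
        order = np.argsort(d, kind="stable")
        k = min(m + 1, i + 1)
        nn[i, :k] = order[:k]
    return nn


if __name__ == "__main__":
    pts = [[0.0, 0.0], [1.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
    print(ordered_nn_brute_force(pts, m=2))
```

Because every pairwise distance is computed, this version cannot miss a neighbor, which makes it a useful baseline for checking a faster search on small inputs.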

kyle-messier commented 7 months ago

@ericbair-sciome @brian-bk22 @sciome-bot Does this need to be discussed? I remember there being a workaround in GPvecchia to avoid randomness when determining the nearest-neighbor set.

ericbair-sciome commented 7 months ago

No, this is a separate issue. I had previously noticed some small differences between the nearest neighbors produced by sparseNN and GPvecchia, and I assumed they were due to the artificial noise that GPvecchia adds to the data to avoid ties. After more careful testing, I found that sparseNN simply misses some neighbors in unusual situations. Luckily, this is a rare bug that doesn't seem to meaningfully change any of our output, and I have already fixed it. @sciome-bot is reviewing my changes. I'll create a pull request (and close this issue) once the latest version is on GitHub.
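One way to separate tie-related differences from genuinely missed neighbors is to compare the two outputs row by row as unordered sets, so that a different ordering of the same neighbors is not reported. The sketch below assumes both neighbor arrays use the same index base and mark empty slots with `-1`. The helper name `mismatched_rows` is hypothetical and is not part of PrestoGP or GPvecchia.

```python
import numpy as np


def mismatched_rows(nn_a, nn_b, missing=-1):
    """Return the row indices where two neighbor arrays contain different
    neighbor sets, ignoring the order of neighbors within a row."""
    nn_a = np.asarray(nn_a)
    nn_b = np.asarray(nn_b)
    if nn_a.shape[0] != nn_b.shape[0]:
        raise ValueError("neighbor arrays must have the same number of rows")
    bad = []
    for i in range(nn_a.shape[0]):
        # Drop padding entries before comparing the remaining indices as sets.
        set_a = set(nn_a[i][nn_a[i] != missing].tolist())
        set_b = set(nn_b[i][nn_b[i] != missing].tolist())
        if set_a != set_b:
            bad.append(i)
    return bad


if __name__ == "__main__":
    reference = [[0, -1, -1], [1, 0, -1], [2, 0, 1]]
    candidate = [[0, -1, -1], [1, 0, -1], [2, 0, -1]]  # last row lacks neighbor 1
    print(mismatched_rows(reference, candidate))
```

A set comparison can still flag a row when two candidates are exactly tied for the last neighbor slot, so any flagged rows are worth inspecting by distance before concluding that a neighbor was missed.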