NIEHS / PrestoGP

Penalized Regression on Spatiotemporal Outcomes using Gaussian Processes a.k.a. PrestoGP
https://niehs.github.io/PrestoGP/

Intercept not calculated when estimating yhat's #54

Open ericbair-sciome opened 3 months ago

ericbair-sciome commented 3 months ago

When the X's and y's are transformed to iid (using the transform_iid/transform_miid functions), the transformed y's have mean 0, so the intercept term in the glmnet model is always (approximately) 0. However, if we apply the same regression coefficients to the original (untransformed) data to calculate the yhats, we need to add the Vecchia mean to the resulting yhats to compensate for the transformed y's having mean 0. This is currently not being done.
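The effect described above can be illustrated with a toy analog. This is only a sketch under stated assumptions: plain mean-centering stands in for the iid transform, ordinary least squares stands in for glmnet, and the helper names `mean` and `fit_slope_no_intercept` are hypothetical. It is not PrestoGP's code and does not compute a Vecchia mean; it only shows why coefficients fitted on mean-0 data need an offset added back when they are applied to the original data.

```python
# Toy analog of the issue: coefficients fitted on mean-0 data give biased
# predictions on the original scale unless an offset is added back.
# Assumption: plain centering stands in for the iid transform.

def mean(values):
    return sum(values) / len(values)

def fit_slope_no_intercept(x, y):
    # Least-squares slope for y ~ b * x with the intercept fixed at 0.
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [10.0 + 2.0 * xi for xi in x]  # true intercept 10, true slope 2

# "Transformed" data: both x and y have mean 0, so no intercept is needed.
x_c = [xi - mean(x) for xi in x]
y_c = [yi - mean(y) for yi in y]
slope = fit_slope_no_intercept(x_c, y_c)

# Naive prediction: reuse the slope on the original x with no offset.
yhat_naive = [slope * xi for xi in x]

# Corrected prediction: add back the offset lost by centering.
offset = mean(y) - slope * mean(x)
yhat_fixed = [slope * xi + offset for xi in x]

naive_errors = [yi - yh for yi, yh in zip(y, yhat_naive)]
fixed_errors = [yi - yh for yi, yh in zip(y, yhat_fixed)]
```

In this toy case every naive prediction is off by the same constant, which is the role the missing mean correction plays in the issue.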

ericbair-sciome commented 2 months ago

My comment above did not do a great job of explaining the issue. The bottom line is that older versions of PrestoGP had problems with multivariate models when the means of all outcomes were nonzero and different from one another. I made two changes to deal with this: 1) By default, all outcomes are now mean-centered before the model is fit. (Aside from the issue mentioned earlier, this also makes the model fitting procedure location invariant with respect to the y's, which seems like a good thing.) A user can override this option if they so choose. 2) For multivariate models, I added a dummy-variable covariate for every outcome except the first. These two changes seem to fix the issue, so I will close it with the current release.
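The two changes can be sketched roughly as follows. This is an illustration only, not PrestoGP's implementation: the function name `stack_outcomes`, the `center_y` flag, and the layout of the dummy columns are all assumptions made for the example.

```python
# Sketch of the two changes: per-outcome mean centering (optional) and one
# dummy column for every outcome except the first.
# Hypothetical helper names and layout; not taken from PrestoGP.

def center(values):
    m = sum(values) / len(values)
    return [v - m for v in values], m

def stack_outcomes(ys, center_y=True):
    """Stack outcome vectors and build dummy columns.

    Returns (stacked_y, means, dummies). Each row of dummies has one entry
    per outcome after the first; the entry is 1.0 when the row belongs to
    that outcome and 0.0 otherwise. Rows of the first outcome are all 0.0.
    """
    stacked, means, dummies = [], [], []
    for k, y in enumerate(ys):
        if center_y:
            y_used, m = center(y)
        else:
            y_used, m = list(y), 0.0  # centering overridden by the caller
        means.append(m)
        for value in y_used:
            stacked.append(value)
            dummies.append([1.0 if k == j else 0.0 for j in range(1, len(ys))])
    return stacked, means, dummies

y1 = [1.0, 2.0, 3.0]   # mean 2
y2 = [101.0, 103.0]    # mean 102, very different from y1

stacked, means, dummies = stack_outcomes([y1, y2])
raw_stacked, raw_means, _ = stack_outcomes([y1, y2], center_y=False)
```

With centering on, both outcomes enter the stacked response with mean 0 and the stored means can be added back to predictions afterwards. The dummy column gives every outcome after the first its own level shift relative to the first.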