Support mgcv::gam and scam models?

tomwenseleers commented 6 years ago

Would it be possible by any chance for investr to also support scam (https://cran.r-project.org/web/packages/scam/index.html) and mgcv::gam (https://cran.r-project.org/web/packages/mgcv/index.html) models? I noticed that a constrained monotone scam spline works very nicely for me to do calibration curves, but then of course I would also need SEs on the inverse predictions.

Maybe relevant for calculating confidence intervals: https://stats.stackexchange.com/questions/33327/confidence-interval-for-gam-model/33328

For prediction intervals for GAMs (scam inherits from gam so recipe for both should be the same) I found this on public threads: https://stat.ethz.ch/pipermail/r-help/2011-April/275632.html https://stackoverflow.com/questions/18909234/coding-a-prediction-interval-from-a-generalized-additive-model-with-a-very-large

bgreenwell commented 6 years ago

@tomwenseleers I actually wrote some code for someone else a little while back for SCAMs. I'll try to find it and send to you so that you can get started and then hopefully push an update to the package in the next couple weeks. Initially, many years ago, it was my intention to extend the package to support semiparametric models with monotonicity constraints, so this is good motivation to get it done!

tomwenseleers commented 6 years ago

Ha that's great - yes please send it over (you can send it to tom.wenseleers@kuleuven.be)! My scam models also use the weights argument btw to take into account unequal variances...

tomwenseleers commented 6 years ago

Just realised that the scam::predict and mgcv:predict functions also return the SE on the predictions based on the posterior distribution of the parameters if you set se.fit=TRUE - maybe the easiest starting point to calculate confidence intervals?

jphill01 commented 6 years ago

@tomwenseleers I believe the person @bgreenwell speaks of is me. We've been corresponding on my interest in employing inverse estimation for GAMs/SCAMs, though it hasn't been without challenges.

investr employs the bisection method in computing the inverse estimate. For my problem, bisection doesn't appear to suffice, due to issue in computing the inverse estimate along with confidence intervals. So, I have decided to use bootstrapping along with GAM/SCAM code for inverse estimation in order to compute estimates of bias and standard error. But, bootstrapping semiparametric models is VERY slow in my case without parallelization. A promising alternative from the literature (according to Simon Wood, the creator of mgcv) seems to be posterior simulation from a Multivariate Normal distribution.

tomwenseleers commented 6 years ago

Was wondering if the inbuilt scam or gam se.fit=TRUE in predict together with using the bisection method for the inverse estimate wouldn't be good enough though? Something like

model.fit = some scam or gam model
preds = data.frame(x = x, predict(model.fit, newdata=list(x = x), 
                                         se=TRUE, weights = weights))
  k <- 2*qt(p=0.95, df=model.fit$df.residual, lower.tail = TRUE) 
  confint <- k*preds$se.fit
  k <- qt(p=0.975, df=model.fit$df.residual)
  predint <- k*sqrt(preds$se.fit^2+weights*model.fit$sig2) 
  preds <- data.frame(x = preds$x, 
                     fit=preds$fit, 
                     confint.lwr=preds$fit-confint, 
                     confint.upr=preds$fit+confint,
                     predint.lwr=preds$fit-predint,
                     predint.upr=preds$fit+predint)

And then just use the bisection method for the inverse estimation?

bgreenwell commented 6 years ago

For scams, the bisection method will work just fine since it forces the fitted mean response to be either monotonically increasing or decreasing. The issue with the bisection method comes when you are trying to bootstrap or the fit is not monotonic over the predictor range of interest (same issue with asymptotes as well). I'll put something together soon!

bgreenwell / investr

Support mgcv::gam and scam models? #37