boost-R / gamboostLSS

Boosting models for fitting generalized additive models for location, scale and shape (GAMLSS) to potentially high-dimensional data. The current release version can be found on CRAN (https://cran.r-project.org/package=gamboostLSS).

stabsel.mboostLSS does not work on FDboost with scalar response #51

Closed fabian-s closed 5 years ago

fabian-s commented 5 years ago

This fails because the following check assumes a functional response model, I think:

https://github.com/boost-R/gamboostLSS/blob/f97c8906fc9d78e05cb28823789b12cd31c06fdc/R/methods.R#L399

sbrockhaus commented 5 years ago

To try this out, I fitted a scalar-response model with FDboost() and ran stabsel() on it; it worked fine:

library(FDboost)

## simulate a small data set 
set.seed(230)
dat <- list()
dat$X1 <- scale(matrix(rnorm(25*21), nrow = 25), scale = FALSE)
dat$X2 <- scale(matrix(rnorm(25*21), nrow = 25), scale = FALSE)
dat$X3 <- scale(matrix(rnorm(25*21), nrow = 25), scale = FALSE)
dat$X4 <- scale(matrix(rnorm(25*21), nrow = 25), scale = FALSE)
dat$X5 <- scale(matrix(rnorm(25*21), nrow = 25), scale = FALSE)

dat$svals <- seq(0, 1, length.out = 21)

dat$Y_scalar <- 2 + 0.1 * rowSums(dat$X1) + 0.01 * rowSums(dat$X2)

## fit a FDboost model with scalar response and 6 base-learners 
mod <- FDboost(Y_scalar ~ 1 
               + bsignal(X1, svals, knots = 6, df = 3)
               + bsignal(X2, svals, knots = 6, df = 3) 
               + bsignal(X3, svals, knots = 6, df = 3)
               + bsignal(X4, svals, knots = 6, df = 3)
               + bsignal(X5, svals, knots = 6, df = 3), 
               timeformula = NULL, 
               control = boost_control(mstop = 50), data = dat)
class(mod)

## prepare for stability selection 

### use PFER = 0.1*(number of base-learners)
### use cutoff = 0.9 (rather high cutoff)
cat("cutoff:", 0.9)
cat(" PFER:", 0.1*length(mod$baselearner), "\n")
print(stabsel_parameters(p = length(mod$baselearner), 
                         cutoff = 0.9, PFER = 0.1*length(mod$baselearner)))

# do stability selection
set.seed(2581)  
stab1 <- stabsel(mod, cutoff=0.9, PFER=0.1*length(mod$baselearner), 
                 sampling.type="SS", mc.cores=1)

sbrockhaus commented 5 years ago

Ah, sorry, I misunderstood. You mean models fitted with FDboostLSS(), not models fitted with FDboost(). I will have a look...

sbrockhaus commented 5 years ago

As you wrote, the problem is that the scalar response is treated as a functional response, so stabsel() does not work. After some further testing I realized that, in the current version, stabsel() also fails for functional responses unless you provide the folds argument. I have fixed stabsel() for FDboostLSS models with scalar and functional responses. After finishing some checks, I will upload the adapted code.
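
For anyone who needs to pass the folds argument explicitly in the meantime, here is a minimal sketch of building subsampling folds by hand in base R. It assumes that, for a scalar response, folds is a 0/1 weight matrix with one row per observation and one column per subsample, as for ordinary mboost models; the required layout for functional responses may differ and is not covered here. The object name mod_lss and the PFER value are placeholders, not taken from this thread, and the stabsel() call is left commented out because no FDboostLSS model is fitted above. The stabs package also provides subsample() for creating such a matrix.

```r
## Sketch only: build subsampling folds by hand (base R, no packages needed).
## Assumption: for a scalar response, folds is a 0/1 weight matrix with one
## row per observation and one column per subsample.
n <- 25   # number of observations, as in the simulated data above
B <- 50   # number of subsamples

set.seed(2581)
half <- floor(n / 2)
## each column marks a random half of the observations with weight 1
folds <- replicate(B, sample(rep(c(0, 1), times = c(n - half, half))))

dim(folds)             # n rows, B columns
table(colSums(folds))  # every subsample contains floor(n / 2) observations

## Hypothetical call, not run here: `mod_lss` stands for a model fitted with
## FDboostLSS(); the PFER value is a placeholder.
# stab <- stabsel(mod_lss, cutoff = 0.9, PFER = 0.6,
#                 folds = folds, sampling.type = "SS")
```

Building the matrix explicitly makes it easy to check its dimensions against the number of observations before handing it to stabsel().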