boost-R / gamboostLSS

Boosting models for fitting generalized additive models for location, shape and scale (GAMLSS) to potentially high-dimensional data. The current release version can be found on CRAN (https://cran.r-project.org/package=gamboostLSS).

Model reduction broken #8

Closed ja-thomas closed 8 years ago

ja-thomas commented 8 years ago

Model reduction yields different coefficients: if mstop is reduced and then set back to its earlier value, the coefficients no longer match those of the original fit.

MWE:

require("gamboostLSS")

###negbin dist, linear###

set.seed(2611)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
x3 <- rnorm(1000)
x4 <- rnorm(1000)
x5 <- rnorm(1000)
x6 <- rnorm(1000)
mu    <- exp(1.5 + x1^2 +0.5 * x2 - 3 * sin(x3) -1 * x4)
sigma <- exp(-0.2 * x4 +0.2 * x5 +0.4 * x6)
y <- numeric(1000)
for (i in 1:1000)
  y[i] <- rnbinom(1, size = sigma[i], mu = mu[i])
dat <- data.frame(x1, x2, x3, x4, x5, x6, y)

for (i in 40:100) { 
  mod <- gamboostLSS(y ~ . , data = dat, families = NBinomialLSS(),
                     control = boost_control(mstop = i, trace  = FALSE), 
                     method = "outer", baselearner = "bols")

  cm <- coef(mod)

  mstop(mod) <- 5
  mstop(mod) <- i

  cat(i, all.equal.list(coef(mod), cm), "\n")
}
ja-thomas commented 8 years ago

More code

mod <- glmboostLSS(y ~ . , data = dat, families = NBinomialLSS(),
                   control = boost_control(mstop = 100, trace  = FALSE), 
                   method = "outer")

mod2 <- glmboostLSS(y ~ . , data = dat, families = NBinomialLSS(),
                   control = boost_control(mstop = 100, trace  = FALSE), 
                   method = "outer")

mstop(mod) <- 10
mstop(mod) <- 100

all.equal.list(coef(mod), coef(mod2))
all.equal.environment(environment(mod$mu$subset), environment(mod2$mu$subset), check.attributes = FALSE)
all.equal.environment(environment(mod$sigma$subset), environment(mod2$sigma$subset), check.attributes = FALSE)
ja-thomas commented 8 years ago

Possible issue:

There is no issue if we start from a model with one or more sigma updates and don't reduce below the point where sigma is updated, or if we start from a model without sigma updates and reduce it.
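
One way to narrow this down, as a rough sketch: scan reduction targets from high to low and report the first one whose round trip (reduce, then restore) changes the coefficients. The names `first_failing_target`, `fit` and `set_stop` are hypothetical and not part of gamboostLSS; the sketch relies only on `coef()`, `all.equal()` and the `mstop(mod) <- i` replacement already used in the MWE.

```r
# Sketch only: hypothetical helper, not part of gamboostLSS.
# fit:      function() returning a freshly fitted model
# set_stop: function(model, m) that sets the number of iterations to m
#           and returns the model
# full:     number of iterations of the original fit
# targets:  reduction targets to try, in the order given
first_failing_target <- function(fit, set_stop, full, targets) {
  for (low in targets) {
    mod <- fit()                 # refit so every trial starts from a clean model
    ref <- coef(mod)             # coefficients before the round trip
    mod <- set_stop(mod, low)    # reduce the model
    mod <- set_stop(mod, full)   # set it back to the earlier value
    if (!isTRUE(all.equal(ref, coef(mod)))) {
      return(low)                # first target whose round trip changes coef()
    }
  }
  NA_integer_                    # no target changed the coefficients
}

# Possible use with the objects from the MWE (assumes `dat` exists and the
# fitting call from the second comment works as posted):
# fit <- function() glmboostLSS(y ~ . , data = dat, families = NBinomialLSS(),
#                               control = boost_control(mstop = 100, trace = FALSE),
#                               method = "outer")
# set_stop <- function(mod, m) { mstop(mod) <- m; mod }
# first_failing_target(fit, set_stop, full = 100, targets = 99:1)
```

If the guess above is right, the value returned for a descending scan should sit just below the iteration at which sigma is updated, and the scan should return `NA` when the starting model has no sigma updates.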