Closed fabian-s closed 8 years ago
This is already a problem in mboost, not just in gamboostLSS:
library(gamboostLSS)
data(cars)
gamboost(dist ~ 1, data = cars, dfbase = 4)
gamboostLSS(dist ~ 1, data = cars)
We should document this (ideally for both gamboost and gamboostLSS) and check whether there is an easy fix in mboost, since that is where the problem has to be fixed.
Not sure I agree. That issue never comes up for (non-pathological specifications of) mboost models, because why would you ever want to boost an intercept-only model? However, intercept-only models do make sense for GAMLSS-type models, because you may want to restrict the flexibility of the additive predictors for higher-order moments / nuisance parameters (to cut computation time, remain interpretable, etc.).
Well, the point probably is that it usually doesn't make sense in mboost, but it should be implemented there anyway, as all the model-fitting interfaces are provided by mboost. Perhaps one should try to interpret 1 generally as an intercept. Thus, instead of
cars$int <- 1
gamboost(dist ~ bols(int, intercept = FALSE) + bols(..., intercept = FALSE), data = cars)
one could then write
gamboost(dist ~ bols(1, intercept = FALSE) + bols(..., intercept = FALSE), data = cars)
## or even better
gamboost(dist ~ 1 + bols(..., intercept = FALSE), data = cars)
1 should then always be defined as bols(rep(1, nrow(data)), intercept = FALSE).
Do we agree that there is no realistic use case for a pure intercept base learner in mboost, but that there is one for additive predictors in gamboostLSS?
If so, I think it's a user interface / formula parsing issue for gamboostLSS (i.e., a ~ 1 formula should just add the missing column ones to the data and treat ~ 1 as ~ bols(ones, intercept = FALSE)), not a missing feature in mboost.
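A minimal sketch of what such a formula rewrite could look like, using base R only. The helper name expand_intercept_only and its ones_name argument are hypothetical and not part of mboost or gamboostLSS; the sketch only shows the idea of adding a column of ones and swapping the right-hand side of an intercept-only formula.

```r
# Hypothetical helper (not part of mboost or gamboostLSS): rewrite an
# intercept-only formula into an explicit intercept base-learner.
expand_intercept_only <- function(formula, data, ones_name = "ones") {
  tt <- terms(formula)
  # Intercept-only means: no term labels, but an intercept is present.
  intercept_only <- length(attr(tt, "term.labels")) == 0 &&
    attr(tt, "intercept") == 1
  if (!intercept_only) {
    return(list(formula = formula, data = data))
  }
  # Add the constant column that the base-learner refers to.
  data[[ones_name]] <- rep(1, nrow(data))
  # Replace the right-hand side (the last element of the formula call)
  # with bols(<ones>, intercept = FALSE).
  formula[[length(formula)]] <-
    bquote(bols(.(as.name(ones_name)), intercept = FALSE))
  list(formula = formula, data = data)
}

res <- expand_intercept_only(dist ~ 1, cars)
res$formula
names(res$data)
```

Formulas that already contain covariate terms are returned unchanged, so the rewrite would only kick in for the ~ 1 case discussed here.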
If not, when/why would I ever want to specify a naked intercept in mboost and, if we make it easy to do so, how would we preempt user error & misunderstandings about the fact that every base learner updates its own intercept by default anyway?
Just to be clear:
I don't think we need to / should enable formulas like ~ 1 + bols(bla) + bbs(blub).
I do think having a shorthand for "this parameter is not affected by any covariates" via nuisance_param = response ~ 1 would be useful, as the default of recycling the first formula for all parameters of the distribution means that models get insanely complicated very quickly, and specifying simplifications is a huge pain ATM (and not documented anywhere!).
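For illustration, this is roughly how a per-parameter simplification can be spelled out today. It is a sketch that assumes a Gaussian model via GaussianLSS() (parameters mu and sigma), a named list of formulas, and the column-of-ones workaround from earlier in this thread; the column name int is just a choice made here.

```r
library(gamboostLSS)  # attaches mboost as a dependency

data(cars)
cars$int <- 1  # constant column for the explicit intercept base-learner

# mu depends on speed; sigma is modelled by an intercept only.
# Assumption: the list names must match the parameter names of the family.
mod <- gamboostLSS(
  list(mu    = dist ~ bols(speed),
       sigma = dist ~ bols(int, intercept = FALSE)),
  data = cars,
  families = GaussianLSS()
)

coef(mod)
```

With the proposed shorthand, the sigma entry would shrink to dist ~ 1 and the extra int column would no longer be needed.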
@hofnerb @ja-thomas just re-read your comments, you're right of course, I was being a Gscheithaferl :smirk: Closing this and migrating it to mboost.
I think having to do
instead of
sucks major d* in terms of usability. At least it should be documented somewhere that this is the way we want users to specify an intercept model / formula.