Closed: ggruenhagen3 closed this issue 10 months ago
Thank you for the suggestion!
I will add this. As I understand it, an offset is just a sample-specific constant that is added to the linear predictor (= sum of fixed and random effects). Correct me if I am wrong. lme4 also states that "One or more offset terms can be included in the formula", but I guess one offset is enough, since a user can sum several offsets before passing them. That is, the offset will be a vector whose length equals the number of data points.
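To spell out the definition above as a formula, here is a sketch. The symbols are my own notation for illustration and are not taken from the gpboost or lme4 documentation: x_i are the fixed-effect covariates, z_i is the random-effect design, b are the random effects, and o_i is the offset for data point i.

```latex
% Linear predictor with an offset: o_i is known and fixed, so no coefficient is estimated for it.
\eta_i = x_i^\top \beta + z_i^\top b + o_i, \qquad i = 1, \dots, n

% Several offset terms o_i^{(1)}, ..., o_i^{(K)} collapse into a single offset vector of length n:
o_i = \sum_{k=1}^{K} o_i^{(k)}
```

The second line is why a single offset argument should suffice: summing the terms beforehand gives the same linear predictor.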
Awesome, thank you! Yes, that is my understanding too.
Hi @fabsig, is it looking like adding this feature is possible? Or have you run into roadblocks? Thank you so much and have a great day! 😃
Lots of other work... Will add it soon (hopefully within 1-2 weeks). Thanks for your patience.
The offset feature is now implemented and available on GitHub (not yet on CRAN). You can pass an offset via the fixed_effects parameter of the fit function of GLMMs. For instance, in R:
gp_model <- fitGPModel(group_data = group, likelihood = "bernoulli_probit",
y = y, X = X, fixed_effects = offset)
The only caveat is that, currently, if you call the predict function, you need to pass the same (training data) offset again, e.g.:
pred <- predict(gp_model, group_data_pred = group_test, X_pred = X_test,
predict_response = FALSE, fixed_effects = offset)
This is, admittedly, not ideal from a user experience point of view, and it could be fixed. But since this feature is likely to be used only rarely and my time is limited, I am leaving it as is for now.
If you want to use an offset in the GPBoost algorithm, I suggest you use the init_score argument:
dtrain <- lgb.Dataset(X, label = label)
set_field(dtrain, "init_score", offset)
The latter is functionality that is inherited from LightGBM (I have not tested it).
Is it possible to support an offset parameter like the one in lme4::glmer? This is an important parameter for differential gene expression analysis, where an offset is used for the known size factors. It would be really great if this could be done in gpboost.
An example of the offset parameter in lme4::glmer:
lme4::glmer(count ~ cond + (1|subject), data = df, offset = log(size_factors))
or
lme4::glmer(count ~ cond + offset(log(size_factors)) + (1|subject), data = df)
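For context, here is a sketch of why the size factors enter as log(size_factors). It assumes a count likelihood with a log link (e.g. Poisson), which the calls above do not spell out. The symbols are illustrative notation only: s_i is the size factor of sample i, x_i encodes cond, and b_{j(i)} is the random intercept of the subject that sample i belongs to.

```latex
% Log link with offset o_i = log(s_i):
\log \mathbb{E}[y_i \mid b] = \log(s_i) + x_i^\top \beta + b_{j(i)}

% Equivalently, the size factor rescales the expected count multiplicatively:
\mathbb{E}[y_i \mid b] = s_i \, \exp\!\left( x_i^\top \beta + b_{j(i)} \right)
```

Under this assumption, the offset makes the model describe counts relative to the known size factor, and no coefficient is estimated for it.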