erikson84 opened this issue 4 years ago
This has been open for quite some time without any response from the dev team. I have also noticed the same issue: as of now, booster = "gblinear" is not being set in the xgbLinear model code that caret uses when method = "xgbLinear" is requested, so method = "xgbLinear" falls back to the default gbtree booster. See the example below; both methods produce the exact same RMSE.
# create some fake test data
set.seed(1)
y  <- rnorm(100, 20, 10)
x1 <- rnorm(100, 50, 9)
x2 <- rnorm(100, 200, 64)
# use a data.frame (not cbind) so the formula interface in caret::train works
train_data <- data.frame(y, x1, x2)
# 10 fold cv
train_stratified_control <- caret::trainControl(
method = "cv",
number = 10
)
############################ xgblinear ########################################
# defaults from xgboost manual for xgblinear
xgboost_linear_grid <- expand.grid(
nrounds = 100,
eta = 0.3,
alpha = 0,
lambda = 1
)
set.seed(1)
# specify solver as xgblinear
xgboost_linear_model <- caret::train(
y ~ .,
data = train_data,
method = "xgbLinear",
trControl = train_stratified_control,
metric = "RMSE",
tuneGrid = xgboost_linear_grid,
verbose = FALSE
)
print('RMSE for caret::train XGBLinear')
xgboost_linear_model$results$RMSE
############################ xgbtree ########################################
# defaults from xgboost manual - the tree booster needs these
xgboost_tree_grid <- expand.grid(
eta = 0.3,
max_depth = 6,
nrounds = 100,
gamma = 0,
colsample_bytree = 1,
min_child_weight = 1,
subsample = 1
)
set.seed(1)
# specify solver as xgbtree
xgboost_tree_model <- caret::train(
y ~ .,
data = train_data,
method = "xgbTree",
trControl = train_stratified_control,
metric = "RMSE",
tuneGrid = xgboost_tree_grid
)
print('RMSE for caret::train XGBTree')
xgboost_tree_model$results$RMSE
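One way to check which booster was actually used is to inspect the fitted booster and compare the two models directly (a rough check; the exact contents of $params depend on the xgboost version installed):

# Parameters caret forwarded to xgboost; booster = "gblinear" would appear
# here if it had been set (field names may vary across xgboost versions).
xgboost_linear_model$finalModel$params

# If both calls silently used the default gbtree booster with the same
# settings, their predictions on the training data should match exactly.
all.equal(
  predict(xgboost_linear_model, newdata = train_data),
  predict(xgboost_tree_model, newdata = train_data)
)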
Just adding to this thread -- I agree that this is an outstanding issue in the fit method of getModelInfo("xgbLinear")$xgbLinear. In the meantime, it can be worked around by passing the additional argument booster = 'gblinear' to caret::train(), and xgboost will pick it up in the parameters.
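For example (a sketch reusing the grid, data, and control object from the example above):

set.seed(1)
# Same xgbLinear call as before, but with the booster forced to gblinear
# through the ... arguments that caret forwards to xgboost.
xgboost_linear_fixed <- caret::train(
  y ~ .,
  data = train_data,
  method = "xgbLinear",
  trControl = train_stratified_control,
  metric = "RMSE",
  tuneGrid = xgboost_linear_grid,
  booster = "gblinear"
)
print('RMSE for caret::train XGBLinear with booster = gblinear')
xgboost_linear_fixed$results$RMSE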
When training a model using method = 'xgbLinear', caret does not set the proper parameter in xgboost (booster = 'gblinear'), so the resulting model is based on the regression-tree base learner.

Minimal, reproducible example:
Minimal dataset:
Minimal, runnable code:
Session Info: