Hi @fredho-42, and sorry for the late reply.
Thanks for the issue; however, this has essentially nothing to do with mlr3mbo. It is rather an issue with the `surv.xgboost` Learner (now in mlr3extralearners) or with the `AutoTuner` in mlr3tuning and early stopping in general. To see this, try replacing `tnr("mbo")` with `tnr("random_search")`; the same error should appear:
```r
library(mlr3)
library(mlr3extralearners)
library(mlr3pipelines)
library(mlr3tuning)
library(mlr3proba)
library(survival)

# Less logging
lgr::get_logger("bbotk")$set_threshold("warn")
lgr::get_logger("mlr3")$set_threshold("warn")

set.seed(42)
train_indxs = sample(seq_len(nrow(veteran)), 100)

task = as_task_surv(x = veteran, time = "time", event = "status")

# One-hot encode the factor features
poe = po("encode")
task = poe$train(list(task))[[1]]
task

ncores = 4
learner = lrn("surv.xgboost",
  nthread = ncores, booster = "gbtree", early_stopping_rounds = 10,
  nrounds = to_tune(50, 1000),
  eta = to_tune(p_dbl(1e-04, 1, logscale = TRUE)),
  max_depth = to_tune(2, 10))

# Random Search
xgboost_at_rs = AutoTuner$new(
  learner = learner,
  resampling = rsmp("cv", folds = 5),
  measure = msr("surv.cindex"),
  terminator = trm("evals", n_evals = 30),
  tuner = tnr("random_search")
)
xgboost_at_rs$train(task, row_ids = train_indxs)
```
Tuning with early stopping usually requires you to set the appropriate callback; see https://mlr-org.com/gallery/optimization/2022-11-04-early-stopping-with-xgboost/ for an example. (Also, if you early stop during tuning, you likely do not want to tune the number of boosting iterations but simply set it to a very high value.)
For example, tuning XGBoost on the iris task with early stopping via an AutoTuner would look like this:
```r
library(mlr3)
library(mlr3learners)
library(mlr3tuning)

set.seed(42)
task = tsk("iris")

learner = lrn("classif.xgboost",
  booster = "gbtree", early_stopping_rounds = 10,
  nrounds = 1000,  # fixed high value instead of tuning it
  eta = to_tune(p_dbl(1e-04, 1, logscale = TRUE)),
  max_depth = to_tune(2, 10),
  early_stopping_set = "test")

# Random Search
xgboost_at_rs = AutoTuner$new(
  learner = learner,
  resampling = rsmp("cv", folds = 5),
  measure = msr("classif.acc"),
  terminator = trm("evals", n_evals = 10),
  tuner = tnr("random_search"),
  callbacks = clbk("mlr3tuning.early_stopping")
)
xgboost_at_rs$train(task)
```
Note the `early_stopping_set = "test"` and `callbacks = clbk("mlr3tuning.early_stopping")` lines.
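For completeness, once this runs you can inspect the result with the standard mlr3tuning accessors; per the linked gallery post, the early-stopping callback should additionally record the boosting iteration at which each configuration stopped (treat the exact archive column name as an assumption):

```r
# Hyperparameters selected by random search, plus the resampled score
xgboost_at_rs$tuning_result

# All evaluated configurations; with clbk("mlr3tuning.early_stopping") this
# should also contain the early-stopped iteration counts (per the gallery post)
as.data.table(xgboost_at_rs$archive)
```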
Finally, can you please post the output of `sessionInfo()`?
Hi,
I'm trying to replicate this, and the code fails at `xgboost_at_bo$train(task, row_ids = train_indxs)` with the error message:

```
Error in init(env) : For early stopping, watchlist must have at least one element
```
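As far as I can tell, this is xgboost's own requirement surfacing through the wrapper; with the xgboost R package version I'm on, a minimal plain-xgboost sketch (hypothetical toy data, not the wrapper code) reproduces it:

```r
library(xgboost)

# Toy regression data, just to illustrate the requirement
dtrain = xgb.DMatrix(as.matrix(mtcars[, -1]), label = mtcars$mpg)

# This errors with "For early stopping, watchlist must have at least one
# element", because xgb.train() needs an evaluation set to monitor:
# xgb.train(params = list(objective = "reg:squarederror"), data = dtrain,
#           nrounds = 100, early_stopping_rounds = 10)

# Providing a watchlist makes early stopping work:
bst = xgb.train(params = list(objective = "reg:squarederror"), data = dtrain,
                nrounds = 100, early_stopping_rounds = 10,
                watchlist = list(train = dtrain))
```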
I tried removing the early-stopping argument from the learner, but that gives another error at `xgboost_at_bo$train(task, row_ids = train_indxs)`:

```
Error in predict.xgb.Booster(model, newdata = newdata) : Feature names stored in `object` and `newdata` are different!
```
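That second error looks like xgboost's feature-name check in `predict.xgb.Booster()`; it can be reproduced outside mlr3 (a minimal sketch with made-up data, not the wrapper code):

```r
library(xgboost)

# Train on a matrix with named columns
X = as.matrix(mtcars[, c("cyl", "hp")])
bst = xgboost(data = X, label = mtcars$mpg, nrounds = 5,
              objective = "reg:squarederror", verbose = 0)

# Renaming a column in newdata triggers the same error:
# "Feature names stored in `object` and `newdata` are different!"
X2 = X
colnames(X2) = c("cylinders", "hp")
predict(bst, X2)
```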
R version 4.3.2 with the latest mlr3 packages and xgboost package (GPU-enabled binary for Windows).
It looks like there's something wrong with the wrapper, but I'm not sure how I can fix it. Any suggestions would be appreciated. Thanks.
Fred