statist-bhfz closed this issue 3 years ago.
Why did you close? Did you find a better solution?
@mllg It was a completely wrong approach, because we can't process the validation data during the training phase (I also closed the second PR https://github.com/mlr-org/mlr3learners/pull/219). The only solution I have found is to create a separate task for preprocessing, train it on the training data, use it to predict on (i.e., transform) the validation dataset (either a separate one, or the same one used for tuning the other hyperparameters), and pass the transformed dataset as an xgb.DMatrix in the watchlist.
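A minimal sketch of that workaround, using mlr3pipelines' po("scale") as a stand-in for the preprocessing task (any trainable PipeOp would do; the task, split ratio, and hyperparameter values here are illustrative assumptions, not part of the original discussion):

```r
library(mlr3)
library(mlr3pipelines)
library(xgboost)

task  <- tsk("sonar")                  # binary classification example task
split <- partition(task, ratio = 0.8)  # train / validation row ids

# Train the preprocessing step on the training rows only ...
po_scale   <- po("scale")
task_train <- po_scale$train(list(task$clone()$filter(split$train)))[[1]]
# ... then apply the *trained* transformation to the validation rows
task_valid <- po_scale$predict(list(task$clone()$filter(split$test)))[[1]]

# Helper to convert a transformed task into an xgb.DMatrix
to_dmatrix <- function(t) {
  xgb.DMatrix(
    data  = as.matrix(t$data(cols = t$feature_names)),
    label = as.integer(t$truth()) - 1L
  )
}
dtrain <- to_dmatrix(task_train)
dvalid <- to_dmatrix(task_valid)

# The transformed validation set goes into the watchlist for early stopping
booster <- xgb.train(
  params = list(objective = "binary:logistic", eval_metric = "logloss"),
  data = dtrain,
  nrounds = 500,
  watchlist = list(valid = dvalid),
  early_stopping_rounds = 20,
  verbose = 0
)
```

Because the scaling parameters are estimated on the training rows only, no information from the validation set leaks into the transformation.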
It works well on real tasks, despite some danger of overfitting if (cross-)validation is performed on the same training dataset.
Another promising approach is described in https://github.com/mlr-org/mlr3/issues/716. It looks even better than early stopping, but I haven't tried it yet.
In general, this is quite a broad topic. I can write a brief vignette summarizing my experience, if that would be acceptable.
Feedback is always welcome!
Hello! I am using xgboost for classification. I split the data into training and test sets and used cross-validation to tune hyperparameters on the training set. I tried xgb.cv to tune hyperparameters manually, and early stopping worked well within xgb.cv. Can early stopping be implemented correctly in mlr or mlr3 when using xgboost? I found another post mentioning that xgboost's early_stopping_rounds is not really useful within the mlr framework. Thank you so much for your help.
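For reference, this is the kind of manual xgb.cv workflow meant above, as a minimal sketch on xgboost's bundled agaricus data (the hyperparameter values are placeholders, not a recommendation):

```r
library(xgboost)

# Toy binary classification data shipped with the xgboost package
data(agaricus.train, package = "xgboost")
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)

# Cross-validation with early stopping: training stops once the CV
# logloss has not improved for 10 consecutive rounds
cv <- xgb.cv(
  params = list(
    objective = "binary:logistic",
    eval_metric = "logloss",
    max_depth = 4,
    eta = 0.1
  ),
  data = dtrain,
  nrounds = 1000,
  nfold = 5,
  early_stopping_rounds = 10,
  verbose = 0
)

# Use this as nrounds when refitting on the full training set
cv$best_iteration
```

Repeating this over a grid of candidate hyperparameters lets early stopping pick a sensible nrounds for each configuration.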
This is my attempt to solve https://github.com/mlr-org/mlr3tuning/issues/216 for xgboost, inspired by https://github.com/mlr-org/mlr3extralearners/blob/main/R/learner_lightgbm_classif_lightgbm.R. If it is OK, I will make the same fixes for the other xgb learners and for catboost in mlr3extralearners. A short test is provided in this gist: https://gist.github.com/statist-bhfz/2151c26107c8922a344b7004fe64f26a