ray-project / tune-sklearn

A drop-in replacement for Scikit-Learn’s GridSearchCV / RandomizedSearchCV -- but with cutting edge hyperparameter tuning techniques.
https://docs.ray.io/en/master/tune/api_docs/sklearn.html
Apache License 2.0

[Feature request] Early stopping doesn't work with XGBoost, LightGBM or CatBoost #58

Closed rohan-gt closed 4 years ago

rohan-gt commented 4 years ago

I'm getting the following error while setting early_stopping=True

ValueError: Early stopping is not supported because the estimator does not have `partial_fit`

These could be potential fixes:

  1. XGBoost: https://github.com/dmlc/xgboost/issues/1686
  2. LightGBM: https://github.com/microsoft/LightGBM/issues/2718
  3. CatBoost: https://github.com/catboost/catboost/issues/464
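The error comes from tune-sklearn's early-stopping gate: it only iterates a trial epoch-by-epoch when the estimator exposes `partial_fit`, which the gradient-boosting libraries' sklearn wrappers do not. A minimal sketch of that check (a hypothetical reimplementation for illustration; tune-sklearn's actual internals may differ):

```python
def check_early_stoppable(estimator):
    """Hypothetical version of the guard that raises the error above:
    without partial_fit there is no hook to pause/resume training."""
    if not hasattr(estimator, "partial_fit"):
        raise ValueError(
            "Early stopping is not supported because the estimator "
            "does not have `partial_fit`"
        )

class GBDTLike:
    """Stand-in for XGBClassifier / LGBMClassifier / CatBoostClassifier,
    whose sklearn wrappers only define fit(), not partial_fit()."""
    def fit(self, X, y):
        return self

try:
    check_early_stoppable(GBDTLike())
except ValueError as e:
    print(e)
```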
richardliaw commented 4 years ago

Yeah, that's a good point. I think we'll probably want to special-case a training call for XGBoost and LightGBM, as they have a different way of doing early stopping.

rohan-gt commented 4 years ago

@richardliaw I've updated the issue with more details. Btw, how does max_iters perform early stopping? I'm a little confused about how it works in conjunction with n_iter.

richardliaw commented 4 years ago

yeah... i guess that's the penalty we have to pay for adhering to the sklearn API.

max_iters = number of "epochs"

n_iter = number of hyperparameter evals.

Does that make sense?
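The distinction above can be sketched as a toy loop (pure-Python illustration, not tune-sklearn's actual code; the fake loss curve and patience value are made up for the example):

```python
import random

def tune(n_iter, max_iters):
    """n_iter = how many hyperparameter configs are sampled (outer loop);
    max_iters = how many training 'epochs' each config gets (inner loop),
    which is where early stopping can cut a trial short."""
    results = []
    for trial in range(n_iter):                 # n_iter hyperparameter evals
        lr = random.choice([0.01, 0.1, 1.0])    # a sampled config
        best, patience = float("inf"), 0
        for epoch in range(max_iters):          # up to max_iters epochs
            loss = abs(epoch - 2) * 0.1 + lr * 0.01  # fake training curve
            if loss < best:
                best, patience = loss, 0
            else:
                patience += 1
            if patience >= 2:                   # early stop within one trial
                break
        results.append((lr, best))
    return results

random.seed(0)
out = tune(n_iter=3, max_iters=5)
print(len(out))  # 3: one result per hyperparameter eval
```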

richardliaw commented 4 years ago

In #63, we're going to enable early stopping for XGBoost via incremental learning. We decided not to implement it for lgbm because it is not yet on a stable version.
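The incremental-learning pattern referenced here boils down to making the booster resumable: keep the trained model between calls and add trees on each call, so the tuner can evaluate and possibly stop a trial between calls. A toy stand-in (XGBoost itself supports this via `fit(..., xgb_model=previous_booster)`; this sketch only models the resumable state, not real training):

```python
class IncrementalBooster:
    """Toy model of a resumable gradient booster: each partial_fit call
    grows the ensemble, and state persists across calls."""
    def __init__(self):
        self.n_trees = 0

    def partial_fit(self, X=None, y=None, trees_per_call=10):
        # A real implementation would continue boosting from the existing
        # trees; here we only count them to show the state carries over.
        self.n_trees += trees_per_call
        return self

model = IncrementalBooster()
for _ in range(3):        # three tuner iterations on the same trial
    model.partial_fit()
print(model.n_trees)      # 30
```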

richardliaw commented 4 years ago

Hmm, not sure how we're going to support CatBoost but will open an issue to track lightgbm.

rohan-gt commented 4 years ago

@richardliaw is early stopping enabled using cross-validation like I mentioned here? Because CatBoost has a cv() method too