ray-project / tune-sklearn

A drop-in replacement for Scikit-Learn’s GridSearchCV / RandomizedSearchCV -- but with cutting edge hyperparameter tuning techniques.
https://docs.ray.io/en/master/tune/api_docs/sklearn.html
Apache License 2.0

Early stopping with sklearn gradient boosting #151

Closed by richardliaw 3 years ago

richardliaw commented 3 years ago

I tried to implement early stopping with this model and got this error: ValueError: Early stopping is not supported because the estimator does not have partial_fit, does not support warm_start, or is a tree classifier. Set early_stopping=False. I then removed the pipeline and used just the model, and got the same error.

Originally posted by @andyolivers in https://github.com/ray-project/tune-sklearn/issues/146#issuecomment-732817255
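
To see which of the two capabilities named in the error message an estimator offers, a quick duck-typing check along these lines may help. This is a minimal sketch, not tune-sklearn's own check: the function name early_stopping_mode and the three stand-in classes are hypothetical, and the tree-classifier exclusion mentioned in the message is not covered.

```python
def early_stopping_mode(estimator):
    """Return "partial_fit", "warm_start", or None for an estimator.

    Assumption: this only mirrors the two capabilities named in the error
    message. It is not tune-sklearn's internal logic and it ignores the
    tree-classifier exclusion.
    """
    # Incremental learners expose a callable partial_fit method.
    if callable(getattr(estimator, "partial_fit", None)):
        return "partial_fit"
    # Warm-startable estimators list warm_start among their parameters.
    get_params = getattr(estimator, "get_params", None)
    if callable(get_params) and "warm_start" in get_params():
        return "warm_start"
    return None


# Hypothetical stand-ins so the sketch runs without scikit-learn installed.
class WarmStartOnly:
    def __init__(self, warm_start=False):
        self.warm_start = warm_start

    def get_params(self, deep=True):
        return {"warm_start": self.warm_start}


class Incremental:
    def partial_fit(self, X, y):
        return self

    def get_params(self, deep=True):
        return {}


class Neither:
    def get_params(self, deep=True):
        return {"max_depth": None}


for est in (WarmStartOnly(), Incremental(), Neither()):
    print(type(est).__name__, "->", early_stopping_mode(est))
```

The same function accepts a real scikit-learn estimator, since it relies only on partial_fit and get_params. GradientBoostingRegressor, for example, has a warm_start parameter but no partial_fit method.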

Yard1 commented 3 years ago

@richardliaw @andyolivers I ran the following code on tune-sklearn master and the Ray release version:

from tune_sklearn import TuneSearchCV
from ray import tune
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.datasets import load_boston

X, y = load_boston(return_X_y=True)

clf = GradientBoostingRegressor(loss="ls", random_state=16, verbose=0)
parameters = {  'learning_rate': [.01, 0.1, 0.5, 1],
                'max_depth': [4, 8, 12, 16, 20],
                'min_samples_leaf': [10, 30, 60, 90, 120],
                'min_samples_split': [30, 60, 90, 120],
                'subsample': [0.5, 0.8, 1],
               }
grid_search = TuneSearchCV(
    clf, parameters,
    refit=True,
    sk_n_jobs=1,
    n_jobs=4,
    verbose=2,
    max_iters=10,
    early_stopping=True,
    cv=3)
grid_search.fit(X, y)
print(grid_search.best_estimator_)

It ran without any issues and behaved as expected.

andyolivers commented 3 years ago

Hi! Many thanks for the help. I just updated the library and it is now working fine. I also needed to remove n_estimators from the parameter grid.
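
Removing n_estimators from the parameter grid, as described above, can be sketched as a small filter applied before the grid is passed to TuneSearchCV. The helper name drop_reserved_params and the n_estimators values are hypothetical. The likely reason for the fix, which the thread does not state and is an assumption here, is that warm-start early stopping increases n_estimators itself between iterations, so it cannot also be a tuned parameter.

```python
def drop_reserved_params(param_grid, reserved=("n_estimators",)):
    """Return a copy of param_grid without the reserved keys.

    Assumption: n_estimators is treated as reserved because warm-start
    early stopping is presumed to control it; the thread only reports
    that removing it fixed the run.
    """
    return {name: values for name, values in param_grid.items()
            if name not in reserved}


# Grid from the thread, plus illustrative n_estimators values.
parameters = {
    "learning_rate": [0.01, 0.1, 0.5, 1],
    "max_depth": [4, 8, 12, 16, 20],
    "n_estimators": [100, 200, 400],
}

safe_parameters = drop_reserved_params(parameters)
print(sorted(safe_parameters))
```

The original dictionary is left unchanged, so the full grid can still be used when early_stopping=False.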