rsteca / sklearn-deap

Use evolutionary algorithms instead of gridsearch in scikit-learn
MIT License
767 stars, 132 forks

Enabling early-stopping in cv with proper eval_set #54

Open Paperone80 opened 6 years ago

Paperone80 commented 6 years ago

Hi,

is there a way to enable early-stopping as part of EvolutionaryAlgorithmSearchCV when cv=KFold()?

If I understand correctly, it is not part of the scikit-learn API because it is not passed on to the estimator's fit() function.

It would be beneficial for LightGBM and other libraries that provide early_stopping_rounds functionality based on an eval_metric.

Any suggestions for a temporary fix? All it needs is, for example, to pass fit_params = {'early_stopping_rounds': 1000, 'eval_metric': 'auc', 'eval_set': [(train_x, train_y), (valid_x, valid_y)]} to fit(), where fit() is called with the same train_x and train_y that appear in eval_set.

Would a change in this section work?

    ...
    for train, test in cv.split(X, y):
        assert len(train) > 0 and len(test) > 0, "Training and/or testing not long enough for evaluation."
        _score = _fit_and_score(estimator=individual.est, X=X, y=y, scorer=scorer,
                                train=train, test=test, verbose=verbose,
                                parameters=parameters, fit_params=fit_params,
                                error_score=error_score)[0]
    ...

Something like:

    fit_params.update({'eval_set': [(X[train], y[train]), (X[test], y[test])]})
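To make the idea concrete, here is a minimal sketch of how a per-fold eval_set could be built without mutating the shared fit_params dict, since calling fit_params.update(...) inside the loop would change the dict for every later fold and individual. The helper names select_rows and fold_fit_params are hypothetical and not part of sklearn-deap. The hand-written folds list stands in for cv.split(X, y), and X and y are assumed to be NumPy arrays or plain lists.

```python
def select_rows(data, indices):
    """Return the rows of `data` at `indices` (NumPy array or plain list)."""
    try:
        return data[indices]               # NumPy-style fancy indexing
    except TypeError:
        return [data[i] for i in indices]  # fallback for plain Python lists


def fold_fit_params(fit_params, X, y, train, test):
    """Build a fresh fit_params dict whose eval_set matches this fold."""
    params = dict(fit_params or {})        # copy: never mutate the shared dict
    params["eval_set"] = [
        (select_rows(X, train), select_rows(y, train)),
        (select_rows(X, test), select_rows(y, test)),
    ]
    return params


# Tiny demonstration with plain lists and hand-written fold indices.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
base_params = {"early_stopping_rounds": 1000, "eval_metric": "auc"}
folds = [([0, 1], [2, 3]), ([2, 3], [0, 1])]  # stand-in for cv.split(X, y)

per_fold = [fold_fit_params(base_params, X, y, tr, te) for tr, te in folds]
```

In the loop above, this would presumably mean passing fit_params=fold_fit_params(fit_params, X, y, train, test) to _fit_and_score. One caveat: if the test fold is used for early stopping and also for scoring, the cross-validation score will tend to be optimistic, so a separate validation split inside the training fold may be preferable.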

Thanks

hofesh commented 5 years ago

👍 yes please