ray-project / tune-sklearn

A drop-in replacement for Scikit-Learn’s GridSearchCV / RandomizedSearchCV -- but with cutting-edge hyperparameter tuning techniques.
https://docs.ray.io/en/master/tune/api_docs/sklearn.html
Apache License 2.0

[Enhancement] Smart error handling and hyperparameter resampling #65

Closed: rohan-gt closed this issue 4 years ago

rohan-gt commented 4 years ago

I'm trying out Bayesian optimization, but the tuning errors out whenever an incompatible combination of hyperparameters occurs. For example, if I provide the following param_distributions for LogisticRegression:

  1. penalty: ['l1', 'l2', 'elasticnet']
  2. solver: ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga']

it throws an error because elasticnet is only supported by the saga solver, so any other solver paired with it is invalid. Is there a way to ignore such cases and just proceed with the tuning by resampling the hyperparameters?
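
Roughly what I'm running (a minimal sketch with toy data; argument names follow the TuneSearchCV API as discussed in this thread and may differ slightly across tune-sklearn versions):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from tune_sklearn import TuneSearchCV

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

param_distributions = {
    "penalty": ["l1", "l2", "elasticnet"],
    "solver": ["newton-cg", "lbfgs", "liblinear", "sag", "saga"],
}

search = TuneSearchCV(
    LogisticRegression(),
    param_distributions,
    search_optimization="bayesian",  # backed by scikit-optimize
    n_iter=10,    # total number of sampled hyperparameter combinations
    max_iters=1,  # early-stopping budget per trial
)

# Errors out whenever an invalid pair is sampled, e.g. penalty="elasticnet"
# with solver="lbfgs" (elasticnet is only supported by saga).
search.fit(X, y)
```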

richardliaw commented 4 years ago

Thanks for opening this @rohan-gt! Would it be OK if we proceeded with tuning without resampling?

rohan-gt commented 4 years ago

@richardliaw in TuneSearchCV, n_iter is the total number of trials and max_iters is the early-stopping budget for each trial, right? Without resampling, won't the number of trials drop drastically, especially in a case where 7/10 or even 10/10 sampled combinations error out? Is there a way to keep the number of trials consistent?

richardliaw commented 4 years ago

Hmm... right. Would conditional search spaces (like those supported by HyperOpt or Optuna) work for you?
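
For illustration, a conditional space in plain hyperopt (this is hyperopt's own API, not tune-sklearn's) could look something like this, with the penalty options depending on the sampled solver:

```python
from hyperopt import hp

# Each branch of the top-level choice only keeps penalty options that are
# actually valid for the chosen solver.
space = hp.choice("solver_penalty", [
    {"solver": "saga",
     "penalty": hp.choice("penalty_saga", ["l1", "l2", "elasticnet"])},
    {"solver": "liblinear",
     "penalty": hp.choice("penalty_liblinear", ["l1", "l2"])},
    {"solver": hp.choice("solver_l2_only", ["newton-cg", "lbfgs", "sag"]),
     "penalty": "l2"},
])
```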

rohan-gt commented 4 years ago

@richardliaw Yes that could actually work. Btw why was skopt chosen over optuna for this implementation? Isn't optuna superior?

richardliaw commented 4 years ago

IIRC Tree-structured Parzen Estimators (the algorithm that Optuna implements) shouldn't be much different from standard Bayesian optimization (which skopt provides).

rohan-gt commented 4 years ago

@richardliaw Yes, but scikit-optimize isn't an actively maintained project and throws errors with the latest scikit-learn. Also, on second thought, it would be a much cleaner implementation if new hyperparameter combinations could be resampled whenever existing ones error out, rather than having to specify so many conditional search spaces.

richardliaw commented 4 years ago

I think this is addressed with hyperopt thanks to #68!
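
A rough sketch of how that might be used (this assumes the hyperopt backend is selected with search_optimization="hyperopt" and that a hyperopt-style conditional space is accepted for param_distributions; please check the docs linked above for the exact format your version supports):

```python
from hyperopt import hp
from sklearn.linear_model import LogisticRegression
from tune_sklearn import TuneSearchCV

# Assumption: a conditional hyperopt space can be passed directly as
# param_distributions when search_optimization="hyperopt".
space = hp.choice("solver_penalty", [
    {"solver": "saga",
     "penalty": hp.choice("penalty_saga", ["l1", "l2", "elasticnet"])},
    {"solver": hp.choice("solver_l2_only", ["newton-cg", "lbfgs", "sag"]),
     "penalty": "l2"},
])

search = TuneSearchCV(
    LogisticRegression(),
    param_distributions=space,
    search_optimization="hyperopt",  # TPE-based sampling via hyperopt
    n_iter=10,                       # total number of trials
)
```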