ray-project / tune-sklearn

A drop-in replacement for Scikit-Learn’s GridSearchCV / RandomizedSearchCV -- but with cutting edge hyperparameter tuning techniques.
https://docs.ray.io/en/master/tune/api_docs/sklearn.html
Apache License 2.0

ValueError: list of dictionaries for parameters is not supported for non-random search #245

Open RNarayan73 opened 2 years ago

RNarayan73 commented 2 years ago

Hello, I am trying to pass param_distributions a search space that is a list of distinct search spaces. This would let me search over multiple dimensionality reduction and feature selection algorithms and, at the same time, search for the optimal parameters of each. As a trivial example, suppose I would like to use a pipeline with a structure like this:

Pipeline[
encode, 
dim_red, 
classify]

where dim_red has search spaces like this:

dim_red = [SelectKBest(), SelectPercentile(), PCA()]
dim_red__k = [4,  6, 8, 10]
dim_red__score_func = [f_classif, mutual_info_classif]
dim_red__percentile = [10, 20, 30, 40, 50]
dim_red__n_components = [4, 8, 10, 20]

The encode and classify steps each have their own search spaces, omitted here for brevity.

I would need to create a list of dicts, one per dim_red algorithm, each containing only the search spaces that are valid for that algorithm:

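from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, SelectPercentile, f_classif, mutual_info_classif

# encode_hp_search_space and classify_hp_search_space are placeholders
# for the encode and classify search spaces omitted above.
[
    {"encode": encode_hp_search_space,
     "dim_red": [SelectKBest()],
     "dim_red__k": [4, 6, 8, 10],
     "dim_red__score_func": [f_classif, mutual_info_classif],
     "classify": classify_hp_search_space},

    {"encode": encode_hp_search_space,
     "dim_red": [SelectPercentile()],
     "dim_red__percentile": [10, 20, 30, 40, 50],
     "dim_red__score_func": [f_classif, mutual_info_classif],
     "classify": classify_hp_search_space},

    {"encode": encode_hp_search_space,
     "dim_red": [PCA()],
     "dim_red__n_components": [4, 8, 10, 20],
     "classify": classify_hp_search_space}
]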
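
To make the intent of such a list concrete, here is a minimal plain-Python sketch of how a list of grids expands into individual candidates: each dict is expanded on its own, so parameters from different dim_red algorithms never mix. The helper name expand_grids is hypothetical, strings stand in for the estimator and scoring-function objects, and the encode and classify spaces are left out. It only mimics the general idea of expanding a list of grids; it is not the code of scikit-learn or tune-sklearn.

```python
from itertools import product


def expand_grids(param_grids):
    """Yield every candidate from a list of grids (hypothetical helper).

    Each grid maps a parameter name to a list of values. Grids are expanded
    independently, so a candidate only ever holds keys from one grid.
    """
    for grid in param_grids:
        names = sorted(grid)
        for values in product(*(grid[name] for name in names)):
            yield dict(zip(names, values))


# Strings are stand-ins for the estimator / score function objects in the post.
param_grids = [
    {
        "dim_red": ["SelectKBest"],
        "dim_red__k": [4, 6, 8, 10],
        "dim_red__score_func": ["f_classif", "mutual_info_classif"],
    },
    {
        "dim_red": ["SelectPercentile"],
        "dim_red__percentile": [10, 20, 30, 40, 50],
        "dim_red__score_func": ["f_classif", "mutual_info_classif"],
    },
    {
        "dim_red": ["PCA"],
        "dim_red__n_components": [4, 8, 10, 20],
    },
]

candidates = list(expand_grids(param_grids))
print(len(candidates))  # 4*2 + 5*2 + 4 = 22 candidates
```

A single flat dict with all five dim_red keys would instead produce invalid combinations such as PCA with dim_red__k, which is why the list form is needed.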

However, TuneSearchCV doesn't accept such a list of dicts as a search space and raises the error in the title. BayesSearchCV works just fine.

I have the following questions:
1) Is there a way to use TuneSearchCV to search through all the search spaces described above?
2) Is there an alternative, scikit-learn-compatible way to represent these search spaces that I am not aware of?

Any steer would be appreciated.

RNarayan73

Yard1 commented 2 years ago

Hey @RNarayan73, this docs page should help - https://docs.ray.io/en/latest/tune/tutorials/tune-search-spaces.html#tune-custom-search. Let me know if you have any more questions :)

RNarayan73 commented 2 years ago

Thanks for the steer @Yard1. So far, all my code has been entirely SKLearn compatible, which allows me to reuse the same workflows for various HP search techniques such as BayesSearchCV, TuneSearchCV, OptunaSearchCV, EvolutionarySearchCV etc. with minimal rework. If I went down the route suggested in the help doc and coded the search space for the Ray Tune API, could that search space also be used with TuneSearchCV?

Yard1 commented 2 years ago

Yes, you should be able to use any tune.run compatible search space with TuneSearchCV. If that doesn't work, let us know - we'll fix :)
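
For readers following the thread, the idea behind a conditional search space can be sketched in plain Python: first pick which dim_red algorithm to use, then sample only the hyperparameters that belong to it. This is an illustration of the concept only. It does not use the Ray Tune API, and the names SPACES and sample_config are hypothetical; strings again stand in for the estimator and scoring-function objects.

```python
import random

# Hypothetical mapping: dim_red choice -> the hyperparameters valid for it.
SPACES = {
    "SelectKBest": {
        "dim_red__k": [4, 6, 8, 10],
        "dim_red__score_func": ["f_classif", "mutual_info_classif"],
    },
    "SelectPercentile": {
        "dim_red__percentile": [10, 20, 30, 40, 50],
        "dim_red__score_func": ["f_classif", "mutual_info_classif"],
    },
    "PCA": {
        "dim_red__n_components": [4, 8, 10, 20],
    },
}


def sample_config(rng):
    """Sample one configuration: choose the algorithm, then its parameters."""
    name = rng.choice(sorted(SPACES))
    config = {"dim_red": name}
    for param, values in SPACES[name].items():
        config[param] = rng.choice(values)
    return config


rng = random.Random(0)  # fixed seed so repeated runs give the same samples
configs = [sample_config(rng) for _ in range(5)]
for config in configs:
    print(config)
```

Every sampled configuration contains only keys that are valid for the chosen dim_red step, which is the property the list of dicts in the original question was meant to guarantee.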