Closed PanyiDong closed 2 years ago
The actual commit should be 4f67da98c95b974ee9264b3bc3ed44b00fbba1e0.
For the above case, the hyperparameter space is defined in the current version as:
{
    "balancing_1": "SimpleRandomOverSampling",
    "SimpleRandomOverSampling_imbalance_threshold": tune.uniform(0.8, 1),
},
{
    "balancing_2": "SimpleRandomUnderSampling",
    "SimpleRandomUnderSampling_imbalance_threshold": tune.uniform(0.8, 1),
},
When configuring the hyperparameter search space, the keys are all unique, so HyperOpt can distinguish every one of them.
In the training phase, the redundant prefix ("SimpleRandomUnderSampling_", etc.) and suffix ("_1", etc.) are removed so that the dict keys match the methods' argument names.
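A minimal sketch of that stripping step, assuming the sampled config is a flat dict shaped like the entries above. The helper name `split_config` and the rule "a key ending in `_<number>` names the method" are assumptions for illustration, not the project's actual code.

```python
import re

# Hypothetical helper for illustration; not the project's actual code.
_SUFFIX = re.compile(r"_\d+$")  # matches a trailing "_1", "_2", ...


def split_config(config):
    """Split a flat, uniquely keyed config into (component, method, kwargs)."""
    component = method = None
    for key, value in config.items():
        if _SUFFIX.search(key):
            component = _SUFFIX.sub("", key)  # "balancing_1" -> "balancing"
            method = value
            break
    if method is None:
        raise ValueError("no '<component>_<n>' key found in config")

    # Drop the "<method>_" prefix so keys match the method's argument names.
    prefix = method + "_"
    kwargs = {
        key[len(prefix):]: value
        for key, value in config.items()
        if key.startswith(prefix)
    }
    return component, method, kwargs


sampled = {
    "balancing_1": "SimpleRandomOverSampling",
    "SimpleRandomOverSampling_imbalance_threshold": 0.9,
}
component, method, kwargs = split_config(sampled)
print(component, method, kwargs)
```

The returned `kwargs` can then be passed to the chosen method as keyword arguments. The sketch assumes that no real hyperparameter name ends in `_<number>`.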
Problem
In the general case, different methods may share a hyperparameter name (for kNN-style imputation methods, a hyperparameter k is critical). With ray.tune, the good thing is that hyperparameters with the same name from different methods are automatically recognized and distinguished. In HyperOpt, however, a hyperparameter is identified by its label, which is taken from the dictionary key and must be unique across the whole space. So, when defining the default hyperparameter space directly in HyperOpt, imbalance_threshold in SimpleRandomOverSampling and imbalance_threshold in SimpleRandomUnderSampling can be distinguished by giving each one its own unique label.
However, for generality, I designed the hyperparameter space in ray.tune style, which does not use such a naming structure: each method keeps the plain hyperparameter name imbalance_threshold.
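As a rough illustration of why plain names collide, the sketch below models labels as the "/"-joined path of dictionary keys and reports any label that appears more than once. It is a simplified stand-in, not Ray's or HyperOpt's actual conversion code. The nested layout under a shared `balancing` key, the `method` key, and the tuples used in place of `tune.uniform(...)` are all assumptions.

```python
from collections import Counter

# Stand-in for a tunable distribution such as tune.uniform(0.8, 1).
UNIFORM = ("uniform", 0.8, 1.0)


def collect_labels(space, path=()):
    """Collect path-style labels for every tunable entry in a nested space."""
    labels = []
    for key, value in space.items():
        here = path + (key,)
        if isinstance(value, list):  # a list models a choice among options
            labels.append("/".join(here))
            for option in value:
                if isinstance(option, dict):
                    # Options share the parent's path, so same-named keys collide.
                    labels.extend(collect_labels(option, here))
        elif isinstance(value, dict):
            labels.extend(collect_labels(value, here))
        elif isinstance(value, tuple):  # a tunable leaf
            labels.append("/".join(here))
    return labels


def duplicate_labels(space):
    counts = Counter(collect_labels(space))
    return sorted(label for label, n in counts.items() if n > 1)


plain_space = {
    "balancing": [
        {"method": "SimpleRandomOverSampling", "imbalance_threshold": UNIFORM},
        {"method": "SimpleRandomUnderSampling", "imbalance_threshold": UNIFORM},
    ]
}
prefixed_space = {
    "balancing": [
        {"balancing_1": "SimpleRandomOverSampling",
         "SimpleRandomOverSampling_imbalance_threshold": UNIFORM},
        {"balancing_2": "SimpleRandomUnderSampling",
         "SimpleRandomUnderSampling_imbalance_threshold": UNIFORM},
    ]
}
print(duplicate_labels(plain_space))
print(duplicate_labels(prefixed_space))
```

Under this model the plain-name space produces the same label twice, while the method-prefixed space produces none.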
So, when using Grid Search/Random Search, no error is raised, since both are supported by ray.tune. However, when the HyperOpt search algorithm is called, a duplicate label error occurs. For the above case, both imbalance_threshold hyperparameters are identified as balancing/imbalance_threshold, which leaves HyperOpt unable to read the hyperparameter space properly.
Reproduction of the problem
Here, I provide a simple example to demonstrate how the problem can occur:
At analysis1, the search works smoothly, while analysis2 raises a DuplicateLabel balancing/imbalance_threshold error.
Current Idea on Solution
Since the problem occurs when converting the ray.tune space to a HyperOpt space, I think the method names can be added in front of the hyperparameter names when defining the default hyperparameter space. When the methods are called, these prefixes can be removed to recover the actual hyperparameter names, so the hyperparameters are passed correctly.
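The prefixing half of this idea could look roughly like the sketch below. `build_unique_space` is a hypothetical helper, and the tuples stand in for `tune.uniform(...)` so the example runs without ray installed; the key layout follows the `"<component>_<n>"` and `"<method>_<hyperparameter>"` pattern, which is assumed here.

```python
# Hypothetical helper for illustration; not the project's actual code.
def build_unique_space(component, methods):
    """Turn {method: {hyperparameter: distribution}} into uniquely keyed options."""
    options = []
    for index, (method, params) in enumerate(methods.items(), start=1):
        # "<component>_<n>" records which method this option selects.
        option = {f"{component}_{index}": method}
        for name, distribution in params.items():
            # Prefix with the method name so no two options share a key.
            option[f"{method}_{name}"] = distribution
        options.append(option)
    return options


methods = {
    # Tuples are stand-ins for tune.uniform(0.8, 1).
    "SimpleRandomOverSampling": {"imbalance_threshold": ("uniform", 0.8, 1.0)},
    "SimpleRandomUnderSampling": {"imbalance_threshold": ("uniform", 0.8, 1.0)},
}
for option in build_unique_space("balancing", methods):
    print(option)
```

Keeping the defaults in plain per-method form and generating the prefixed keys in one place would mean the method definitions themselves never need to know about the naming workaround.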
I'm still working on the problem. For now, the GridSearch/RandomSearch options for the search algorithm should work fine.