HunterMcGushion / hyperparameter_hunter

Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
MIT License

Bug optimizing Categorical-only spaces #154

Closed. HunterMcGushion closed this issue 5 years ago.

HunterMcGushion commented 5 years ago

Conditions to summon the bug (assume a blood offering has already been made):

  1. Use BayesianOptPro
  2. Use exclusively Categorical dimensions
  3. At least one of the Categorical dimensions must be in FeatureEngineer (kinda optional - see details below)

The above 3 conditions cause BayesianOptPro to break outright. However, problems still arise when FeatureEngineer is omitted but the search space remains exclusively multi-Categorical: saved experiment results are not properly marked as similar_experiments. Tests demonstrating this second bug are added in 4b60f1a, and tests for both variants of the bug are added in #152. A sketch of the second variant follows the reproduction code below.

When the two xfail tests start being reported as xpass, the bug is probably (hopefully) fixed.
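For context, pytest reports an xfail-marked test that unexpectedly passes as xpass, so the two tests double as a detector for the fix. A generic illustration of the pattern (the test name and the reproduce_bug_154 helper are hypothetical stand-ins; the real tests are in #152):

import pytest

@pytest.mark.xfail(reason="#154: Categorical-only spaces break BayesianOptPro")
def test_categorical_only_space():
    # Hypothetical stand-in for the reproduction script below: raises while
    # the bug is present (reported as xfail), succeeds once fixed (xpass)
    reproduce_bug_154()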

Code to reproduce error:

from hyperparameter_hunter import Environment, BayesianOptPro, FeatureEngineer
from hyperparameter_hunter import Categorical, Integer
from hyperparameter_hunter.utils.learning_utils import get_boston_data
from sklearn.ensemble import AdaBoostRegressor

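# No-op feature engineering step: returns the targets unchanged, plus an
# identity callable used to invert the (absent) target transformation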
def nothing_transform(train_targets, non_train_targets):
    return train_targets, non_train_targets, lambda _: _

env = Environment(
    train_dataset=get_boston_data(),
    results_path="HHAssets",
    target_column="DIS",
    metrics=["median_absolute_error"],
    cv_params=dict(n_splits=3, random_state=1),
)

opt_1 = BayesianOptPro(iterations=5, random_state=32, n_initial_points=1)
opt_1.set_experiment_guidelines(
    model_initializer=AdaBoostRegressor,
    model_init_params=dict(),                                 # FLAG: This breaks
    # model_init_params=dict(n_estimators=Integer(10, 100)),  # FLAG: This works
    feature_engineer=FeatureEngineer([
        Categorical([nothing_transform], optional=True),
    ]),
)
opt_1.go()
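And a minimal sketch of the second variant, reusing the env above (the dimension values here are hypothetical; the actual tests are in 4b60f1a/#152): search an exclusively multi-Categorical model_init_params space with no feature_engineer, then repeat the identical search. The second OptPro should list the first round's saved experiments as similar_experiments, but with this bug it does not.

opt_2 = BayesianOptPro(iterations=3, random_state=32, n_initial_points=1)
opt_2.set_experiment_guidelines(
    model_initializer=AdaBoostRegressor,
    model_init_params=dict(
        n_estimators=Categorical([25, 50, 75]),
        loss=Categorical(["linear", "square", "exponential"]),
    ),
)
opt_2.go()

# An identical second round should recognize the experiments saved by `opt_2`
# as similar_experiments; with this bug, it does not
opt_3 = BayesianOptPro(iterations=3, random_state=32, n_initial_points=1)
opt_3.set_experiment_guidelines(
    model_initializer=AdaBoostRegressor,
    model_init_params=dict(
        n_estimators=Categorical([25, 50, 75]),
        loss=Categorical(["linear", "square", "exponential"]),
    ),
)
opt_3.go()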

Notes