rodrigo-arenas / Sklearn-genetic-opt

ML hyperparameters tuning and features selection, using evolutionary algorithms.
https://sklearn-genetic-opt.readthedocs.io
MIT License

MLPClassifier - ValueError: shuffle must be either True or False, got True. #77

Closed. windowshopr closed this issue 2 years ago.

windowshopr commented 2 years ago

System information
OS: Windows 10
Sklearn-genetic-opt version: 0.6.1
Scikit-learn version: 0.24.2
Python version: 3.7

Describe the bug
When using the GASearchCV class with MLPClassifier as the estimator, I get the error in the title. In my param_grid I simply have shuffle set to Categorical([True, False]), but it doesn't seem to play well. Wondering what could be causing it?

To Reproduce
It can be recreated by creating a binary classification dataset from sklearn and then running this:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import StratifiedKFold
    from sklearn.neural_network import MLPClassifier
    from sklearn_genetic import GASearchCV
    from sklearn_genetic.space import Categorical

    X, y = make_classification(random_state=42)  # binary classification dataset

    curr_params = {"shuffle": Categorical([True, False])}

    evolved_estimator = GASearchCV(estimator=MLPClassifier(),
                                   cv=StratifiedKFold(n_splits=2, shuffle=True, random_state=42),
                                   scoring='balanced_accuracy',
                                   population_size=30,
                                   generations=30,
                                   tournament_size=3,
                                   elitism=True,
                                   crossover_probability=0.8,
                                   mutation_probability=0.1,
                                   param_grid=curr_params,
                                   criteria='max',
                                   algorithm='eaMuPlusLambda',
                                   n_jobs=1,
                                   verbose=True,
                                   keep_top_k=1)

    evolved_estimator.fit(X, y)  # raises ValueError: shuffle must be either True or False, got True.

Expected behavior
So far this seems to be an issue only with MLPClassifier; the shuffle parameter should simply be set to True or False.

windowshopr commented 2 years ago

I've noticed it with PassiveAggressiveClassifier and Perceptron as well. However, I now see that above the error in the title, I'm getting this one:

AttributeError: 'Perceptron' object has no attribute 'base_estimator'

I got that with PassiveAggressiveClassifier too. Is GASearchCV trying to assign a base_estimator somewhere in the source code?

windowshopr commented 2 years ago

Here's another one, with the full traceback:

SGDClassifier <class 'sklearn.linear_model._stochastic_gradient.SGDClassifier'>
Traceback (most recent call last):
  File "C:\Users\chalu\AppData\Roaming\Python\Python37\lib\site-packages\sklearn_genetic\genetic_search.py", line 396, in evaluate
    local_estimator.base_estimator.set_params(**current_generation_params)
AttributeError: 'SGDClassifier' object has no attribute 'base_estimator'

rodrigo-arenas commented 2 years ago

Hi @windowshopr. This looks related to issue #65, and there is already a PR merged to handle it better (not on PyPI yet). Most likely those errors occur because you are defining a grid with only one hyperparameter to optimize; the solver works by combining several hyperparameters in the crossover operation. For SGD, I changed your code to the following and it works:

curr_params = {"penalty": Categorical(['l1', 'l2']), "epsilon": Continuous(0.001, 1, distribution='log-uniform')}

Give the other models a try using at least two parameters, and let me know if another issue comes up.
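
For reference, a minimal end-to-end version of that suggestion (the dataset and the small population/generation settings below are illustrative additions, not from the original comment):

from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn_genetic import GASearchCV
from sklearn_genetic.space import Categorical, Continuous

X, y = make_classification(random_state=42)

# A grid with at least two hyperparameters, so crossover has something to combine
curr_params = {"penalty": Categorical(['l1', 'l2']),
               "epsilon": Continuous(0.001, 1, distribution='log-uniform')}

evolved_estimator = GASearchCV(estimator=SGDClassifier(),
                               cv=StratifiedKFold(n_splits=2, shuffle=True, random_state=42),
                               scoring='balanced_accuracy',
                               population_size=10,
                               generations=5,
                               param_grid=curr_params,
                               n_jobs=1)
evolved_estimator.fit(X, y)
print(evolved_estimator.best_params_)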

windowshopr commented 2 years ago

Haha well, I didn't want to post my ENTIRE param grid, but this is it:

universal_classifier_params = {'algorithm_1': Categorical(['SAMME', 'SAMME.R']), 
                               'algorithm_2': Categorical(['auto', 'full', 'elkan', 'ball_tree', 'kd_tree', 'brute']), 
                               'algorithm_3': Categorical(['auto', 'ball_tree', 'kd_tree', 'brute']), 
                               'base_estimator': Categorical([None]), # Add in some lists of good classifiers to use here for OneVsOneClassifier, etc.
                               'estimator': Categorical([None]), # Add in some lists of good classifiers to use here for OneVsOneClassifier, etc.
                               'learning_rate': Continuous(0.000000001, 1.0, distribution='log-uniform'),
                               'n_estimators': Integer(25,500),
                               'random_state': Categorical([None]), 
                               'bootstrap': Categorical([True, False]),
                               'bootstrap_features': Categorical([True, False]), 
                               'max_features': Continuous(0.1, 1.0, distribution='log-uniform'),
                               'max_samples': Continuous(0.1, 1.0, distribution='log-uniform'),
                               'n_jobs': Categorical([None]), 
                               'oob_score': Categorical([False]), # Can only be true if bootstrap==True
                               'verbose': Integer(0,0),
                               'warm_start': Categorical([True, False]), 
                               'alpha': Continuous(0.1, 1.0, distribution='log-uniform'),
                               'binarize': Categorical([None, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'class_prior': Categorical([None]), 
                               'fit_prior': Categorical([True, False]), 
                               'cv': Categorical([None]), 
                               'ensemble': Categorical([True, False]), 
                               'method': Categorical(['sigmoid', 'isotonic']), 
                               'min_categories': Categorical([None]), 
                               'norm': Categorical([True, False]), 
                               'ccp_alpha': Categorical([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'class_weight': Categorical([None]), 
                               'criterion_1': Categorical(['gini', 'entropy']), 
                               'criterion_2': Categorical(['friedman_mse', 'mse']), # 'squared_error', 'mae'
                               'max_depth': Categorical([None]), 
                               'max_leaf_nodes': Categorical([None]), 
                               'min_impurity_decrease': Categorical([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'min_impurity_split': Categorical([None]), 
                               'min_samples_leaf_1': Continuous(0.01, 0.499, distribution='log-uniform'),
                               'min_samples_leaf_2': Integer(1, 50),
                               'min_samples_split': Continuous(0.01, 0.99, distribution='log-uniform'),
                               'min_weight_fraction_leaf': Categorical([0.0, 0.1, 0.2, 0.3, 0.4, 0.5]), 
                               'splitter': Categorical(['best', 'random']), 
                               'constant': Categorical([1]), 
                               'strategy': Categorical(['stratified', 'most_frequent', 'prior', 'uniform', 'constant']),
                               'priors': Categorical([None]), 
                               'var_smoothing': Continuous(0.00000000001, 1.0, distribution='log-uniform'),
                               'copy_X_train': Categorical([True, False]), 
                               'kernel': Categorical(['rbf', 'knn']), 
                               'kernel_2': Categorical(['linear', 'poly', 'rbf', 'sigmoid', 'precomputed']),
                               'max_iter_predict': Integer(25,500),
                               'multi_class_1': Categorical(['one_vs_rest', 'one_vs_one']),
                               'multi_class_2': Categorical(['ovr', 'crammer_singer']),
                               'multi_class_3': Categorical(['multinomial', 'ovr', 'auto']),
                               'n_restarts_optimizer': Integer(0,4),
                               'optimizer': Categorical(['fmin_l_bfgs_b']), 
                               'init': Categorical([None]), 
                               'loss_1': Categorical(['deviance', 'exponential']),
                               'loss_2': Categorical(['auto', 'binary_crossentropy']), # , 'categorical_crossentropy'
                               'loss_3': Categorical(['hinge', 'squared_hinge']), # , 'categorical_crossentropy'
                               'n_iter_no_change': Integer(5, 300),
                               'subsample': Continuous(0.1, 1.0, distribution='log-uniform'), 
                               'tol': Categorical([0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001]),
                               'validation_fraction': Categorical([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'categorical_features': Categorical([None]), 
                               'early_stopping': Categorical(['auto', True, False]),
                               'l2_regularization': Categorical([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'max_bins': Integer(2, 255),
                               'max_iter': Integer(20, 1000),
                               'monotonic_cst': Categorical([None]), 
                               'scoring': Categorical(['loss', 'balanced_accuracy']), # 'squared_hinge', 
                               'leaf_size': Integer(10, 300),
                               'metric': Categorical(['minkowski']), 
                               'metric_params': Categorical([None]), 
                               'n_neighbors': Integer(1, 10),
                               'p': Integer(1, 10),
                               'weights': Categorical(['uniform', 'distance']),
                               'gamma': Integer(5, 100),
                               'covariance_estimator': Categorical([None]), 
                               'n_components': Categorical([None]), 
                               'shrinkage': Categorical([None]), 
                               'solver': Categorical(['svd', 'lbfgs', 'adam', 'sgd']),
                               # 'solver_2': Categorical(['svd', 'lsqr', 'eigen']),
                               'solver_3': Categorical(['lbfgs', 'saga', 'newton-cg']), # , 'sag' causes issues with penalty, only allows l2 or none
                                                                                        # 'liblinear', sucks too
                               'solver_4': Categorical(['sparse_cg', 'cholesky', 'svd', 'lsqr', 'sag', 'saga']),
                               'store_covariance': Categorical([True, False]), 
                               'C': Continuous(0.1, 1.0, distribution='log-uniform'), 
                               'dual': Categorical([True, False]), 
                               'fit_intercept': Categorical([True, False]), 
                               'intercept_scaling': Categorical([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'penalty': Categorical(['l2','l1', 'elasticnet', 'none']), 
                               # 'penalty_2': Categorical(['l2','l1']), 
                               'l1_ratio': Continuous(0.01, 0.999, distribution='log-uniform'), 
                               'Cs': Integer(2, 30),
                               'l1_ratios': Continuous(0.01, 0.999, distribution='log-uniform'), 
                               'refit': Categorical([True, False]), 
                               'activation': Categorical(['relu', 'identity', 'logistic', 'tanh']), 
                               'batch_size': Categorical(['auto']), 
                               'beta_1': Categorical([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'beta_2': Continuous(0.1, 0.999, distribution='log-uniform'), 
                               'epsilon': Categorical([1e-08, 1e-07, 1e-06, 1e-09, 1e-10, 1e-05]),
                               'hidden_layer_sizes': Categorical([(50,), (100,), (200,), (400,), (800,), (50,1), (100,1), (200,1), (400,1), (800,1), (50,2), (100,2), (200,2), (400,2), (800,2),  (50,4), (100,4), (200,4), (400,4), (800,4)]), 
                               'learning_rate_init': Categorical([0.001, 0.01, 0.1, 1.0, 0.0001, 0.00001, 0.000001]),
                               'max_fun': Integer(500, 50000),
                               'momentum': Categorical([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'nesterovs_momentum': Categorical([True, False]), 
                               'power_t': Categorical([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'shuffle': Categorical([True, False]), 
                               'shrink_threshold': Categorical([None]), 
                               'break_ties': Categorical([True, False]), 
                               'break_ties_2': Categorical([False]), 
                               'cache_size': Integer(20, 400),
                               'coef0': Categorical([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'decision_function_shape': Categorical(['ovo','ovr']), 
                               'degree': Integer(1, 10),
                               'nu': Categorical([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'probability': Categorical([True, False]), 
                               'shrinking': Categorical([True, False]), 
                               'average': Categorical([True, False]), 
                               'eta0': Continuous(0.1, 1.0, distribution='log-uniform'), 
                               'reg_param': Categorical([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]), 
                               'outlier_label': Categorical([None]), 
                               'radius': Continuous(0.1, 1.0, distribution='log-uniform'), 
                               'copy_X': Categorical([True, False]), 
                               'normalize': Categorical([True, False]), 
                               'alphas': Categorical([np.array([ 0.1,  1. , 10. ])]), 
                               'store_cv_values': Categorical([True, False]),
                               }

So there are tons of params, maybe too many, but I basically accounted for every parameter of every sklearn classifier and manually put them all into a Genetic range. Some I doubled up (algorithm_1, algorithm_2, etc.) because certain classifiers don't accept all of the values in a given range, so I define separate ranges to use in those instances.
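
One hedged way to narrow such a universal grid down to a given estimator is to match its keys against estimator.get_params(); a minimal sketch (the params_for helper below is hypothetical, and the suffixed entries like algorithm_1/algorithm_2 would first need renaming to the estimator's real parameter names):

from sklearn.linear_model import SGDClassifier

def params_for(estimator, universal_grid):
    # Hypothetical helper: keep only the grid entries whose names are real
    # constructor parameters of this estimator, per estimator.get_params()
    valid = set(estimator.get_params().keys())
    return {name: space for name, space in universal_grid.items() if name in valid}

sgd_grid = params_for(SGDClassifier(), universal_classifier_params)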

But looking at the traceback, it suggests that local_estimator doesn't have the attribute base_estimator, which is true for SGD, for example: SGDClassifier doesn't have a base_estimator parameter. Almost like it could be changed from:

local_estimator.base_estimator.set_params(**current_generation_params)

...to...

try:
    local_estimator.base_estimator.set_params(**current_generation_params)
except AttributeError:
    local_estimator.set_params(**current_generation_params)

Kinda messy, yes, but it might work for these errors?

rodrigo-arenas commented 2 years ago

I'll have to look at this in more detail. Which grid exactly did you use for SGD? If you gave that whole dict to the model, it will fail, because not all parameters are valid for all estimators.

If you selected only the ones that apply, the thing is, base_estimator is not some kind of parameter the package is trying to get; it is a property that comes from BaseSearchCV. This is what "ensures" that the package is compatible with the way sklearn handles estimators, certain attributes, CV search, etc. I've used this model with some hyperparameters successfully; could you share code with only what you are using for this model?

I don't explicitly use local_estimator.base_estimator.set_params; that probably comes from the inheritance. The actual code is:


self.estimator = estimator  # the one passed to GASearchCV()

local_estimator = clone(self.estimator)
local_estimator.set_params(**current_generation_params)
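
As a standalone illustration of that clone/set_params pattern (the estimator and parameter values here are arbitrary examples):

from sklearn.base import clone
from sklearn.linear_model import SGDClassifier

estimator = SGDClassifier(alpha=0.001)

# clone() returns a new, unfitted estimator with the same constructor params;
# set_params() then applies one individual's hyperparameters to the copy
local_estimator = clone(estimator)
local_estimator.set_params(penalty='l1', alpha=0.01)
print(local_estimator.get_params()['alpha'])  # 0.01 on the copy
print(estimator.get_params()['alpha'])        # 0.001; the original is untouched
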
windowshopr commented 2 years ago

These were the params that were assigned to the SGDClassifier from that big grid:

SGDClassifier <class 'sklearn.linear_model._stochastic_gradient.SGDClassifier'>

alpha <sklearn_genetic.space.space.Continuous object at 0x000001E683C85988>
average <sklearn_genetic.space.space.Categorical object at 0x000001E683CAED88>
class_weight <sklearn_genetic.space.space.Categorical object at 0x000001E683C85B08>
early_stopping <sklearn_genetic.space.space.Categorical object at 0x000001E683C99B88>
epsilon <sklearn_genetic.space.space.Categorical object at 0x000001E683C9A988>
eta0 <sklearn_genetic.space.space.Continuous object at 0x000001E683CAEB88>
fit_intercept <sklearn_genetic.space.space.Categorical object at 0x000001E683C9A108>
l1_ratio <sklearn_genetic.space.space.Continuous object at 0x000001E683C9A448>
learning_rate <sklearn_genetic.space.space.Continuous object at 0x000001E682FD6E48>
loss <sklearn_genetic.space.space.Categorical object at 0x000001E683C87048>
max_iter <sklearn_genetic.space.space.Integer object at 0x000001E683C99A48>
n_iter_no_change <sklearn_genetic.space.space.Integer object at 0x000001E683C99E88>
n_jobs <sklearn_genetic.space.space.Categorical object at 0x000001E683C85188>
penalty <sklearn_genetic.space.space.Categorical object at 0x000001E683C9A1C8>
power_t <sklearn_genetic.space.space.Categorical object at 0x000001E683C9ACC8>
random_state <sklearn_genetic.space.space.Categorical object at 0x000001E683C85388>
shuffle <sklearn_genetic.space.space.Categorical object at 0x000001E683C9AE48>
tol <sklearn_genetic.space.space.Categorical object at 0x000001E683C99D08>
validation_fraction <sklearn_genetic.space.space.Categorical object at 0x000001E683C99E48>
verbose <sklearn_genetic.space.space.Integer object at 0x000001E683C85208>
warm_start <sklearn_genetic.space.space.Categorical object at 0x000001E683C85908>

So base_estimator is not included in there, yet it still raises that error.

rodrigo-arenas commented 2 years ago

I recently made a few changes and published a new release; let me know if you are still running into trouble with this.

windowshopr commented 2 years ago

Looks like whatever you changed worked! Not getting that error anymore! Thanks! Great work!

jordan-bird commented 1 year ago

It seems this error might be back with nesterovs_momentum in MLPClassifier:

'nesterovs_momentum': Categorical([True, False])

gives the following error:

"ValueError: nesterovs_momentum must be either True or False, got False."