bmurauer / pipelinehelper

scikit-helper to hot-swap pipeline elements
GNU General Public License v3.0

Accessing the best_estimator_ attribute #10

Closed browshanravan closed 4 years ago

browshanravan commented 4 years ago

Apologies for the extensive correction, but my question is asked better as follows.

I need to access the parameters of the best_estimator_ attribute. The code I use for this is print(grid.get_params().keys()). Adding that line to this example code gives the following output.


['cv',
 'error_score',
 'estimator',
 'estimator__clf',
 'estimator__clf__available_models',
 'estimator__clf__optional',
 'estimator__clf__selected_model',
 'estimator__memory',
 'estimator__preprosessor',
 'estimator__preprosessor__C_Fimp',
 'estimator__preprosessor__C_Fimp__cat_imputer',
 'estimator__preprosessor__C_Fimp__cat_imputer__add_indicator',
 'estimator__preprosessor__C_Fimp__cat_imputer__copy',
 'estimator__preprosessor__C_Fimp__cat_imputer__fill_value',
 'estimator__preprosessor__C_Fimp__cat_imputer__missing_values',
 'estimator__preprosessor__C_Fimp__cat_imputer__strategy',
 'estimator__preprosessor__C_Fimp__cat_imputer__verbose',
 'estimator__preprosessor__C_Fimp__memory',
 'estimator__preprosessor__C_Fimp__onehot',
 'estimator__preprosessor__C_Fimp__onehot__categories',
 'estimator__preprosessor__C_Fimp__onehot__drop',
 'estimator__preprosessor__C_Fimp__onehot__dtype',
 'estimator__preprosessor__C_Fimp__onehot__handle_unknown',
 'estimator__preprosessor__C_Fimp__onehot__sparse',
 'estimator__preprosessor__C_Fimp__steps',
 'estimator__preprosessor__C_Fimp__verbose',
 'estimator__preprosessor__N_Fimp',
 'estimator__preprosessor__N_Fimp__memory',
 'estimator__preprosessor__N_Fimp__num_imputer',
 'estimator__preprosessor__N_Fimp__num_imputer__add_indicator',
 'estimator__preprosessor__N_Fimp__num_imputer__copy',
 'estimator__preprosessor__N_Fimp__num_imputer__fill_value',
 'estimator__preprosessor__N_Fimp__num_imputer__missing_values',
 'estimator__preprosessor__N_Fimp__num_imputer__strategy',
 'estimator__preprosessor__N_Fimp__num_imputer__verbose',
 'estimator__preprosessor__N_Fimp__steps',
 'estimator__preprosessor__N_Fimp__verbose',
 'estimator__preprosessor__n_jobs',
 'estimator__preprosessor__remainder',
 'estimator__preprosessor__sparse_threshold',
 'estimator__preprosessor__transformer_weights',
 'estimator__preprosessor__transformers',
 'estimator__preprosessor__verbose',
 'estimator__steps',
 'estimator__verbose',
 'iid',
 'n_jobs',
 'param_grid',
 'pre_dispatch',
 'refit',
 'return_train_score',
 'scoring',
 'verbose']

As you can see, the parameters of the best_estimator_ (let's say the best/selected estimator is ExtraTreesClassifier), such as n_estimators, are not accessible. The output of print(grid.get_params().keys()) should have included something like estimator__clf__selected_model__ExtraTreesClassifier__n_estimators. I need these parameter names to pass to validation_curve().

bmurauer commented 4 years ago

I see the problem: get_params() does not include sub-parameters, as the scikit-learn API requires. I will probably be able to fix this on Wednesday.
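
For readers unfamiliar with the requirement mentioned above, here is a minimal pure-Python sketch of the scikit-learn convention: with deep=True, get_params() also reports the parameters of nested objects, under names joined by a double underscore. ToyModel and ToySwitcher are hypothetical stand-ins written for this illustration; this is not the actual pipelinehelper implementation, and the key layout it produces is an assumption.

```python
class ToyModel:
    """Hypothetical stand-in for an estimator with two hyperparameters."""

    def __init__(self, n_estimators=100, max_depth=None):
        self.n_estimators = n_estimators
        self.max_depth = max_depth

    def get_params(self, deep=True):
        return {"n_estimators": self.n_estimators, "max_depth": self.max_depth}


class ToySwitcher:
    """Hypothetical wrapper that holds several named candidate models."""

    def __init__(self, available_models):
        # dict mapping a short name to a candidate model
        self.available_models = available_models

    def get_params(self, deep=True):
        params = {"available_models": self.available_models}
        if deep:
            # Expose every sub-parameter as <own param>__<model name>__<param>
            for name, model in self.available_models.items():
                for key, value in model.get_params(deep=True).items():
                    params[f"available_models__{name}__{key}"] = value
        return params


switcher = ToySwitcher({"et": ToyModel(n_estimators=50), "rf": ToyModel()})
shallow = switcher.get_params(deep=False)
params = switcher.get_params(deep=True)

for key in sorted(params):
    print(key)
```

With deep=False only available_models is reported, which mirrors the symptom in the listing above; with deep=True the nested names such as available_models__et__n_estimators appear as well.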

bmurauer commented 4 years ago

With 0.7.6, you should be able to call grid.best_estimator_.get_params() and get the requested keys. Note that the field selected_model is always empty on the grid object itself, as the model is a parameter of the estimator.

On the other hand, the parameters of the other available models should now also be set on grid.get_params(). Thank you for improving my project :-)
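
As a rough sketch of the workflow discussed in this thread, the following uses a plain scikit-learn Pipeline (not PipelineHelper) to show how a nested name taken from grid.best_estimator_.get_params() can be passed to validation_curve(). The step names, the toy dataset, and the parameter ranges are assumptions made for illustration; with PipelineHelper the exact key will differ, so check the keys printed by get_params() first.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV, validation_curve
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small synthetic dataset, only for demonstration
X, y = make_classification(n_samples=120, n_features=8, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", ExtraTreesClassifier(random_state=0)),
])

grid = GridSearchCV(pipe, {"clf__max_depth": [2, 4]}, cv=3)
grid.fit(X, y)

# The refitted pipeline exposes nested names such as clf__n_estimators
best = grid.best_estimator_
nested_names = sorted(best.get_params().keys())

# Use one of those names as param_name for the validation curve
param_range = [5, 10, 20]
train_scores, test_scores = validation_curve(
    best, X, y, param_name="clf__n_estimators", param_range=param_range, cv=3
)
print(train_scores.shape, test_scores.shape)
```

Both score arrays have one row per value in param_range and one column per cross-validation fold.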