automl / auto-sklearn

Automated Machine Learning with scikit-learn
https://automl.github.io/auto-sklearn
BSD 3-Clause "New" or "Revised" License

KNearestNeighborsRegressor has no attribute 'estimator' when printing show_models() #1625

Open MrKevinDC opened 1 year ago

MrKevinDC commented 1 year ago

I tried to print the models that make up the best ensemble with show_models(), but it fails if a k_nearest_neighbors regressor is one of them. Is this because the component does not initialise self.estimator? I am now writing a custom component with that modification and will update this issue if that model appears in the ensemble again (whether it fixes the problem or not).

automl.leaderboard()
          rank  ensemble_weight                 type      cost  duration
model_id
826          1             0.34        decision_tree  0.556544  3.839919
742          2             0.42  k_nearest_neighbors  0.563224  2.659213
1856         3             0.24             adaboost  0.570269  9.341588

automl.show_models()
Traceback (most recent call last):
  File "/gpfs/home/xxx/automlBiscuits.py", line 40, in <module>
    pprint(automl.show_models(), indent=4)
  File "/gpfs/home/xxx/miniconda3/lib/python3.9/site-packages/autosklearn/estimators.py", line 888, in show_models
    return self.automl_.show_models()
  File "/gpfs/home/xxx/miniconda3/lib/python3.9/site-packages/autosklearn/automl.py", line 2227, in show_models
    ] = autosklearn_wrapped_model.choice.estimator
AttributeError: 'KNearestNeighborsRegressor' object has no attribute 'estimator'

eddiebergman commented 1 year ago

Hi @MrKevinDC,

This seems like a bug, I checked the estimator and this does actually seem to be set if the KNN has been fit: https://github.com/automl/auto-sklearn/blob/66b782a8dde3d182aa8a5532ee66b0172adb01f5/autosklearn/pipeline/components/regression/k_nearest_neighbors.py#L21-L40
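
The point that the attribute exists only once the component has been fit can be illustrated with a stand-in class. This is a minimal sketch and not auto-sklearn's code: `LazyRegressor`, its `n_neighbors` argument, and the placeholder tuple stored in `estimator` are hypothetical names chosen for illustration.

```python
class LazyRegressor:
    """Hypothetical stand-in: `estimator` is assigned only inside fit()."""

    def __init__(self, n_neighbors=5):
        self.n_neighbors = n_neighbors  # note: self.estimator is NOT set here

    def fit(self, X, y):
        # Placeholder for building and fitting a real model.
        self.estimator = ("fitted-model", self.n_neighbors)
        return self


unfitted = LazyRegressor()
fitted = LazyRegressor().fit([[0.0], [1.0]], [0.0, 1.0])

# Reading the attribute before fit() raises the same kind of error as in the report.
try:
    unfitted.estimator
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
print(hasattr(fitted, "estimator"))
```

Under this assumption, any code that reads `.estimator` on an instance that was never fit would fail, even though fitting and predicting work normally.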

I'll look into this when I can, but thanks for reporting it!

MrKevinDC commented 1 year ago

This seems like a bug, I checked the estimator and this does actually seem to be set if the KNN has been fit:

Yes, there is no problem with fitting, predicting, fetching the run history, etc.; the problem is only with show_models(). Thank you for your reply, and congratulations on this awesome package, by the way.

MrKevinDC commented 1 year ago

I think the issue can indeed be solved by simply initializing self.estimator = None in __init__. I reran the algorithm with a custom KNearestNeighborsRegressor that had only this modification, and a model of that type appeared in the show_models() output without causing an error.
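
The shape of that change can be sketched with a stand-in class. This is an illustration under stated assumptions, not the actual custom component: `PatchedRegressor`, `describe`, and the placeholder tuple are hypothetical, and `describe` only mimics code that reads `.estimator` directly.

```python
class PatchedRegressor:
    """Hypothetical stand-in for a component that pre-declares `estimator`."""

    def __init__(self, n_neighbors=5):
        self.n_neighbors = n_neighbors
        self.estimator = None  # the one-line change: always defined, even before fit()

    def fit(self, X, y):
        # Placeholder for building and fitting a real model.
        self.estimator = ("fitted-model", self.n_neighbors)
        return self


def describe(component):
    """Hypothetical reader that accesses `.estimator` without checking for it."""
    return {"sklearn_regressor": component.estimator}


summary_unfitted = describe(PatchedRegressor())
summary_fitted = describe(PatchedRegressor(n_neighbors=6).fit([[0.0], [1.0]], [0.0, 1.0]))

print(summary_unfitted)
print(summary_fitted)
```

With the attribute initialised, an unfitted instance reports None instead of raising, which avoids the crash but does not by itself explain why the instance being inspected has no fitted model.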

The run history, with details of this fitted custom regressor:

...
{'cost_2.64485719244043_ensembleWeight_0.16': Configuration(values={
    'data_preprocessor:choice': 'feature_type',
    'data_preprocessor:feature_type:numerical_transformer:imputation:strategy': 'most_frequent',
    'data_preprocessor:feature_type:numerical_transformer:rescaling:choice': 'minmax',
    'feature_preprocessor:CustomFeatureAgglomeration:affinity': 'cosine',
    'feature_preprocessor:CustomFeatureAgglomeration:linkage': 'average',
    'feature_preprocessor:CustomFeatureAgglomeration:n_clusters': 11,
    'feature_preprocessor:CustomFeatureAgglomeration:pooling_func': 'max',
    'feature_preprocessor:choice': 'CustomFeatureAgglomeration',
    'regressor:CustomKNearestNeighborsRegressor:n_neighbors': 6,
    'regressor:CustomKNearestNeighborsRegressor:p': 2,
    'regressor:CustomKNearestNeighborsRegressor:weights': 'uniform',
    'regressor:choice': 'CustomKNearestNeighborsRegressor',
})
...

The show_models() call did not break:

automl.show_models()
{   3: {   'cost': 2.6638271909003906,
           'data_preprocessor': <autosklearn.pipeline.components.data_preprocessing.DataPreprocessorChoice object at 0x145653845e20>,
           'ensemble_weight': 0.02,
           'feature_preprocessor': <autosklearn.pipeline.components.feature_preprocessing.FeaturePreprocessorChoice object at 0x145653845d30>,
           'model_id': 3,
           'rank': 1,
           'regressor': <autosklearn.pipeline.components.regression.RegressorChoice object at 0x145653845a90>,
           'sklearn_regressor': None},
    ...