ClimbsRocks / auto_ml

[UNMAINTAINED] Automated machine learning for analytics & production
http://auto-ml.readthedocs.io
MIT License
1.64k stars, 310 forks

optimize_final_model does not work with a classification pipeline #359

Open rap9430 opened 6 years ago

rap9430 commented 6 years ago

Whenever I set optimize_final_model=True with type_of_estimator='classification' (for all classifiers), I get the following error: AttributeError: 'NoneType' object has no attribute 'score'

FYI, everything works fine with the regression pipelines

Find the Traceback below:

Traceback (most recent call last):
  File "", line 14, in
  File "/lib/python3.5/site-packages/auto_ml/predictor.py", line 638, in train
    self.trained_final_model = self.train_ml_estimator(self.model_names, self._scorer, X_df, y)
  File "/lib/python3.5/site-packages/auto_ml/predictor.py", line 1199, in train_ml_estimator
    gscv_results = self.fit_grid_search(X_df, y, grid_search_params, feature_learning=feature_learning)
  File "/lib/python3.5/site-packages/auto_ml/predictor.py", line 1077, in fit_grid_search
    scoring=self._scorer.score,
AttributeError: 'NoneType' object has no attribute 'score'

ClimbsRocks commented 6 years ago

it's probably one of the parameters that's being passed in. we have some level of input verification, but not a ton, so it's possible some input's being passed in that doesn't work.

could you share the part of your training script where you create ml_predictor = Predictor() and call ml_predictor.train()?
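
The kind of input verification mentioned above could look roughly like the following sketch. The function name `check_train_kwargs` and the set `KNOWN_TRAIN_KWARGS` are hypothetical, chosen only for illustration; the keyword names are simply ones that appear in this thread, and none of this is auto_ml's real validation logic or its full parameter list.

```python
# Hypothetical: keyword names seen in this thread, not auto_ml's full list.
KNOWN_TRAIN_KWARGS = {
    "verbose",
    "optimize_final_model",
    "ml_for_analytics",
    "model_names",
}


def check_train_kwargs(**kwargs):
    """Raise a descriptive error for unrecognised keyword arguments."""
    unknown = sorted(set(kwargs) - KNOWN_TRAIN_KWARGS)
    if unknown:
        raise ValueError("unrecognised train() arguments: " + ", ".join(unknown))
    # Nothing suspicious found: hand the arguments back unchanged.
    return kwargs
```

A check like this fails early on a misspelled or unsupported argument instead of letting it surface later as an unrelated error deep inside training.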

rap9430 commented 6 years ago

I am not sure how informative the following is, but here is the part of the script you asked for.

column_descriptions = {
    'col_numeric': 'output',
    'nlp_col': 'nlp'
}

ml_predictor = Predictor(type_of_estimator='classification', column_descriptions=column_descriptions)
ml_predictor.train(data_train, verbose=False, optimize_final_model=True, ml_for_analytics=False)

sharan-amutharasu commented 6 years ago

Facing the same issue.

AttributeError: 'NoneType' object has no attribute 'score'

column_descriptions = {
    'op': 'output'
}

ml_predictor = Predictor(type_of_estimator='classifier', column_descriptions=column_descriptions)
ml_predictor.train(data, verbose=False, optimize_final_model=True)

"verbose = False" is also not working in this case. I'm still getting full logging.

dhrishya commented 5 years ago

Hi,

Is this working for any of you now? I seem to be getting the same error. I tried three variations, of which only the first ran:

Input Data:

df_iris = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')
df_iris.head(10)

First method:

col_desc = { 'species' : 'output' }

ML hyperparameters for auto_ml:

automl_kwargs = {
    'model_names': ['XGBClassifier'],
    'perform_feature_selection': True,  # only necessary for > 100K features
    'verbose': True,
    'ml_for_analytics': True,
    'take_log_of_y': True,
    'perform_feature_scaling': True,
    'cv': 9,
    # 'optimize_final_model': True
}

aml_pred = auto_ml.Predictor('classifier', column_descriptions=col_desc)
aml_pred.train(df_iris, **automl_kwargs)

This ran without any error

Second Method:

automl_kwargs2 = automl_kwargs.copy()
automl_kwargs2['optimize_final_model'] = True
aml_pred2 = auto_ml.Predictor('classifier', column_descriptions=col_desc)
aml_pred2.train(df_iris, **automl_kwargs2)

Throws the following error: AttributeError: 'NoneType' object has no attribute 'score'

Third Method:

automl_kwargs3 = automl_kwargs.copy()
automl_kwargs3['model_names'] = ['XGBClassifier', 'GradientBoostingClassifier']

aml_pred3 = auto_ml.Predictor('classifier', column_descriptions=col_desc)
aml_pred3.train(df_iris, **automl_kwargs3)

AttributeError: 'NoneType' object has no attribute 'score'

The idea is to try and invoke grid search. I have a larger dataset but I'm trying to debug this with a sample data available online.

aidiss commented 5 years ago

I face the same error when using a non-default model. I tried several, such as LogisticRegression, Ada and RandomForestClassifier, but they all failed with 'NoneType' object has no attribute 'score'


   1160                 error_score=-1000000000,
-> 1161                 scoring=self._scorer.score,
   1162                 # Don't allocate memory for all jobs upfront. Instead, only allocate enough memory to handle the current jobs plus an additional 50%
   1163                 pre_dispatch='1.5*n_jobs',

AttributeError: 'NoneType' object has no attribute 'score'
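
For readers following the traceback, the failure pattern can be reproduced in isolation. The sketch below is not auto_ml's code: `TinyPredictor`, `fit_grid_search_unsafe`, `fit_grid_search_checked` and `describe_failure` are hypothetical names. It only assumes, as the traceback suggests, that `_scorer` is still `None` when the grid search is configured.

```python
class TinyPredictor:
    """Hypothetical stand-in for a predictor; not part of auto_ml."""

    def __init__(self, scorer=None):
        # Assumption: the scorer was never assigned, so it stays None.
        self._scorer = scorer

    def fit_grid_search_unsafe(self):
        # Mirrors the failing expression in the traceback: attribute
        # access on None raises AttributeError.
        return self._scorer.score

    def fit_grid_search_checked(self):
        # A guard that fails early with a more descriptive message.
        if self._scorer is None:
            raise ValueError("scorer was not initialised before grid search")
        return self._scorer.score


def describe_failure(predictor):
    """Return the message raised by the unsafe lookup, or None on success."""
    try:
        predictor.fit_grid_search_unsafe()
    except AttributeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(describe_failure(TinyPredictor()))
```

Whether the scorer is missing because of how the classification path is set up or because of a particular argument combination is not established in this thread; the sketch only shows how the symptom arises once the attribute is `None`.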