ClimbsRocks / auto_ml

[UNMAINTAINED] Automated machine learning for analytics & production
http://auto-ml.readthedocs.io
MIT License

TypeError: 'int' object is not subscriptable #322

Closed adam-haber closed 7 years ago

adam-haber commented 7 years ago

Hi,

I'm trying to use auto_ml, and training crashes after what seems to be the end of the GradientBoostingRegressor pipeline.

This is the last "running" output:

[1920] random_holdout_set_from_training_data's score is: -1.454
[1940] random_holdout_set_from_training_data's score is: -1.454
The number of estimators that were the best for this training dataset: 1540
The best score on a random 15 percent holdout set of the training data: -1.45299580275
Finished training the pipeline!
Total training time:
0:59:52

Here are the results from our GradientBoostingRegressor
predicting a1
Calculating feature responses, for advanced analytics.

After which I get a long traceback that ends with:

C:\Users\adamh\Anaconda2\envs\py3\lib\site-packages\auto_ml\predictor.py in create_feature_responses(self, model, X_transformed, y, top_features)
    887             col_result = {}
    888             col_result['Feature Name'] = col_name
--> 889             if col_name[:4] != 'nlp_' and '=' not in col_name and self.column_descriptions.get(col_name, False) != 'categorical':
    890 
    891                 col_std = np.std(X[:, col_idx])

TypeError: 'int' object is not subscriptable

Any ideas on how to solve this?

ClimbsRocks commented 7 years ago

thanks for filing this! i've never seen this one before, which makes me wonder if it might be specific to your dataset.

that error message makes me think it's failing on col_name[:4]. any chance any of your column names are integers? i hadn't realized we were making the assumption that column names aren't integers, but it appears we are.

if that doesn't fix it, i've got a couple of asks to debug further

  1. what version of auto_ml are you on, and can you upgrade to the latest version (just released v2.7.0 last night- you can get it with pip install --upgrade auto_ml)
  2. can you give me a full traceback?
  3. if possible, a copy or a small sample of your dataset would be very helpful in debugging more!

but, my hope is it's something simple like we're assuming actual column names, and you're using ints for column names.
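
The suspected failure can be reproduced in isolation. This is a minimal sketch, not auto_ml's actual code: `check_column` is a hypothetical helper that copies only the slice and membership test from line 889 of the traceback, and the sample column names are made up.

```python
def check_column(col_name):
    # Hypothetical stand-in for part of the condition on line 889 of predictor.py
    return col_name[:4] != 'nlp_' and '=' not in col_name

# A string column name can be sliced, so the check runs normally.
string_result = check_column('age')
print(string_result)

# An integer column name cannot be sliced, so the same check raises.
try:
    check_column(0)
except TypeError as err:
    error_message = str(err)
    print(error_message)
```

Slicing is only defined for sequence types such as `str`, so an integer column name fails on `col_name[:4]` before the rest of the condition is evaluated.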

adam-haber commented 7 years ago

Changing the column names from int to string did the job! Thanks. :-)
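
For anyone hitting the same error, a minimal sketch of that rename, assuming the training data is a pandas DataFrame. The frame below is a hypothetical example, not the dataset from this issue.

```python
import pandas as pd

# Hypothetical frame with integer column labels, which is what pandas
# assigns by default when a DataFrame is built from a list or array
# without explicit column names.
df = pd.DataFrame([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(list(df.columns))

# Convert every column label to a string before passing the frame on.
df.columns = [str(col) for col in df.columns]
print(list(df.columns))
```

If a column_descriptions dictionary is in use, its keys would presumably need to match the new string names as well.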

ClimbsRocks commented 7 years ago

glad that worked! thanks for filing the issue- that's one more usability bug that we'll try to improve when we get the chance. don't hesitate to reach out with any other questions/bugs/feature requests!