combust / mleap

MLeap: Deploy ML Pipelines to Production
https://combust.github.io/mleap-docs/
Apache License 2.0

TypeError: 'float' object is not subscriptable #767

Open muley-atharva opened 3 years ago

muley-atharva commented 3 years ago

I am building a Linear Regression model pipeline. While serializing the pipeline I am getting the following error.

Traceback (most recent call last):
  File "train_real_estate_model.py", line 102, in <module>
    init = True
  File "/Users/atharva.muley/opt/miniconda3/envs/ds-stg/lib/python3.7/site-packages/mleap/sklearn/pipeline.py", line 29, in serialize_to_bundle
    serializer.serialize_to_bundle(self, path, model_name, init)
  File "/Users/atharva.muley/opt/miniconda3/envs/ds-stg/lib/python3.7/site-packages/mleap/sklearn/pipeline.py", line 107, in serialize_to_bundle
    step_i.serialize_to_bundle(bundle_dir, step_i.name)
  File "/Users/atharva.muley/opt/miniconda3/envs/ds-stg/lib/python3.7/site-packages/mleap/sklearn/base.py", line 27, in serialize_to_bundle
    return serializer.serialize_to_bundle(self, path, model_name)
  File "/Users/atharva.muley/opt/miniconda3/envs/ds-stg/lib/python3.7/site-packages/mleap/sklearn/base.py", line 64, in serialize_to_bundle
    attributes.append(('intercept', transformer.intercept_.tolist()[0]))
TypeError: 'float' object is not subscriptable

Does someone know how to fix this? Thank you!

jsleight commented 3 years ago

This feels like a dependency version mismatch. Which versions of mleap and scikit-learn do you have?

muley-atharva commented 3 years ago

mleap = 0.17.0
scikit-learn = 0.19.2

jsleight commented 3 years ago

sklearn definitely thinks that intercept_ is supposed to be a numpy array. https://github.com/scikit-learn/scikit-learn/blob/0.19.2/sklearn/linear_model/logistic.py#L1106

Though that docstring also says "If fit_intercept is set to False, the intercept is set to zero." Do you have fit_intercept set to False?

muley-atharva commented 3 years ago

No, I have set fit_intercept = True. Also, when I check the type of intercept_, it is np.float64, not np.ndarray, and that's where converting it to a list and indexing it fails.
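
The failure can be reproduced without mleap or scikit-learn. This is a minimal sketch that assumes only NumPy is installed, with made-up values: calling `.tolist()` on a NumPy scalar returns a plain Python float, so the `[0]` in `transformer.intercept_.tolist()[0]` from the traceback has nothing to index.

```python
import numpy as np

# A NumPy scalar, like the intercept_ reported in this thread (value is illustrative).
scalar_intercept = np.float64(1.5)
# A one-element array, the shape the serializer line in the traceback expects.
array_intercept = np.array([1.5])

# .tolist() on an array gives a list, so indexing works.
from_array = array_intercept.tolist()[0]

# .tolist() on a scalar gives a plain Python float, so indexing raises TypeError.
as_python = scalar_intercept.tolist()
try:
    as_python[0]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(type(as_python).__name__, error_message)
```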

jsleight commented 3 years ago

Yeah, that is definitely the issue. Depending on your perspective, this is either a bug in sklearn for not returning an ndarray as its API says it should, or a bug in mleap for not handling an edge case.

As a workaround, you can convert intercept_ to an ndarray between fitting the model and serializing it. I'd also be happy to review a PR that changes the mleap serialization code to handle either a float or an ndarray.
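
The workaround above might look roughly like this. It is an illustration, not mleap's code: `ensure_array_intercept` is a hypothetical helper name, and `SimpleNamespace` stands in for a fitted scikit-learn estimator so the snippet runs with NumPy alone. `np.atleast_1d` wraps a scalar in a one-element array and keeps an existing 1-D array's shape as it is.

```python
from types import SimpleNamespace

import numpy as np


def ensure_array_intercept(model):
    """Hypothetical helper: make model.intercept_ a 1-D ndarray in place."""
    model.intercept_ = np.atleast_1d(model.intercept_)
    return model


# Stand-in for a fitted regressor whose intercept_ is a NumPy scalar.
fitted = SimpleNamespace(intercept_=np.float64(2.0))
ensure_array_intercept(fitted)

# The expression from the traceback now succeeds.
intercept_value = fitted.intercept_.tolist()[0]
print(intercept_value)
```

With a real pipeline, the call would go after fitting and before serialize_to_bundle. The same `np.atleast_1d` call inside the serializer could be one way to accept either type, though that is only a suggestion and not what mleap currently does.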