aws / amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
https://sagemaker-examples.readthedocs.io
Apache License 2.0

[Bug Report] Deploying SKLearn Random Forest Model on Endpoint gives DecisionTree Attribute error #3935

Open rookinthenorth opened 1 year ago

rookinthenorth commented 1 year ago

Link to the notebook (https://github.com/rookinthenorth/code_concerns/blob/main/Fraud--v2--public.ipynb)

Describe the bug

  1. After training a vanilla Random Forest Classifier and hosting the model on an endpoint, I get an error whenever I make a prediction
  2. Error is: AttributeError: 'DecisionTreeClassifier' object has no attribute 'n_features_'
  3. Traceback is:

     Traceback (most recent call last):
       File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_functions.py", line 93, in wrapper
         return fn(*args, **kwargs)
       File "/opt/ml/code/inference.py", line 51, in predict_fn
         return model.predict(input_data)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/ensemble/_forest.py", line 629, in predict
         proba = self.predict_proba(X)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/ensemble/_forest.py", line 673, in predict_proba
         X = self._validate_X_predict(X)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/ensemble/_forest.py", line 421, in _validate_X_predict
         return self.estimators_[0]._validate_X_predict(X, check_input=True)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/tree/_classes.py", line 395, in _validate_X_predict
         if self.n_features_ != n_features:

To reproduce

  1. Use the Random Forest classifier from native sklearn. It is not currently available as a SageMaker-ready model, so I have to bring my own model and host it as an endpoint.
  2. Pickle the model locally, then upload it to S3
  3. Create inference.py, which will be used by the endpoint
  4. Tar the model and inference.py together so the endpoint can use them to make predictions
  5. Create the endpoint config, then deploy the model (in S3) to the endpoint
  6. After verification that the endpoint is active, invoke the endpoint via python
  7. Use just 1 row of test data (command is in last cell of the above notebook)
  8. Get error - go to CloudWatch link included in the error response
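
Steps 2 to 4 above can be sketched roughly as follows. This is only an illustration using the standard library: the file names `model.pkl` and `model.tar.gz`, the helper `package_model`, and the flat archive layout are assumptions, the pickled object is a placeholder rather than a fitted `RandomForestClassifier`, and the S3 upload is left out.

```python
import pickle
import tarfile
import tempfile
from pathlib import Path
from typing import List


def package_model(model_file: Path, inference_file: Path, out_file: Path) -> List[str]:
    """Bundle the model and the inference script into one gzipped tarball.

    Hypothetical helper: both files are stored at the archive root.
    Returns the sorted member names so the layout can be checked.
    """
    with tarfile.open(out_file, "w:gz") as tar:
        tar.add(model_file, arcname=model_file.name)
        tar.add(inference_file, arcname=inference_file.name)
    with tarfile.open(out_file, "r:gz") as tar:
        return sorted(tar.getnames())


# Work in a throwaway directory so nothing is written next to the notebook.
workdir = Path(tempfile.mkdtemp())

# Placeholder object; in the real workflow this would be the fitted model.
model_path = workdir / "model.pkl"
with open(model_path, "wb") as f:
    pickle.dump({"placeholder": "fitted RandomForestClassifier goes here"}, f)

# Placeholder script; the real one defines the functions the endpoint calls.
script_path = workdir / "inference.py"
script_path.write_text("# model loading and predict_fn would go here\n")

members = package_model(model_path, script_path, workdir / "model.tar.gz")
print(members)
```

Whatever scikit-learn version pickles the real model is baked into the file, so it is worth noting that version at this point.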

Logs

  1. 169.254.178.2 - - [18/Apr/2023:13:05:04 +0000] "GET /ping HTTP/1.1" 200 0 "-" "AHC/2.0"
  2. joblib.version == 1.2.0
  3. input is type <class 'list'>
  4. input = [[1202215.0, 171.0, 15497.0, 490.0, 150.0, 226.0, 299.0, 87.0, 118.5, 231.875, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 169.625, 28.34375, 0.0, 42.34375, 69.8125, 41.625, 146.0, 0.56103515625, 0.0, 146.625, 54.03125, 17.90625, 57.71875, 0.0, -10.171875, 174716.59375, 0.0601806640625, -0.058929443359375, 1.615234375, -6.69921875, 13.2890625, -38.59375, 0.09100341796875, -0.301025390625, 99.75, 48.0625, -344.5, 189.5, 14.234375, 353.25, 404.0, 368.25, 16.0, 12.8046875, 329.5, 149.125, 26.515625, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
  5. /miniconda3/lib/python3.7/site-packages/sklearn/base.py:213: FutureWarning: From version 0.24, get_params will raise an AttributeError if a parameter cannot be retrieved as an instance attribute. Previously it would return None. FutureWarning)
  6. print model --> RandomForestClassifier(max_features='sqrt', n_estimators=50, random_state=42)
  7. print type(model) --> <class 'sklearn.ensemble._forest.RandomForestClassifier'>
  8. 2023-04-18 13:05:04,841 ERROR - inference - Exception on /invocations [POST]
  9. Traceback (most recent call last):
       File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_functions.py", line 93, in wrapper
         return fn(*args, **kwargs)
       File "/opt/ml/code/inference.py", line 51, in predict_fn
         return model.predict(input_data)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/ensemble/_forest.py", line 629, in predict
         proba = self.predict_proba(X)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/ensemble/_forest.py", line 673, in predict_proba
         X = self._validate_X_predict(X)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/ensemble/_forest.py", line 421, in _validate_X_predict
         return self.estimators_[0]._validate_X_predict(X, check_input=True)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/tree/_classes.py", line 395, in _validate_X_predict
         if self.n_features_ != n_features:
  10. During handling of the above exception, another exception occurred:
  11. Traceback (most recent call last):
       File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app
         response = self.full_dispatch_request()
       File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1951, in full_dispatch_request
         rv = self.handle_user_exception(e)
       File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1820, in handle_user_exception
         reraise(exc_type, exc_value, tb)
       File "/miniconda3/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
         raise value
       File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1949, in full_dispatch_request
         rv = self.dispatch_request()
       File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1935, in dispatch_request
         return self.view_functions[rule.endpoint](**req.view_args)
       File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_transformer.py", line 200, in transform
         self._model, request.content, request.content_type, request.accept
       File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_transformer.py", line 231, in _default_transform_fn
         prediction = self._predict_fn(data, model)
       File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_functions.py", line 95, in wrapper
         six.reraise(error_class, error_class(e), sys.exc_info()[2])
       File "/miniconda3/lib/python3.7/site-packages/six.py", line 702, in reraise
         raise value.with_traceback(tb)
       File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_functions.py", line 93, in wrapper
         return fn(*args, **kwargs)
       File "/opt/ml/code/inference.py", line 51, in predict_fn
         return model.predict(input_data)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/ensemble/_forest.py", line 629, in predict
         proba = self.predict_proba(X)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/ensemble/_forest.py", line 673, in predict_proba
         X = self._validate_X_predict(X)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/ensemble/_forest.py", line 421, in _validate_X_predict
         return self.estimators_[0]._validate_X_predict(X, check_input=True)
       File "/miniconda3/lib/python3.7/site-packages/sklearn/tree/_classes.py", line 395, in _validate_X_predict
         if self.n_features_ != n_features:
  12. sagemaker_containers._errors.ClientError: 'DecisionTreeClassifier' object has no attribute 'n_features_'
  13. (endpoint reverts to normal state) 169.254.178.2 - - [18/Apr/2023:13:05:04 +0000] "POST /invocations HTTP/1.1" 500 290 "-" "AHC/2.0"
Sartius commented 1 year ago

Any news on this? I'm having the same issue.

Sartius commented 1 year ago

Make sure the scikit-learn version used when creating the model (job file) is the same as the version used when creating the image, or at least that both are either above 1.0 or both below it.
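
The rule above can be written down as a small check. This is a sketch only: the helper names are made up, and the version strings in the example are hypothetical, since the thread does not say which scikit-learn versions were actually involved. In practice the two strings would come from `sklearn.__version__` printed in the training environment and inside `inference.py` on the endpoint. For context, `n_features_` was deprecated in scikit-learn 1.0 in favor of `n_features_in_`, which fits a newer pickle being loaded by an older library.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '0.23.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_side_of_1_0(train_version: str, serve_version: str) -> bool:
    """True if both versions are >= 1.0 or both are < 1.0."""
    train_is_new = parse_version(train_version) >= (1, 0)
    serve_is_new = parse_version(serve_version) >= (1, 0)
    return train_is_new == serve_is_new


def describe(train_version: str, serve_version: str) -> str:
    """Summarize how risky it is to load the pickle with the serving version."""
    if train_version == serve_version:
        return "match"
    if same_side_of_1_0(train_version, serve_version):
        return "different versions, same side of 1.0"
    return "mismatch across 1.0"


# Hypothetical versions, for illustration only.
print(describe("1.2.2", "0.23.2"))
print(describe("0.23.2", "0.23.2"))
```

An exact match is the safest choice, because scikit-learn does not promise that pickles load correctly across versions.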