SeldonIO / MLServer

An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
https://mlserver.readthedocs.io/en/latest/
Apache License 2.0

Schema enforcement error when using `PandasCodec` inference request #1625

Open mfueller opened 8 months ago

mfueller commented 8 months ago

Hi!

My goal is to serve an MLflow model with a signature via MLServer, and I have observed an issue between the signature enforcement and the request generated by `PandasCodec`.

I followed the example from https://mlserver.readthedocs.io/en/latest/examples/mlflow/README.html

The model signature in the example is inferred by

model_signature = infer_signature(train_x, train_y)

and logged to mlflow via:

mlflow.sklearn.log_model(
                lr,
                "model",
                registered_model_name="ElasticnetWineModel",
                signature=model_signature,
            )

Model serving is done with MLServer:

mlserver start .
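
For context, `mlserver start .` picks up a `model-settings.json` next to the model, as in the linked example. A minimal sketch of such a file (the `uri` path and model name here are illustrative, not taken from this report):

```json
{
    "name": "wine-classifier",
    "implementation": "mlserver_mlflow.MLflowRuntime",
    "parameters": {
        "uri": "./mlruns/0/<run-id>/artifacts/model"
    }
}
```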

I can test the inference with the plain JSON example given in the docs:

import requests

inference_request = {
    "inputs": [
        {
          "name": "fixed acidity",
          "shape": [1],
          "datatype": "FP32",
          "data": [7.4],
        },
        {
          "name": "volatile acidity",
          "shape": [1],
          "datatype": "FP32",
          "data": [0.7000],
        },
       ...
    ]
}

endpoint = "http://localhost:8080/v2/models/wine-classifier/infer"
response = requests.post(endpoint, json=inference_request)

response.json()
{'model_name': 'ElasticnetWineModel',
 'id': 'c0dbba4c-ac18-43ba-a408-3cf9536931bd',
 'parameters': {'content_type': 'np'},
 'outputs': [{'name': 'output-1',
   'shape': [1, 1],
   'datatype': 'FP64',
   'parameters': {'content_type': 'np'},
   'data': [5.576883936610762]}]}
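
For reference, a request like the one above can also be assembled programmatically instead of hand-writing the JSON. `build_request` below is a hypothetical helper, not part of MLServer; it just mirrors the working payload shape:

```python
# Hypothetical helper: build a V2 inference request body from a dict of
# column name -> list of values, mirroring the hand-written JSON above.
def build_request(columns):
    return {
        "inputs": [
            {
                "name": name,
                "shape": [len(values)],  # 1-D shape, as in the working example
                "datatype": "FP32",
                "data": values,
            }
            for name, values in columns.items()
        ]
    }

request = build_request({"fixed acidity": [7.4], "volatile acidity": [0.7]})
```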

However, if I create the inference request using PandasCodec like this

inference_request = PandasCodec.encode_request(test_x.head(1))
endpoint = "http://localhost:8080/v2/models/wine-classifier/infer"
response = requests.post(endpoint, json=inference_request.dict())
response.json()

I get the following error response:

{'error': "mlflow.exceptions.MlflowException: Failed to enforce schema of data '  fixed acidity volatile acidity citric acid  ...       pH sulphates  alcohol\n0       (10.1,)          (0.37,)     (0.34,)  ...  (3.17,)   (0.65,)  (10.6,)\n\n[1 rows x 11 columns]' with schema '['fixed acidity': double (required), 'volatile acidity': double (required), 'citric acid': double (required), 'residual sugar': double (required), 'chlorides': double (required), 'free sulfur dioxide': double (required), 'total sulfur dioxide': double (required), 'density': double (required), 'pH': double (required), 'sulphates': double (required), 'alcohol': double (required)]'. Error: Invalid object type at position 0"}

and mlserver shows the following stack trace:

endpoint-1  | 2024-03-06 15:09:15,804 [mlserver.parallel] ERROR - An error occurred calling method 'predict' from model 'ElasticnetWineModel'.
endpoint-1  | Traceback (most recent call last):
endpoint-1  |   File "lib.pyx", line 2374, in pandas._libs.lib.maybe_convert_numeric
endpoint-1  | TypeError: Invalid object type
endpoint-1  |
endpoint-1  | During handling of the above exception, another exception occurred:
endpoint-1  |
endpoint-1  | Traceback (most recent call last):
endpoint-1  |   File "/usr/local/lib/python3.11/site-packages/mlflow/pyfunc/__init__.py", line 471, in predict
endpoint-1  |     data = _enforce_schema(data, input_schema)
endpoint-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
endpoint-1  |   File "/usr/local/lib/python3.11/site-packages/mlflow/models/utils.py", line 954, in _enforce_schema
endpoint-1  |     return _enforce_named_col_schema(pf_input, input_schema)
endpoint-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
endpoint-1  |   File "/usr/local/lib/python3.11/site-packages/mlflow/models/utils.py", line 673, in _enforce_named_col_schema
endpoint-1  |     new_pf_input[name] = _enforce_mlflow_datatype(name, pf_input[name], input_type)
endpoint-1  |                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
endpoint-1  |   File "/usr/local/lib/python3.11/site-packages/mlflow/models/utils.py", line 585, in _enforce_mlflow_datatype
endpoint-1  |     return pd.to_numeric(values, errors="raise")
endpoint-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
endpoint-1  |   File "/usr/local/lib/python3.11/site-packages/pandas/core/tools/numeric.py", line 222, in to_numeric
endpoint-1  |     values, new_mask = lib.maybe_convert_numeric(  # type: ignore[call-overload]  # noqa: E501
endpoint-1  |                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
endpoint-1  |   File "lib.pyx", line 2416, in pandas._libs.lib.maybe_convert_numeric
endpoint-1  | TypeError: Invalid object type at position 0
endpoint-1  |
endpoint-1  | During handling of the above exception, another exception occurred:
endpoint-1  |
endpoint-1  | Traceback (most recent call last):
endpoint-1  |   File "/usr/local/lib/python3.11/site-packages/mlserver/parallel/worker.py", line 136, in _process_request
endpoint-1  |     return_value = await method(
endpoint-1  |                    ^^^^^^^^^^^^^
endpoint-1  |   File "/usr/local/lib/python3.11/site-packages/mlserver_mlflow/runtime.py", line 199, in predict
endpoint-1  |     model_output = self._model.predict(decoded_payload)
endpoint-1  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
endpoint-1  |   File "/usr/local/lib/python3.11/site-packages/mlflow/pyfunc/__init__.py", line 474, in predict
endpoint-1  |     raise MlflowException.invalid_parameter_value(
endpoint-1  | mlflow.exceptions.MlflowException: Failed to enforce schema of data '  fixed acidity volatile acidity citric acid  ...       pH sulphates  alcohol
endpoint-1  | 0       (10.1,)          (0.37,)     (0.34,)  ...  (3.17,)   (0.65,)  (10.6,)
endpoint-1  |
endpoint-1  | [1 rows x 11 columns]' with schema '['fixed acidity': double (required), 'volatile acidity': double (required), 'citric acid': double (required), 'residual sugar': double (required), 'chlorides': double (required), 'free sulfur dioxide': double (required), 'total sulfur dioxide': double (required), 'density': double (required), 'pH': double (required), 'sulphates': double (required), 'alcohol': double (required)]'. Error: Invalid object type at position 0

I had a look at the difference between the plain JSON request and the request produced by `PandasCodec`. The `shape` attribute of each input differs:

plain json example

...
{
          "name": "fixed acidity",
          "shape": [1],
          "datatype": "FP32",
          "data": [7.4],
}
...

PandasCodec result

inference_request = PandasCodec.encode_request(test_x.head(1))
inference_request.dict()
{'parameters': {'content_type': 'pd'},
 'inputs': [{'name': 'fixed acidity',
   'shape': [1, 1],
   'datatype': 'FP64',
   'data': [10.1]},
  {'name': 'volatile acidity',
...

The shape is `[1]` in the plain JSON example but `[1, 1]` in the request produced by `PandasCodec`.
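
The error message hints at why this matters. The following is an illustration only (an assumption about the decoding behaviour, not MLServer code): if a column sent with shape `[1, 1]` decodes to a 2-D array and each row then lands in a DataFrame cell as a sequence, the column dtype becomes `object`, which matches the `(10.1,)` cells and the "Invalid object type" error above:

```python
import numpy as np
import pandas as pd

# Column encoded with shape [1, 1]: one row holding a 1-element sequence.
col = np.array([10.1]).reshape(1, 1)
df_bad = pd.DataFrame({"fixed acidity": [tuple(col[0])]})
bad_dtype = df_bad["fixed acidity"].dtype   # object -> fails 'double' schema

# Column flattened to shape [1], as in the workaround below: plain scalars.
col_flat = col.reshape(-1)
df_ok = pd.DataFrame({"fixed acidity": col_flat})
ok_dtype = df_ok["fixed acidity"].dtype     # float64 -> matches 'double'
```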

I fixed the shape information and tested it successfully with:

inference_request = PandasCodec.encode_request(test_x.head(1))

for request_input in inference_request.inputs:
    request_input.shape = [request_input.shape[0]]

inference_request.dict()
{'parameters': {'content_type': 'pd'},
 'inputs': [{'name': 'fixed acidity',
   'shape': [1],
   'datatype': 'FP64',
   'data': [10.1]},
...
endpoint = "http://localhost:8080/v2/models/ElasticnetWineModel/infer"
response = requests.post(endpoint, json=inference_request.dict())

response.json()
{'model_name': 'ElasticnetWineModel',
 'id': '5aa69a4a-864b-49cb-99fa-5d2b1afedb2e',
 'parameters': {'content_type': 'np'},
 'outputs': [{'name': 'output-1',
   'shape': [1, 1],
   'datatype': 'FP64',
   'parameters': {'content_type': 'np'},
   'data': [5.731344540042413]}]}

Version Infos:

python version: '3.11.7 (main, Jan 29 2024, 16:03:57) [GCC 13.2.1 20230801]'
mlflow version: '2.10.2'
mlserver version: '1.4.0'

Is there an issue with the shape information generated by `PandasCodec`, or have I missed something?

Thanks a lot for any help!

ReveStobinson commented 7 months ago

I have also had this issue, and it's unclear from the docs whether this is intended. There is a disclaimer about this in the NumPy Array section of the docs, but it is not present in the Pandas DataFrame section just after it, and the JSON payload presented there has other errors that make it unreliable for determining what the correct value should be (I have a PR out for that docs change at #1679).

jesse-c commented 6 months ago

We've merged in a contribution in https://github.com/SeldonIO/MLServer/pull/1751 that should fix this. If you're able to use MLServer from master to get the latest change, could you please let us know if it fixes the issue for you.

ReveStobinson commented 6 months ago

> We've merged in a contribution in #1751 that should fix this. If you're able to use MLServer from master to get the latest change, could you please let us know if it fixes the issue for you.

@jesse-c I just had a chance to test this with the new changes. TL;DR is that this works! But not quite in the way I expected it to, and I didn't think it would when I first serialized the new request.

Using the same code snippet, the inference request looked the same with both the MLServer 1.5.0 release and the current state of `master`:

```python
import pandas as pd
import mlserver.grpc.converters as converters
from mlserver.codecs import PandasCodec

example_dict = {
    "input1": [132.6454, 131.315],
    "input2": [2.78412, 1.315],
    "input3": [12.9, 35.6687],
}
data = pd.DataFrame(example_dict)

inference_request = PandasCodec.encode_request(data, inject_batch=True)
grpc_inference_request = converters.ModelInferRequestConverter.from_types(
    inference_request, model_name=model_name, model_version=None
)
inference_request.inputs
```

Output:

```python
[RequestInput(name='input1', shape=[2, 1], datatype='FP64', parameters=None, data=TensorData(__root__=[132.6454, 131.315])),
 RequestInput(name='input2', shape=[2, 1], datatype='FP64', parameters=None, data=TensorData(__root__=[2.78412, 1.315])),
 RequestInput(name='input3', shape=[2, 1], datatype='FP64', parameters=None, data=TensorData(__root__=[12.9, 35.6687]))]
```

In 1.5.0, this would result in an error when trying to send it to the inference server, so I would have to amend the code snippet above with two short lines to reshape the inputs:

```python
inference_request = PandasCodec.encode_request(data, inject_batch=True)
for i in inference_request.inputs:
    i.shape = [i.shape[0]]
```

which would yield a request that looked like this:

```python
[RequestInput(name='input1', shape=[2], datatype='FP64', parameters=None, data=TensorData(root=[132.6454, 131.315])),
 RequestInput(name='input2', shape=[2], datatype='FP64', parameters=None, data=TensorData(root=[2.78412, 1.315])),
 RequestInput(name='input3', shape=[2], datatype='FP64', parameters=None, data=TensorData(root=[12.9, 35.6687]))]
```

And _that_ version of the request would work in 1.5.0. With the new changes on `master`, both requests are accepted and yield the same results.
I was expecting to see the PandasCodec change the input shapes to be 1-dimensional given 1-D input data, but it was implemented on the inference runtime side instead.
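
A minimal sketch of the kind of runtime-side normalisation described above (a hypothetical `normalise_column` helper, not the actual MLServer code): a per-column tensor of shape `[N, 1]` is squeezed to `[N]` before the DataFrame column is assembled, so requests encoded either way decode to the same scalar column:

```python
import numpy as np

# Hypothetical helper: squeeze a trailing singleton dimension so that data
# sent with shape [N, 1] and data sent with shape [N] decode identically.
def normalise_column(data, shape):
    arr = np.asarray(data, dtype=np.float64).reshape(shape)
    if arr.ndim == 2 and arr.shape[1] == 1:
        arr = arr.reshape(-1)
    return arr
```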

This change is a welcome one for sure, since it removes the need for those workaround lines in code that might have to handle multiple request types! But I wasn't sure it was going to work when I noticed that my requests looked the same as they did before, so it might be worth mentioning that in the release notes somewhere.

Thank you for this!