mlflow / mlflow

Open source platform for the machine learning lifecycle
https://mlflow.org
Apache License 2.0

[BUG] Requiring the length of values for each input to be the same seems too strict #5469

Closed jingnanxue closed 2 years ago

jingnanxue commented 2 years ago

Describe the problem

I think this sanity check is too strict: https://github.com/mlflow/mlflow/blob/5784e7e833385e59cd194fd63e4ae5e456abd779/mlflow/utils/proto_json_utils.py#L347

Based on the official documentation of the TensorFlow RESTful API, https://www.tensorflow.org/tfx/serving/api_rest#request_format_2, the value for the inputs key can be either a single input tensor or a map of input names to tensors (listed in their natural nested form). Each input can have arbitrary shape and need not share the same 0-th dimension.

For example, suppose our inputs include two tensors: one with shape (-1, 5) (a batch of N product sequences, each of length 5) and one with shape (-1) (this is a multi-class problem, but instead of computing probabilities for all classes, we randomly pick K classes and compute probabilities only for those K classes across all sequences in the batch). K and N are not necessarily equal on each call.

This works well in TensorFlow Serving but fails in mlflow serving due to the restriction above.
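For illustration, here is a minimal sketch (not the actual mlflow implementation) of a columnar-format parser that accepts the payload above: each named input is kept as its own tensor (nested lists here), with no requirement that the 0-th dimensions match.

```python
def parse_columnar_inputs(inputs):
    # Sketch of parsing the TF-serving "inputs" (columnar) format.
    # Each named input stays a separate tensor; their lengths may differ,
    # matching the TF Serving REST docs quoted above.
    if isinstance(inputs, dict):
        return {name: values for name, values in inputs.items()}
    return inputs  # a single unnamed tensor


payload = {
    "input_seq": [[101, 275, 323, 444, 512], [289, 303, 156, 223, 357]],
    "input_candidates": [100, 101, 102, 104, 107, 119, 124],
}
parsed = parse_columnar_inputs(payload)
# input_seq has 2 rows (N sequences), input_candidates has 7 entries (K classes)
print(len(parsed["input_seq"]), len(parsed["input_candidates"]))  # 2 7
```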

Code to reproduce issue

curl -H 'Content-Type: application/json' 'localhost:5000/invocations' -d '{
"inputs":{
  "input_seq":[[101,275,323,444,512],[289,303,156,223,357]],
  "input_candidates":[100,101,102,104,107,119,124]
  }
}'
{
  "error_code": "MALFORMED_REQUEST",
  "message": "Failed to parse data as TF serving input. The length of values for each input/column name are not the same",
  "stack_trace": "Traceback (most recent call last):
    File \"/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mlflow/pyfunc/scoring_server/__init__.py\", line 90, in infer_and_parse_json_input
      return parse_tf_serving_input(decoded_input, schema=schema)
    File \"/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mlflow/utils/proto_json_utils.py\", line 244, in parse_tf_serving_input
      \"Failed to parse data as TF serving input. The length of values for\"
  mlflow.exceptions.MlflowException: Failed to parse data as TF serving input. The length of values for each input/column name are not the same"
}


dbczumar commented 2 years ago

@jingnanxue Thank you for raising this. I agree. @tomasatdatabricks can you weigh in here? How difficult would this be to fix?

tomasatdatabricks commented 2 years ago

@dbczumar I think we can just remove the check, cc @arjundc-db who may have more context. I think we should do it to match the tf-serving behavior. Thanks for bringing this up @jingnanxue!

dbczumar commented 2 years ago

@arjundc-db Can you weigh in here?

arjundc-db commented 2 years ago

Yes, that makes sense. I will open a PR to change this behavior. I think the intention was to ensure that if a user is passing instances, every instance contains all the named tensors. However, this check was incorrectly applied to both instances and inputs. @tomasatdatabricks does it make sense to keep/modify this behavior only when the JSON contains instances?

After the PR, the following:

{
    "instances": [
        {"a": "s1", "b": 1},
        {"a": "s2", "b": 2, "c": [4, 5, 6]}
    ]
}

would fail validation, since the first instance does not contain the named tensor c.
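The row-format check described here could be sketched as follows (a hedged illustration, not the actual PR; the function name is hypothetical): every instance must provide the same set of named tensors, so validation compares key sets.

```python
def validate_instances(instances):
    # Row ("instances") format: every instance must carry the same named tensors.
    if not instances or not isinstance(instances[0], dict):
        return  # a single unnamed tensor per instance; nothing to check
    expected = set(instances[0])
    for i, inst in enumerate(instances[1:], start=1):
        if set(inst) != expected:
            raise ValueError(
                f"instance {i} has keys {sorted(inst)}, expected {sorted(expected)}"
            )


validate_instances([{"a": "s1", "b": 1}, {"a": "s2", "b": 2}])  # passes
try:
    validate_instances([{"a": "s1", "b": 1}, {"a": "s2", "b": 2, "c": [4, 5, 6]}])
except ValueError as e:
    print(e)  # instance 1 has keys ['a', 'b', 'c'], expected ['a', 'b']
```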

tomasatdatabricks commented 2 years ago

@arjundc-db Makes sense to me, but I think we want to keep parity with tf-serving in this case because we use the tf-serving format. Can you check what the behavior of tf-serving is in this case?

jrbarclay37 commented 2 years ago

Hi, I am also facing the same issue with a model using PyTorch Geometric, which takes two tensors as inputs. One tensor represents the nodes in the graph and the other represents the edges. This works for batch inference, but not for online serving. My format is the same as in the code-to-reproduce above:

{
    "inputs": {
        "nodes": tensor(num_nodes, num_features),
        "edge_index": tensor(num_edges, 2)
    }
}

It would be extremely helpful if MLflow would accept different lengths for tensor inputs.

jrbarclay37 commented 2 years ago

@tomasatdatabricks @arjundc-db I checked the tf-serving documentation, and it states that the row format (instances) requires all named inputs to share the same 0-th dimension. However, for the columnar format (inputs), it explicitly states that "Each input can have arbitrary shape and need not share the same 0-th dimension (aka batch size) as required by the row format described above."

https://www.tensorflow.org/tfx/serving/api_rest#request_format_2