tensorflow / ranking

Learning to Rank in TensorFlow
Apache License 2.0

[CODE EXAMPLE] TF SavedModel - Inference requests without converting to ELWC first #313

Open azagniotov opened 2 years ago

azagniotov commented 2 years ago

TL;DR

This issue contains code samples showing how to save a TensorFlow Ranking model with a custom signature that does not require the input data to be wrapped in the ExampleListWithContext (ELWC) format; instead, we can feed raw TF tensors as input.

To: TensorFlow Ranking Team

I could not find a more suitable place to post this, so I am raising it as an issue; it is not a bug report or a question.

Motivation

As per the subject, I would like to share this code snippet in the hope that it helps the TensorFlow Ranking community if someone has a need for this. I hope it can save people a few hours here and there, or at least provide some ideas. I also appreciate everyone's feedback: if you notice something weird in the code or feel that certain things should be done differently, please do let me know 🙇🏼‍♂️.

Context

When training and saving a TensorFlow Ranking model, by default the framework saves the model with a serving signature spec that requires the model input to be in the ExampleListWithContext (ELWC) format. Although this proto format is a convenient way to structure your input data, constructing the ELWC itself can become overkill when done at scale. Since, under the hood, ELWCs get unpacked into raw tensors anyway, it is possible to skip this conversion and feed the data into the model as raw tensors instead.
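For contrast, building a request in the ELWC format typically means assembling and serializing ExampleListWithContext protos, roughly as in the sketch below. This is only an illustration of the step being skipped: it assumes the tensorflow-serving-api package (which provides input_pb2) is available, and the feature names match the example used later in this issue.

from tensorflow_serving.apis import input_pb2

# Sketch of the ELWC-building step that the custom signature lets you skip.
elwc = input_pb2.ExampleListWithContext()
elwc.context.features.feature["no_context"].float_list.value.append(0.0)
for f0_value in [123.456, 456.567]:
    example = elwc.examples.add()
    example.features.feature["f0"].float_list.value.append(f0_value)

serialized_elwc = elwc.SerializeToString()
# The default serving signature typically consumes a batch of serialized ELWC
# strings, e.g. something like:
#   loaded_model.signatures["serving_default"](tf.constant([serialized_elwc]))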

Code

Tested with the following versions of the TensorFlow libraries:

For the sake of simplicity, consider the following:

  1. The feature pre-processing step (i.e., whether it happens on the client or inside the model) is out of scope for this code.
  2. We have an instance of a TensorFlow Ranking model of type keras.Model that we want to save with a custom signature.
  3. The model was trained with ELWCs in the following format, where the Examples contain only one numeric feature and a label:
examples {
  features {
    feature {
      key: "f0"
      value {
        float_list {
          value: 123.456
        }
      }
    }
    feature {
      key: "label"
      value {
        float_list {
          value: 1.0
        }
      }
    }
  }
}
examples {
  features {
    feature {
      key: "f0"
      value {
        float_list {
          value: 456.567
        }
      }
    }
    feature {
      key: "label"
      value {
        float_list {
          value: 0.0
        }
      }
    }
  }
}
context {
  features {
    feature {
      key: "no_context"
      value {
        float_list {
          value: 0.0
        }
      }
    }
  }
}

This means that, by default, the above ELWC structure (minus the label feature) should be used as the model input at inference time. So let's save the model with a signature that expects raw tensors as input instead of an ELWC.
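To make the target concrete, here is a small sketch (not one of the steps below) of the raw-tensor equivalent of the ELWC above with the label dropped; these are the kinds of tensors the custom signature will accept directly, using the same feature names and shapes shown later in this issue.

import tensorflow as tf

raw_inputs = {
    # context feature: shape [batch_size, 1]
    "no_context": tf.constant([[0.0]], dtype=tf.float32),
    # per-example feature: shape [batch_size, list_size, 1]
    "f0": tf.constant([[[123.456], [456.567]]], dtype=tf.float32),
}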

1. Define a class extending tfr.keras.saved_model.Signatures

This class will be used to define the spec for the new custom serving signature.

A few key points about the class:

  1. The current implementation of the provided class knows how to work with tensor specs of type tf.TensorSpec and tf.SparseTensorSpec.
  2. We create a custom tf.function whose input signature is the custom signature definition: a dictionary with feature names as keys and feature tensor specs as values. The tensor specs are TypeSpecs (i.e., tf.TensorSpec and tf.SparseTensorSpec).
  3. In the class, our custom signature is named custom_predict_method.
  4. The custom signature spec is created from the model inputs, which are essentially the context feature spec and the example feature spec.
from typing import Callable, Dict

import tensorflow as tf
import tensorflow_ranking as tfr

class SignaturesWithCustomPredict(tfr.keras.saved_model.Signatures):

    def _get_input_signature_tensor_spec(self, model: tf.keras.Model):
        """Builds the custom signature's input spec dict from the model's inputs."""
        input_signatures = dict()
        for name, tensor in model.input.items():
            tensor_spec = tensor.type_spec
            if isinstance(tensor_spec, tf.TensorSpec):
                input_signatures[name] = tensor_spec
            elif isinstance(tensor_spec, tf.SparseTensorSpec):
                # A sparse input is exposed as its three dense components; they are
                # re-assembled into a tf.SparseTensor inside custom_predict_method.
                input_signatures[f"{name}_indices"] = tf.TensorSpec(shape=[None, len(tensor_spec.shape)], dtype=tf.int64, name=f"{name}_indices")
                input_signatures[f"{name}_values"] = tf.TensorSpec(shape=[None], dtype=tensor_spec.dtype, name=f"{name}_values")
                input_signatures[f"{name}_dense_shape"] = tf.TensorSpec(shape=[len(tensor_spec.shape)], dtype=tf.int64, name=f"{name}_dense_shape")
            else:
                raise ValueError(f"Unsupported spec: {tensor_spec}")

        return input_signatures

    def custom_predict_method_tf_function(self) -> Callable[[tf.Tensor], Dict[str, tf.Tensor]]:
        """Makes a tensorflow function for `custom_predict_method`."""

        input_signature_tensor_spec = self._get_input_signature_tensor_spec(self._model)

        @tf.function(input_signature=[input_signature_tensor_spec])
        def custom_predict_method(features_dict: Dict[str, tf.Tensor]) -> Dict[str, tf.Tensor]:
            """Defines custom_predict_method signature."""

            inputs = {
                x: features_dict[x] if isinstance(self._model.input[x].type_spec, tf.TensorSpec) else (
                    # Sparse (e.g. string-based) features: re-assembled from their dense components
                    tf.SparseTensor(
                        indices=features_dict[f"{x}_indices"],
                        values=features_dict[f"{x}_values"],
                        dense_shape=features_dict[f"{x}_dense_shape"],
                    )
                )
                for x in self._model.input
            }

            outputs = self._model(inputs=inputs, training=False)
            return self.normalize_outputs(tf.saved_model.PREDICT_OUTPUTS, outputs)

        return custom_predict_method

    def __call__(self, serving_default: str = "regress") -> Dict[str, Callable[[tf.Tensor], Dict[str, tf.Tensor]]]:
        """Returns a dict of signatures.
        Args:
        serving_default: Specifies "regress" or "predict" as the serving_default signature.

        Returns:
        A dict of signatures.
        """
        if serving_default not in ["regress", "predict"]:
            raise ValueError("serving_default should be 'regress' or 'predict', "
                            "but got {}".format(serving_default))

        serving_default_function = (
            self.regress_tf_function() if serving_default == "regress" else self.predict_tf_function()
        )

        signatures = {
            tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
                serving_default_function,
            tf.saved_model.REGRESS_METHOD_NAME:
                self.regress_tf_function(),
            tf.saved_model.PREDICT_METHOD_NAME:
                self.predict_tf_function(),
            "custom_predict_method":
                self.custom_predict_method_tf_function(),
        }
        return signatures

2. Saving the model using the above class

While reading the following code, assume we have an instance variable inference_model of type keras.Model that we want to save with the custom signature. Let's instantiate the above class by passing the context and example feature specs to it.

def context_feature_columns():
  # Code omitted for brevity
   ...
   ...

def example_feature_columns():
  # Code omitted for brevity
   ...
   ...

def create_feature_specs():
    context_feature_spec = \
        tf.feature_column.make_parse_example_spec(context_feature_columns().values())
    example_feature_spec = \
        tf.feature_column.make_parse_example_spec(list(example_feature_columns().values()))

    return context_feature_spec, example_feature_spec

context_feature_spec, example_feature_spec = create_feature_specs()
signatures = SignaturesWithCustomPredict(inference_model, context_feature_spec, example_feature_spec, None)()
output_path = os.path.join("/some/path/to/export/the/model", "00000001")  # requires: import os

tf.keras.models.save_model(inference_model,
                            output_path,
                            include_optimizer=False,
                            save_traces=False,
                            signatures=signatures)

3. Sanity checking that the custom signature was saved

We could perform a few sanity checks to make sure that the custom signature was saved:

loaded_inference_model_with_signature = tf.saved_model.load(output_path)
assert "custom_predict_method" in list(loaded_inference_model_with_signature.signatures.keys())

print(list(loaded_inference_model_with_signature.signatures.keys()))

Which should produce:

['serving_default', 'tensorflow/serving/regress', 'tensorflow/serving/predict', 'custom_predict_method']
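As an optional check, the expected inputs can also be inspected programmatically from the loaded signature, as an alternative to the saved_model_cli inspection below:

predictor = loaded_inference_model_with_signature.signatures["custom_predict_method"]
# Prints the (args, kwargs) tensor specs the signature expects, i.e. f0 and no_context
print(predictor.structured_input_signature)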

We could also inspect the saved model signature with saved_model_cli, like the following (the command is from a Jupyter notebook, hence the ! before saved_model_cli):

!saved_model_cli show --dir "<MODEL_EXPORTED_DIR>/00000001" --all

Which should produce the following (I am posting only the main part):

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
...
...
...
signature_def['custom_predict_method']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['f0'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, -1, 1)
        name: custom_predict_method_f0:0
    inputs['no_context'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: custom_predict_method_no_context:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['outputs'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, -1)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict
...
...
...

4. Making an inference request

When constructing an inference request payload, since we no longer need to build an ELWC, the main idea is to wrap the feature values in tf.constant tensors. First, let's create two dictionaries, one for context features and one for example features:

Context features

context_feature_dict = {
   "no_context": tf.constant(((0.0,),), dtype=tf.float32),
}

Example features

Let's assume that we are making an inference request with five (5) examples. In other words, under the query (i.e., context) we have five (5) documents/items (i.e., examples). The five examples' respective (arbitrary) f0 feature values are 1.15, 3.15, 5.15, 8.15, 9.15.

# a list containing f0 feature value from every document/item in order
all_examples_f0_feature_values = [1.15, 3.15, 5.15, 8.15, 9.15]

example_feature_dict = {
   "f0": tf.constant([[(x,) for x in all_examples_f0_feature_values]], dtype=tf.float32,),
}

Making an inference/prediction request

Now we can merge the above context and example dictionaries into a single dictionary and use it as the model input by invoking the custom_predict_method signature:

features_input_tensor_dict = {**context_feature_dict, **example_feature_dict}
predictor = loaded_inference_model_with_signature.signatures["custom_predict_method"]
output_scores_tensor = predictor(**features_input_tensor_dict)["outputs"]

print(output_scores_tensor.numpy()[0])
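As an optional follow-up (not part of the original snippet), the returned scores have shape [1, num_examples], so ranking the five documents only requires sorting by score:

import numpy as np

scores = output_scores_tensor.numpy()[0]   # one score per example, shape (5,)
ranked_indices = np.argsort(scores)[::-1]  # document indices, best-scoring first
print(ranked_indices)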

Notes

I hope this can provide (partial?) answers to https://github.com/tensorflow/ranking/issues/302

Thanks

Thank you @uniq10 for inspiration and help

RodrigoVillatoro commented 2 years ago

Awesome, @azagniotov !!! Thank you for this, I will definitely try it out soon!

hengdashi commented 1 year ago

thank you for this! Very very helpful if you wanna convert your model to other formats!