tensorflow / text

Making text a first-class citizen in TensorFlow.
https://www.tensorflow.org/beta/tutorials/tensorflow_text/intro
Apache License 2.0

Tensorflow-text Ops when serving saved model over Tensorflow C-API #1215

Closed: Sklaebe closed this issue 4 months ago

Sklaebe commented 10 months ago

I have an application where I want to serve a saved TensorFlow model. I tried to serve pre-trained Hugging Face models for this purpose, for example a BERT model. In order to combine the tokenizer and the model into a single custom model, I built and saved my model as follows:

import tensorflow as tf
from transformers import TFBertForSequenceClassification, TFBertTokenizer

class EndToEndModel(tf.keras.Model):
    def __init__(self, checkpoint):
        super().__init__()
        # TFBertTokenizer tokenizes in-graph using tensorflow-text ops.
        self.tokenizer = TFBertTokenizer.from_pretrained(checkpoint)
        self.model = TFBertForSequenceClassification.from_pretrained(checkpoint)

    def call(self, inputs):
        tokenized = self.tokenizer(inputs)
        return self.model(**tokenized)

model = EndToEndModel(checkpoint="bert-base-cased")
# Run the model once to get input/output signature set
test_inputs = [["My First String", ""], ["Second Example String", ""]]
tensor = tf.convert_to_tensor(test_inputs)
output = model.predict(tensor)
model.save("mymodel")

I can load the model again in Python by specifying custom objects used in the model as follows:

model = tf.keras.models.load_model("mymodel", custom_objects={"TFBertForSequenceClassification": TFBertForSequenceClassification})

I have not yet found a way to load the model over the C API in the application. The code is as follows:

const char* tags = "serve";
int ntags = 1;
TF_SessionOptions *session_opts = TF_NewSessionOptions();
TF_Graph *graph = TF_NewGraph();
TF_Status *status = TF_NewStatus();
TF_Session *session = TF_LoadSessionFromSavedModel(session_opts, NULL, model_path, &tags, ntags, graph, NULL, status);

When running, I get the following output:

tensorflow/core/grappler/optimizers/tfg_optimizer_hook.cc:134] tfg_optimizer{any(tfg-consolidate-attrs,tfg-toposort,tfg-shape-inference{graph-version=0},tfg-prepare-attrs-export)} failed: INVALID_ARGUMENT: Unable to find OpDef for FastBertNormalize
        While importing function: __inference__wrapped_model_422962
        when importing GraphDef to MLIR module in GrapplerHook

and the error message retrieved from the status is

Op type not registered 'FastBertNormalize' in binary running on ubuntu. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.

There have been similar issues in tensorflow/serving, e.g. https://groups.google.com/a/tensorflow.org/g/developers/c/LUvQAm3BsAs, and there is also an explanation of how to use custom ops in serving: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/custom_op.md. However, these focus on usage from Python. Is there any way to use tensorflow-text operations in models served through the TensorFlow C API? I suppose a library would be needed that can be linked against the application, similar to the TensorFlow C libraries.

The Python modules tensorflow and tensorflow-text, as well as the TensorFlow C API libraries, are at version 2.13.0.

cantonios commented 9 months ago

This seems more like a TF question, and probably one they would say to ask on Stack Overflow.

Tensorflow-text doesn't currently expose a static library containing all the ops, but you could probably create one based on the :ops_lib target. Otherwise, you need to load each op library individually. For example, the pip package contains the shared library python/ops/_fast_bert_normalizer.so, which should register the op when loaded.
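
Roughly, that second option could look like the sketch below when done over the C API. This is untested and makes a few assumptions: that _fast_bert_normalizer.so has been copied out of the installed tensorflow_text pip package to a path the loader can find, that it is the only missing op library, and that the SavedModel lives in ./mymodel.

#include <stdio.h>
#include <tensorflow/c/c_api.h>

int main(void) {
    TF_Status *status = TF_NewStatus();

    /* Loading the shared library runs its registration code, which makes
       the FastBertNormalize op and kernel known to the runtime. */
    TF_Library *text_ops = TF_LoadLibrary("./_fast_bert_normalizer.so", status);
    if (TF_GetCode(status) != TF_OK) {
        fprintf(stderr, "Failed to load op library: %s\n", TF_Message(status));
        return 1;
    }

    /* The SavedModel can now be loaded as in the original snippet. */
    const char *tags = "serve";
    TF_SessionOptions *session_opts = TF_NewSessionOptions();
    TF_Graph *graph = TF_NewGraph();
    TF_Session *session = TF_LoadSessionFromSavedModel(
        session_opts, NULL, "./mymodel", &tags, 1, graph, NULL, status);
    if (TF_GetCode(status) != TF_OK) {
        fprintf(stderr, "Failed to load SavedModel: %s\n", TF_Message(status));
    } else {
        TF_CloseSession(session, status);
        TF_DeleteSession(session, status);
    }

    /* Frees only the handle; the library itself stays loaded. */
    TF_DeleteLibraryHandle(text_ops);
    TF_DeleteGraph(graph);
    TF_DeleteSessionOptions(session_opts);
    TF_DeleteStatus(status);
    return 0;
}

The same pattern should work for any other tensorflow-text op library the model turns out to need.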