tensorflow / text

Making text a first-class citizen in TensorFlow.
https://www.tensorflow.org/beta/tutorials/tensorflow_text/intro
Apache License 2.0

Tensorflow-text Ops when serving saved model over Tensorflow C-API #1215

Closed: Sklaebe closed this issue 4 months ago

Sklaebe commented 10 months ago

I have an application where I want to serve a saved TensorFlow model. I tried to serve pre-trained Hugging Face models for this purpose, for example a BERT model. In order to combine the tokenizer and the model into a single custom model, I built and saved my model as follows:

import tensorflow as tf
from transformers import TFBertForSequenceClassification, TFBertTokenizer

class EndToEndModel(tf.keras.Model):
    def __init__(self, checkpoint):
        super().__init__()
        # TFBertTokenizer tokenizes in-graph using tensorflow-text ops.
        self.tokenizer = TFBertTokenizer.from_pretrained(checkpoint)
        self.model = TFBertForSequenceClassification.from_pretrained(checkpoint)

    def call(self, inputs):
        tokenized = self.tokenizer(inputs)
        return self.model(**tokenized)

model = EndToEndModel(checkpoint="bert-base-cased")
# Run the model once to get input/output signature set
test_inputs = [["My First String", ""], ["Second Example String", ""]]
tensor = tf.convert_to_tensor(test_inputs)
output = model.predict(tensor)
model.save("mymodel")

I can load the model again in Python by specifying custom objects used in the model as follows:

model = tf.keras.models.load_model("mymodel", custom_objects={"TFBertForSequenceClassification": TFBertForSequenceClassification})

I have not yet found a way to load the model over the C API in the application. The code is as follows:

const char* tags = "serve";
int ntags = 1;
TF_SessionOptions *session_opts = TF_NewSessionOptions();
TF_Graph *graph = TF_NewGraph();
TF_Status *status = TF_NewStatus();
TF_Session *session = TF_LoadSessionFromSavedModel(session_opts, NULL, model_path, &tags, ntags, graph, NULL, status);

When running, I get the following output:

tensorflow/core/grappler/optimizers/tfg_optimizer_hook.cc:134] tfg_optimizer{any(tfg-consolidate-attrs,tfg-toposort,tfg-shape-inference{graph-version=0},tfg-prepare-attrs-export)} failed: INVALID_ARGUMENT: Unable to find OpDef for FastBertNormalize
        While importing function: __inference__wrapped_model_422962
        when importing GraphDef to MLIR module in GrapplerHook

and the error message retrieved from the status is

Op type not registered 'FastBertNormalize' in binary running on ubuntu. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.

There have been similar issues in tensorflow/serving, e.g. https://groups.google.com/a/tensorflow.org/g/developers/c/LUvQAm3BsAs, and there is also an explanation of how to use custom ops in serving: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/custom_op.md. However, these focus on usage from Python. Is there any way to use tensorflow-text operations in models served through the TensorFlow C API? I suppose a library would be needed that can be linked against the application, similar to the TensorFlow C libraries.

The Python modules tensorflow and tensorflow-text, as well as the TensorFlow C API libraries, are at version 2.13.0.

cantonios commented 9 months ago

This seems more like a TF question, and probably one they would say to ask on Stack Overflow.

Tensorflow-text doesn't currently expose a static library containing all the ops, but you could probably create one based on the :ops_lib target. Otherwise, you need to load each op library individually. For example, the pip package contains the shared library python/ops/_fast_bert_normalizer.so, which should register the op when loaded.
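
Roughly, that second option could look like the sketch below when done over the C API. This is untested and makes a few assumptions: that _fast_bert_normalizer.so has been copied out of the installed tensorflow_text pip package to a path the loader can find, that it is the only missing op library, and that the SavedModel lives in ./mymodel.

#include <stdio.h>
#include <tensorflow/c/c_api.h>

int main(void) {
    TF_Status *status = TF_NewStatus();

    /* Loading the shared library runs its registration code, which makes
       the FastBertNormalize op and kernel known to the runtime. */
    TF_Library *text_ops = TF_LoadLibrary("./_fast_bert_normalizer.so", status);
    if (TF_GetCode(status) != TF_OK) {
        fprintf(stderr, "Failed to load op library: %s\n", TF_Message(status));
        return 1;
    }

    /* The SavedModel can now be loaded as in the original snippet. */
    const char *tags = "serve";
    TF_SessionOptions *session_opts = TF_NewSessionOptions();
    TF_Graph *graph = TF_NewGraph();
    TF_Session *session = TF_LoadSessionFromSavedModel(
        session_opts, NULL, "./mymodel", &tags, 1, graph, NULL, status);
    if (TF_GetCode(status) != TF_OK) {
        fprintf(stderr, "Failed to load SavedModel: %s\n", TF_Message(status));
    } else {
        TF_CloseSession(session, status);
        TF_DeleteSession(session, status);
    }

    /* Frees only the handle; the library itself stays loaded. */
    TF_DeleteLibraryHandle(text_ops);
    TF_DeleteGraph(graph);
    TF_DeleteSessionOptions(session_opts);
    TF_DeleteStatus(status);
    return 0;
}

The same pattern should work for any other tensorflow-text op library the model turns out to need.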