keras-team / keras-io

Keras documentation, hosted live at keras.io
Apache License 2.0

Cannot export a slightly customized XLMRoberta model from keras_nlp #1863

Closed by YangIsNotAvailable 1 month ago

YangIsNotAvailable commented 1 month ago

Issue Type

Bug

Source

source

Keras Version

3.3.3

Custom Code

Yes

OS Platform and Distribution

Ubuntu 20.04.6 LTS

Python version

3.10

GPU model and memory

No response

Current Behavior?

Calling `model.export()` fails with an `AssertionError` about an untracked resource, so the model cannot be exported.

Standalone code to reproduce the issue or tutorial link

import keras
from keras_nlp.models import XLMRobertaPreprocessor, XLMRobertaBackbone
import tensorflow as tf

# Load the pretrained preprocessor (SentencePiece tokenizer) and backbone.
preprocessor = XLMRobertaPreprocessor.from_preset("xlm_roberta_base_multi")
backbone = XLMRobertaBackbone.from_preset("xlm_roberta_base_multi")

# Build an end-to-end model that takes raw strings as input.
inputs = keras.Input(shape=(), dtype=tf.string)
x = preprocessor(inputs)
x = backbone(x)
x = keras.layers.GlobalAveragePooling1D()(x)
outputs = keras.layers.Dense(10)(x)
model = keras.Model(inputs, outputs)

model.compile(optimizer=keras.optimizers.AdamW())

# This call raises the AssertionError shown in the log output.
model.export("./test.tfsm")
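
The `model.export` call at the end of the script is where the failure occurs. The log output attributes it to TensorFlow's tracking rule: any resource captured by an exported function must be reachable as an attribute of the exported object. The sketch below is not from the original report. It tries to reproduce the same class of failure in isolation, using a plain `tf.Variable` in place of the SentencePiece resource. The names `TrackedScale`, `UntrackedScale`, `loose_factor`, and `try_export` are hypothetical and exist only for illustration. The exact exception type may vary by TensorFlow version, so the helper reports whatever is raised.

```python
import tempfile

import tensorflow as tf


class TrackedScale(tf.Module):
    """The variable is an attribute of the module, so the exporter can find it."""

    def __init__(self):
        super().__init__()
        self.factor = tf.Variable(2.0)

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return x * self.factor


# Hypothetical stand-in for an untracked resource: a variable that is not
# assigned to any attribute of the object being exported.
loose_factor = tf.Variable(3.0)


class UntrackedScale(tf.Module):
    """The function captures `loose_factor`, which the module does not own."""

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return x * loose_factor


def try_export(module):
    """Return None if saving succeeds, else the name of the raised exception type."""
    with tempfile.TemporaryDirectory() as export_dir:
        try:
            tf.saved_model.save(module, export_dir)
        except Exception as err:  # the type differs across TensorFlow versions
            return type(err).__name__
    return None
```

Under these assumptions, `try_export(TrackedScale())` should return `None`, while `try_export(UntrackedScale())` should return the name of an exception type. This only illustrates the rule quoted in the log; it does not show how keras_nlp manages its tokenizer resource internally.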

Relevant log output

```shell
AssertionError: Tried to export a function which references an 'untracked' resource. TensorFlow objects (e.g. tf.Variable) captured by functions must be 'tracked' by assigning them to an attribute of a tracked object or assigned to an attribute of the main object directly. See the information below:
        Function name = b'__inference_signature_wrapper___call___11987'
        Captured Tensor = <ResourceHandle(name="_0_SentencepieceOp", device="/job:localhost/replica:0/task:0/device:CPU:0", container="localhost", type="tensorflow::text::(anonymous namespace)::SentencepieceResource", dtype and shapes : "[  ]")>
        Trackable referencing this tensor = <tensorflow_text.python.ops.sentencepiece_tokenizer._SentencepieceModelResource object at 0x7fe279bef640>
        Internal Tensor = Tensor("11587:0", shape=(), dtype=resource)
```

YangIsNotAvailable commented 1 month ago

I submitted this issue to the keras-nlp repository instead.