tensorflow / hub

A library for transfer learning by reusing parts of TensorFlow models.
https://tensorflow.org/hub
Apache License 2.0

Bug: Can't save model after altering seq_length #845

Closed MrMwalker2 closed 2 years ago

MrMwalker2 commented 2 years ago

What happened?

Hi there,

I'm trying to retrain the pre-trained BERT model from TF Hub. I'm using the following preprocessor: https://tfhub.dev/tensorflow/bert_multi_cased_preprocess/3 along with the multilingual encoder https://tfhub.dev/tensorflow/bert_multi_cased_L-12_H-768_A-12/4.

I'm running TensorFlow 2.5.0 and the latest version of TensorFlow Hub.

When I run the code as shown here under basic usage:

text_input = tf.keras.layers.Input(shape=(), dtype=tf.string)
preprocessor = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_multi_cased_preprocess/3")
encoder_inputs = preprocessor(text_input)

I can successfully save the model as .h5 to use for serving.

However, when I try to increase the seq_length like this:

def build_multiclassifier_bert_model(num_classes):
    text_input = [tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')]
    preprocessor = hub.load(TFHUB_PREPROCESSER)
    tokenize = hub.KerasLayer(preprocessor.tokenize)
    tokenized_inputs = [tokenize(segment) for segment in text_input]
    bert_pack_inputs = hub.KerasLayer(preprocessor.bert_pack_inputs,
                            arguments=dict(seq_length=txt_length))
    encoder_inputs = bert_pack_inputs(tokenized_inputs)
    encoder = hub.KerasLayer(TFHUB_ENCODER, trainable=True, name='BERT_encoder')
    outputs = encoder(encoder_inputs)

    multi_bert = outputs['pooled_output']
    multi_bert = tf.keras.layers.Dropout(0.1)(multi_bert)
    multi_bert = tf.keras.layers.Dense(num_classes, activation='softmax', name='classifier')(multi_bert)

    return tf.keras.Model(text_input, multi_bert)

I'm no longer able to save the model. How can I increase the seq_length and still save the model as .h5?

I also tried calling model.save('model'), which saves it as a pb file, so that I could load it again and save it as .h5, but I ran into another issue while loading the saved model.

Relevant code

import argparse
import hashlib
import hmac
import h5py
import json
import os
from random import seed
from numpy import dtype
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # imported for its side effect: registers the ops the BERT preprocessor needs
from pathlib2 import Path
import matplotlib.pyplot as plt
import numpy as np
import sklearn

AUTOTUNE = tf.data.experimental.AUTOTUNE
SEED = 36

dset='dset'
epochs=1
batch_size=2
txt_length=256
learning_rate=0.0001
output='model'

# Multilingual BERT encoder and its matching preprocessor
TFHUB_ENCODER = 'https://tfhub.dev/tensorflow/bert_multi_cased_L-12_H-768_A-12/4'
TFHUB_PREPROCESSER = 'https://tfhub.dev/tensorflow/bert_multi_cased_preprocess/3'

def build_multiclassifier_bert_model(num_classes):
    # Single string input, wrapped in a list because bert_pack_inputs takes a list of segments.
    text_input = [tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')]
    # Load the preprocessor as an object so tokenize and bert_pack_inputs can be wrapped separately.
    preprocessor = hub.load(TFHUB_PREPROCESSER)
    tokenize = hub.KerasLayer(preprocessor.tokenize)
    tokenized_inputs = [tokenize(segment) for segment in text_input]
    bert_pack_inputs = hub.KerasLayer(preprocessor.bert_pack_inputs,
                            arguments=dict(seq_length=txt_length))
    encoder_inputs = bert_pack_inputs(tokenized_inputs)
    encoder = hub.KerasLayer(TFHUB_ENCODER, trainable=True, name='BERT_encoder')
    outputs = encoder(encoder_inputs)
    multi_bert = outputs['pooled_output']
    multi_bert = tf.keras.layers.Dropout(0.1)(multi_bert)
    multi_bert = tf.keras.layers.Dense(num_classes, activation='softmax', name='classifier')(multi_bert)
    return tf.keras.Model(text_input, multi_bert)

raw_train_ds = tf.keras.preprocessing.text_dataset_from_directory(
    os.path.join(dset, 'train'),
    batch_size = batch_size,
    validation_split=0.2,
    subset='training',
    seed = SEED
)
class_names = raw_train_ds.class_names
train_ds = raw_train_ds.cache().prefetch(buffer_size=AUTOTUNE)

val_ds = tf.keras.preprocessing.text_dataset_from_directory(
        os.path.join(dset, 'train'),
        batch_size=batch_size,
        validation_split=0.2,
        subset='validation',
        seed=SEED
)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

test_ds = tf.keras.preprocessing.text_dataset_from_directory(
    os.path.join(dset, 'test'),
    batch_size=batch_size
)

classifier_model = build_multiclassifier_bert_model(len(class_names))

# Labels are integer class indices (the default label_mode), so a sparse loss is used below.

classifier_model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['acc']
)

history = classifier_model.fit(
    x=train_ds,
    validation_data=val_ds,
    epochs=1
)

loss, accuracy = classifier_model.evaluate(test_ds)

tf.saved_model.save(classifier_model, str(output))

# Saving to HDF5 is the call that raises the NotImplementedError shown in the log output.
file_output = str(Path(output).joinpath('latest.h5'))
classifier_model.save(file_output)

Relevant log output

Traceback (most recent call last):
  File "/scripts/train.py", line 268, in <module>
    model_signature = run(**args)
  File "/scripts/train.py", line 190, in run
    classifier_model.save(file_output)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1974, in save
    signatures, options)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/save.py", line 131, in save_model
    model, filepath, overwrite, include_optimizer)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/hdf5_format.py", line 109, in save_model_to_hdf5
    model_metadata = saving_utils.model_metadata(model, include_optimizer)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saving_utils.py", line 157, in model_metadata
    raise e
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saving_utils.py", line 154, in model_metadata
    model_config['config'] = model.get_config()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py", line 598, in get_config
    return copy.deepcopy(get_network_config(self))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py", line 1278, in get_network_config
    layer_config = serialize_layer_fn(layer)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/generic_utils.py", line 250, in serialize_keras_object
    raise e
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/generic_utils.py", line 245, in serialize_keras_object
    config = instance.get_config()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_hub/keras_layer.py", line 332, in get_config
    "Got `type(handle)`: {}".format(type(self._handle)))
NotImplementedError: Can only generate a valid config for `hub.KerasLayer(handle, ...)`that uses a string `handle`.
Got `type(handle)`: <class 'tensorflow.python.saved_model.load.Loader._recreate_base_user_object.<locals>._UserObject'>
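
For readers following the traceback: the exception is raised from the layer's get_config(), which the HDF5 save path calls for every layer in the model. In the failing code the hub.KerasLayer instances wrap preprocessor.tokenize and preprocessor.bert_pack_inputs, which are loaded objects and not string handles. The snippet below is only a simplified stand-in for that check, written in plain Python so it runs without TensorFlow. The names StubHubLayer and can_serialize are hypothetical and are not part of tensorflow_hub; the real layer does considerably more.

```python
class StubHubLayer:
    """Hypothetical stand-in for hub.KerasLayer; models only the handle check."""

    def __init__(self, handle, **arguments):
        self._handle = handle
        self._arguments = arguments

    def get_config(self):
        # A config must be rebuildable from plain data, so only a string
        # handle (a URL or a path) can be written out.
        if not isinstance(self._handle, str):
            raise NotImplementedError(
                "Can only generate a valid config for a string handle; "
                "got {}".format(type(self._handle)))
        return {"handle": self._handle, "arguments": dict(self._arguments)}


def can_serialize(layer):
    """Return True if the layer can produce a config, as HDF5 saving requires."""
    try:
        layer.get_config()
        return True
    except NotImplementedError:
        return False


# Built from a URL string: the config can be generated.
string_layer = StubHubLayer(
    "https://tfhub.dev/tensorflow/bert_multi_cased_preprocess/3")

# Built from a callable (standing in for preprocessor.bert_pack_inputs):
# there is no string to record, so get_config() raises.
callable_layer = StubHubLayer(lambda tokens: tokens, seq_length=256)

print(can_serialize(string_layer))
print(can_serialize(callable_layer))
```

Under this reading, the first snippet in the report saves because its preprocessor layer is created directly from a URL, while the seq_length variant fails because its layers are created from callables.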

tensorflow_hub Version

0.12.0 (latest stable release)

TensorFlow Version

other (please specify): 2.5.0

Other libraries

No response

Python Version

3.x

OS

Linux

pindinagesh commented 2 years ago

@MrMwalker2

Can you take a look at the workaround proposed in this link and see if it helps resolve your issue? You can also refer to this issue, which discusses a similar problem. Thanks!

MrMwalker2 commented 2 years ago

@pindinagesh

Thanks for your reply. The latter one is exactly what I was trying, and it generates the error mentioned above. The workaround isn't exactly what I'm looking for.

WGierke commented 2 years ago

Unfortunately, at this time we cannot provide more guidance than what was mentioned in https://github.com/tensorflow/hub/issues/845#issuecomment-1067822239. Feel free to re-open it if there are more insights or if you've found a workaround that works for you.