tensorflow / hub

A library for transfer learning by reusing parts of TensorFlow models.
https://tensorflow.org/hub
Apache License 2.0

Bug: BERT preprocess load error #882

Closed Breezelled closed 1 year ago

Breezelled commented 1 year ago

What happened?

I'm trying to use the preprocessing model from https://tfhub.dev/tensorflow/bert_en_cased_preprocess/3 while training BERT, but loading it fails.

Relevant code

def build_classifier_model():
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
    preprocessing_layer = hub.KerasLayer(tfhub_handle_preprocess, name='preprocessing')
    encoder_inputs = preprocessing_layer(text_input)
    encoder = hub.KerasLayer(tfhub_handle_encoder, trainable=True, name='BERT_encoder')
    outputs = encoder(encoder_inputs)
    net = outputs['pooled_output']
    net = tf.keras.layers.Dropout(0.1)(net)
    net = tf.keras.layers.Dense(1, activation=None, name='classifier')(net)
    return tf.keras.Model(text_input, net)

Relevant log output

Traceback (most recent call last):
  File "/Users/breeze/.../model_train.py", line 11, in <module>
    classifier_model = bm.build_classifier_model()
  File "/Users/breeze/.../build_model.py", line 99, in build_classifier_model
    preprocessing_layer = hub.KerasLayer(tfhub_handle_preprocess, name='preprocessing')
  File "/Users/breeze/.../Miniconda/envs/dl/lib/python3.9/site-packages/tensorflow_hub/keras_layer.py", line 157, in __init__
    self._func = load_module(handle, tags, self._load_options)
  File "/Users/breeze/.../Miniconda/envs/dl/lib/python3.9/site-packages/tensorflow_hub/keras_layer.py", line 459, in load_module
    return module_v2.load(handle, tags=tags, options=set_load_options)
  File "/Users/breeze/.../Miniconda/envs/dl/lib/python3.9/site-packages/tensorflow_hub/module_v2.py", line 107, in load
    raise ValueError("Trying to load a model of incompatible/unknown type. "
ValueError: Trying to load a model of incompatible/unknown type. '/var/folders/79/mpnlph953xb8wrtccgd52c400000gn/T/tfhub_modules/b514e0625a489a1a48844d70f33193aec2816f05' contains neither 'saved_model.pb' nor 'saved_model.pbtxt'.

tensorflow_hub Version

0.13.0.dev (unstable development build)

TensorFlow Version

2.8 (latest stable release)

Other libraries

tensorboard==2.12.1
tensorboard-data-server==0.7.0
tensorboard-plugin-wit==1.8.1
tensorflow-addons==0.20.0
tensorflow-datasets==4.9.0
tensorflow-estimator==2.12.0
tensorflow-hub==0.13.0
tensorflow-macos==2.12.0
tensorflow-metadata==1.13.0
tensorflow-metal==0.8.0
tensorflow-model-optimization==0.7.4
tensorflow-text==2.12.0

Python Version

3.x

OS

macOS

Breezelled commented 1 year ago

I trained and tested my code and model successfully over the previous few days, but the error appeared today when I tried to view the results.

singhniraj08 commented 1 year ago

@Breezelled,

I was able to run your model-building function successfully with the setup below. Please see the gist for reference.

Also, you can follow the BERT guide to choose a compatible BERT preprocessing model and BERT encoder model. Please try running your code with the latest release versions listed below and let us know if you face any issues. Thank you!

Tensorflow Version: 2.11.1
Tensorflow Hub Version: 0.13.0
Tensorflow Text Version: 2.11.0

Breezelled commented 1 year ago

Hi! Thank you for your answer, but I still cannot run the model; I get the same error.

tfhub_handle_encoder: "https://tfhub.dev/tensorflow/bert_en_wwm_cased_L-24_H-1024_A-16/4"
tfhub_handle_preprocess: "https://tfhub.dev/tensorflow/bert_en_cased_preprocess/3"

These handles already follow the BERT guide for a compatible BERT preprocessing model and BERT encoder model. However, I can train and test my model on my Windows machine but still cannot run it on my M1 Max MacBook Pro.

Here is the code and the command-line output. The file counts in the output come from the Large Movie Review Dataset, which I believe is unrelated to this error.

code:

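import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # registers the custom ops used by the preprocessing model

# Handles as listed above
tfhub_handle_encoder = "https://tfhub.dev/tensorflow/bert_en_wwm_cased_L-24_H-1024_A-16/4"
tfhub_handle_preprocess = "https://tfhub.dev/tensorflow/bert_en_cased_preprocess/3"

def build_classifier_model():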
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
    preprocessing_layer = hub.KerasLayer(tfhub_handle_preprocess, name='preprocessing')
    encoder_inputs = preprocessing_layer(text_input)
    encoder = hub.KerasLayer(tfhub_handle_encoder, trainable=True, name='BERT_encoder')
    outputs = encoder(encoder_inputs)
    net = outputs['pooled_output']
    net = tf.keras.layers.Dropout(0.1)(net)
    net = tf.keras.layers.Dense(1, activation=None, name='classifier')(net)
    return tf.keras.Model(text_input, net)

print("Tensorflow Version:", tf.__version__)
print("Tensorflow Hub Version:", hub.__version__)
print("Tensorflow Text Version:", text.__version__)

bert_model = build_classifier_model()
print(bert_model.summary())

output:

Found 25000 files belonging to 2 classes.
Using 20000 files for training.
Metal device set to: Apple M1 Max

systemMemory: 32.00 GB
maxCacheSize: 10.67 GB

Found 25000 files belonging to 2 classes.
Using 5000 files for validation.
Found 25000 files belonging to 2 classes.
BERT model selected           : https://tfhub.dev/tensorflow/bert_en_wwm_cased_L-24_H-1024_A-16/4
Preprocess model auto-selected: https://tfhub.dev/tensorflow/bert_en_cased_preprocess/3
Tensorflow Version: 2.12.0
Tensorflow Hub Version: 0.13.0
Tensorflow Text Version: 2.12.0
Traceback (most recent call last):
  File "/Users/breeze/.../model_test.py", line 3, in <module>
    from build_model import build_classifier_model
  File "/Users/breeze/.../build_model.py", line 112, in <module>
    bert_model = build_classifier_model()
  File "/Users/breeze/.../build_model.py", line 99, in build_classifier_model
    preprocessing_layer = hub.KerasLayer(tfhub_handle_preprocess, name='preprocessing')
  File "/Users/breeze/.../Miniconda/envs/dl/lib/python3.9/site-packages/tensorflow_hub/keras_layer.py", line 157, in __init__
    self._func = load_module(handle, tags, self._load_options)
  File "/Users/breeze/.../Miniconda/envs/dl/lib/python3.9/site-packages/tensorflow_hub/keras_layer.py", line 459, in load_module
    return module_v2.load(handle, tags=tags, options=set_load_options)
  File "/Users/breeze/.../Miniconda/envs/dl/lib/python3.9/site-packages/tensorflow_hub/module_v2.py", line 107, in load
    raise ValueError("Trying to load a model of incompatible/unknown type. "
ValueError: Trying to load a model of incompatible/unknown type. '/var/folders/79/mpnlph953xb8wrtccgd52c400000gn/T/tfhub_modules/b514e0625a489a1a48844d70f33193aec2816f05' contains neither 'saved_model.pb' nor 'saved_model.pbtxt'.

Breezelled commented 1 year ago

I can reload the previously trained model and successfully get the test result, and I can also deploy and serve that model with tensorflow-serving in a Docker container. However, I am unable to test the model from the hub on my M1 Max MacBook.

Reloading the previously trained model:

import tensorflow as tf
import tensorflow_text as text
from build_model import build_classifier_model
reloaded_model = tf.saved_model.load("models/imdb_bert_wwm_batch12_lr3e-5_epoch2")

def print_my_examples(inputs, results):
    result_for_printing = \
        [f'input: {inputs[i]:<30} : score: {results[i][0]:.6f}'
         for i in range(len(inputs))]
    print(*result_for_printing, sep='\n')
    print()

# classifier_model = build_classifier_model()

examples = [
    "test sentence here.",
]

# Map the raw logits to probabilities with a sigmoid
reloaded_results = tf.sigmoid(reloaded_model(tf.constant(examples)))
# original_results = tf.sigmoid(classifier_model(tf.constant(examples)))

print('Results from the saved model:')
print_my_examples(examples, reloaded_results)
# print('Results from the model in memory:')
# print_my_examples(examples, original_results)

singhniraj08 commented 1 year ago

@Breezelled,

Can you navigate to the '/var/folders/79/mpnlph953xb8wrtccgd52c400000gn/T/tfhub_modules/b514e0625a489a1a48844d70f33193aec2816f05' directory on your Mac and check for the saved model files manually? TF-Hub creates a temp directory to cache the models it downloads. However, after a few days or so, the contents of that temp directory (the downloaded model files) will be deleted. When you try to load the model again, TF-Hub looks in the same temp directory, but the files are no longer there.
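
As a rough illustration of the caching behaviour described above, the sketch below shows where the cache is expected to live and how it could be moved out of the temp directory. It assumes that tensorflow_hub honours the TFHUB_CACHE_DIR environment variable and otherwise falls back to a tfhub_modules folder under the system temp directory; both function names are hypothetical helpers, not part of tensorflow_hub.

```python
import os
import tempfile


def current_hub_cache_dir():
    """Best guess at the TF-Hub cache location (assumed default, see lead-in)."""
    return os.environ.get("TFHUB_CACHE_DIR") or os.path.join(
        tempfile.gettempdir(), "tfhub_modules")


def use_persistent_hub_cache(path):
    """Create `path` and point TFHUB_CACHE_DIR at it; return the absolute path.

    Hypothetical helper: call it before any hub.KerasLayer(...) / hub.load(...)
    so that downloads land outside the temp directory that gets cleaned up.
    """
    path = os.path.abspath(os.path.expanduser(path))
    os.makedirs(path, exist_ok=True)
    os.environ["TFHUB_CACHE_DIR"] = path
    return path
```

For example, calling use_persistent_hub_cache("~/tfhub_cache") (an arbitrary location chosen for illustration) at the top of the script, before the model is built, should keep the downloaded files from being removed by temp-directory cleanup.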

The workaround is to delete the stale cache folder (in your case, the b514e0625a489a1a48844d70f33193aec2816f05 folder) and run the code again. TF-Hub will then download the model into a fresh directory and load it from there on later runs. The permanent solution is to download the model archive, extract it to a local directory, and pass that directory's path when loading the model. Hope this helps. Thank you!

# Downloads the model into the temp cache on the first run; later runs load it from that cache.
bert_preprocess = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")

# Alternatively, download the model archive, extract it, and load the model from the extracted local directory.
bert_preprocess = hub.KerasLayer("<saved model path>/bert_en_uncased_preprocess_3")
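
For the local-copy route, here is a minimal sketch of unpacking a downloaded archive. It assumes hub.KerasLayer needs a directory that directly contains saved_model.pb (which is what the error message in this issue checks for) rather than the archive file itself. The function name and paths are hypothetical, and the archive is assumed to be a tar file you trust.

```python
import os
import tarfile


def extract_hub_archive(archive_path, dest_dir):
    """Unpack a downloaded model archive and return the directory to load from."""
    os.makedirs(dest_dir, exist_ok=True)
    # Default mode "r" lets tarfile detect gzip or plain tar automatically.
    with tarfile.open(archive_path) as archive:
        archive.extractall(dest_dir)  # only do this for archives you trust
    # Fail early with a clear message if the SavedModel file is missing.
    candidates = ("saved_model.pb", "saved_model.pbtxt")
    if not any(os.path.isfile(os.path.join(dest_dir, name)) for name in candidates):
        raise FileNotFoundError(
            f"{dest_dir} contains neither saved_model.pb nor saved_model.pbtxt")
    return dest_dir
```

The returned directory can then be passed to hub.KerasLayer in place of the tfhub.dev URL, so loading no longer depends on the temp cache.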

Breezelled commented 1 year ago

This really works! Thank you! I checked the files and folders under this path, '/var/folders/79/mpnlph953xb8wrtccgd52c400000gn/T/tfhub_modules/b514e0625a489a1a48844d70f33193aec2816f05', and they were indeed empty. I then ran my test successfully after deleting those empty folders under '/var/folders/79/mpnlph953xb8wrtccgd52c400000gn/T/tfhub_modules'.
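
The manual cleanup above could be scripted roughly as follows. This is only a sketch: it treats any cached module directory that lacks both saved_model.pb and saved_model.pbtxt (the two files named in the error message) as stale, and the function names are hypothetical. It should not be run while a download is in progress, since a partially downloaded module would also look stale.

```python
import os
import shutil


def find_stale_modules(cache_dir):
    """Return cached module directories that hold no SavedModel file."""
    stale = []
    if not os.path.isdir(cache_dir):
        return stale
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if not os.path.isdir(path):
            continue  # skip plain files that sit next to the module folders
        has_model = any(
            os.path.isfile(os.path.join(path, model_file))
            for model_file in ("saved_model.pb", "saved_model.pbtxt"))
        if not has_model:
            stale.append(path)
    return stale


def remove_stale_modules(cache_dir):
    """Delete the stale directories and return the list of removed paths."""
    stale = find_stale_modules(cache_dir)
    for path in stale:
        shutil.rmtree(path)
    return stale
```

Calling find_stale_modules on the tfhub_modules folder first shows what would be deleted; remove_stale_modules then clears those folders so that the next load downloads the model again.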

singhniraj08 commented 1 year ago

@Breezelled, could you please close this issue if it's resolved? Thank you!
