tensorflow / hub

A library for transfer learning by reusing parts of TensorFlow models.
https://tensorflow.org/hub
Apache License 2.0

Unable to load saved Keras model with hub.KerasLayer during Inference in jupyter notebook(AWS SageMaker) #687

Closed kirankunapuli closed 3 years ago

kirankunapuli commented 4 years ago

Issue:

I tried to load a saved Keras model that contains a hub.KerasLayer wrapping universal-sentence-encoder-multilingual-large. The model was saved during a SageMaker training job. Training runs without errors; the training code and the Dockerfile used to create the instance are given below.

However, I cannot load it with load_model("model.h5", custom_objects={"KerasLayer": hub.KerasLayer}) in a Jupyter notebook (SageMaker notebook), because loading looks for a hard-coded OS path. The error and the prediction code are given below.

Error:

OSError: /opt/ml/code/google-use-multi-large/ does not exist.

I created /opt/ml/code/google-use-multi-large/ and ~/opt/ml/code/google-use-multi-large/ locally in the notebook environment, but the same error occurs.

Dockerfile:

```Dockerfile
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:2.3.0-gpu-py37-cu102-ubuntu18.04

RUN mkdir -p /opt/ml/code
WORKDIR /opt/ml/code

# The directory name matches the path reported in the error message.
RUN mkdir google-use-multi-large
RUN wget "https://storage.googleapis.com/tfhub-modules/google/universal-sentence-encoder-multilingual-large/3.tar.gz" && \
    tar -zxvf 3.tar.gz --directory google-use-multi-large/

ENV HUB_PATH='/opt/ml/code/google-use-multi-large/'
```

Training Code:

```python
import os

import tensorflow as tf
import tensorflow_text  # registers the ops the multilingual encoder needs
import tensorflow_hub as hub
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.models import Model

# Local module directory set in the training image (see the Dockerfile).
module_url = os.environ["HUB_PATH"]


def build_model(module_url, num_classes):
    inputs = Input(shape=(1,), dtype=tf.string)
    embed_hublayer = hub.KerasLayer(
        module_url,
        input_shape=[],
        dtype=tf.string,
        trainable=False,
        name="USE_embedding",
    )
    embedding = embed_hublayer(tf.squeeze(tf.cast(inputs, tf.string)))
    x = Dense(256, activation="relu")(embedding)
    x = Dropout(0.3)(x)
    outputs = Dense(num_classes, activation="softmax")(x)
    model = Model(inputs=inputs, outputs=outputs)
    model.compile(
        loss="sparse_categorical_crossentropy", optimizer="adam", metrics=["accuracy"]
    )
    # summary() returns None, so pass the logger as the print function instead.
    model.summary(print_fn=logger.info)
    return model


# logger, num_classes, the data arrays, batch_size, args and callbacks_list
# are defined elsewhere in the training script.
model = build_model(module_url=module_url, num_classes=num_classes)
model.fit(
    x_train,
    y_train,
    batch_size=batch_size,
    epochs=args.epochs,
    callbacks=callbacks_list,
    validation_data=validation_data,
    shuffle=True,
    verbose=2,
)
model.save("model.h5")
```

Prediction Code:

```python
import tensorflow as tf
import tensorflow_text
import tensorflow_hub as hub
from tensorflow.keras.models import load_model

model = load_model("model.h5", custom_objects={"KerasLayer": hub.KerasLayer})
```

Full Error Log:

```python
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
 in
----> 1 load_model(f'./model-mllen2/model.h5', custom_objects={'KerasLayer': hub.KerasLayer})

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py in load_model(filepath, custom_objects, compile, options)
    180     if (h5py is not None and (
    181         isinstance(filepath, h5py.File) or h5py.is_hdf5(filepath))):
--> 182       return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
    183
    184     filepath = path_to_string(filepath)

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/saving/hdf5_format.py in load_model_from_hdf5(filepath, custom_objects, compile)
    176     model_config = json.loads(model_config.decode('utf-8'))
    177     model = model_config_lib.model_from_config(model_config,
--> 178                                                custom_objects=custom_objects)
    179
    180     # set weights

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/saving/model_config.py in model_from_config(config, custom_objects)
     53                     '`Sequential.from_config(config)`?')
     54   from tensorflow.python.keras.layers import deserialize  # pylint: disable=g-import-not-at-top
---> 55   return deserialize(config, custom_objects=custom_objects)
     56
     57

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py in deserialize(config, custom_objects)
    173       module_objects=LOCAL.ALL_OBJECTS,
    174       custom_objects=custom_objects,
--> 175       printable_module_name='layer')

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
    356             custom_objects=dict(
    357                 list(_GLOBAL_CUSTOM_OBJECTS.items()) +
--> 358                 list(custom_objects.items())))
    359       with CustomObjectScope(custom_objects):
    360         return cls.from_config(cls_config)

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py in from_config(cls, config, custom_objects)
    615     """
    616     input_tensors, output_tensors, created_layers = reconstruct_from_config(
--> 617         config, custom_objects)
    618     model = cls(inputs=input_tensors, outputs=output_tensors,
    619                 name=config.get('name'))

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py in reconstruct_from_config(config, custom_objects, created_layers)
   1202   # First, we create all layers and enqueue nodes to be processed
   1203   for layer_data in config['layers']:
-> 1204     process_layer(layer_data)
   1205   # Then we process nodes in order of layer depth.
   1206   # Nodes that cannot yet be processed (if the inbound node

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py in process_layer(layer_data)
   1184       from tensorflow.python.keras.layers import deserialize as deserialize_layer  # pylint: disable=g-import-not-at-top
   1185
-> 1186       layer = deserialize_layer(layer_data, custom_objects=custom_objects)
   1187       created_layers[layer_name] = layer
   1188

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py in deserialize(config, custom_objects)
    173       module_objects=LOCAL.ALL_OBJECTS,
    174       custom_objects=custom_objects,
--> 175       printable_module_name='layer')

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
    358                 list(custom_objects.items())))
    359       with CustomObjectScope(custom_objects):
--> 360         return cls.from_config(cls_config)
    361     else:
    362       # Then `cls` may be a function returning a class.

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in from_config(cls, config)
    695         A layer instance.
    696     """
--> 697     return cls(**config)
    698
    699   def compute_output_shape(self, input_shape):

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_hub/keras_layer.py in __init__(self, handle, trainable, arguments, _sentinel, tags, signature, signature_outputs_as_dict, output_key, output_shape, load_options, **kwargs)
    158
    159     self._load_options = load_options
--> 160     self._func = load_module(handle, tags, self._load_options)
    161     self._has_training_argument = func_has_training_argument(self._func)
    162     self._is_hub_module_v1 = getattr(self._func, "_is_hub_module_v1", False)

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_hub/keras_layer.py in load_module(handle, tags, load_options)
    427     except ImportError:
    428       set_load_options = load_options
--> 429     return module_v2.load(handle, tags=tags, options=set_load_options)
    430
    431

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_hub/module_v2.py in load(handle, tags, options)
     99   if not isinstance(handle, six.string_types):
    100     raise ValueError("Expected a string, got %s" % handle)
--> 101   module_path = resolve(handle)
    102   is_hub_module_v1 = tf.io.gfile.exists(
    103       native_module.get_module_proto_path(module_path))

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_hub/module_v2.py in resolve(handle)
     51     A string representing the Module path.
     52   """
---> 53   return registry.resolver(handle)
     54
     55

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_hub/registry.py in __call__(self, *args, **kwargs)
     42     for impl in reversed(self._impls):
     43       if impl.is_supported(*args, **kwargs):
---> 44         return impl(*args, **kwargs)
     45       else:
     46         logging.info("%s %s does not support the provided handle.", self._name,

~/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_hub/resolver.py in __call__(self, handle)
    494   def __call__(self, handle):
    495     if not tf_v1.gfile.Exists(handle):
--> 496       raise IOError("%s does not exist." % handle)
    497     return handle

OSError: /opt/ml/code/google-use-multi-large/ does not exist.
```
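
The traceback ends in `KerasLayer.__init__`, which `from_config` calls with the config stored in the H5 file, so the handle recorded at save time is replayed on load. A minimal sketch for checking which handle a saved file carries, assuming the file has the `model_config` attribute that `load_model_from_hdf5` decodes in the traceback. The helper names `find_hub_handles` and `handles_in_h5` are hypothetical, and the example config is an illustrative fragment, not read from the real file:

```python
import json


def find_hub_handles(model_config):
    """Return {layer_name: handle} for each top-level KerasLayer in a model config dict."""
    handles = {}
    for layer in model_config.get("config", {}).get("layers", []):
        if layer.get("class_name") == "KerasLayer":
            cfg = layer.get("config", {})
            handles[cfg.get("name")] = cfg.get("handle")
    return handles


def handles_in_h5(path):
    """Read the model config from an H5 file written by model.save() and list its hub handles."""
    import h5py  # imported lazily so find_hub_handles works without h5py installed

    with h5py.File(path, "r") as f:
        raw = f.attrs["model_config"]
    if isinstance(raw, bytes):  # some h5py versions return bytes, others str
        raw = raw.decode("utf-8")
    return find_hub_handles(json.loads(raw))


# Illustrative config fragment shaped like the model in this issue.
example = {
    "class_name": "Functional",
    "config": {
        "layers": [
            {"class_name": "InputLayer", "config": {"name": "input_1"}},
            {
                "class_name": "KerasLayer",
                "config": {
                    "name": "USE_embedding",
                    "handle": "/opt/ml/code/google-use-multi-large/",
                },
            },
        ]
    },
}
print(find_hub_handles(example))
```

In the notebook, `handles_in_h5("model.h5")` would show the path that must exist (or be replaced) before `load_model` can succeed. The sketch only looks at top-level layers and does not descend into nested models.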

Version Info:

Python 3.7 while training
Python 3.6 while loading model in SageMaker notebook
Package versions are the same in both envs

tensorflow==2.3.0
tensorflow-hub==0.9.0
tensorflow-text==2.3.0

I would like to know how to overcome this hard-coded OS path error and load the saved Keras model in an environment other than the one where it was trained, such as a SageMaker notebook or a Google Colab notebook. Alternatively, how can I modify the build_model function in the training script?
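
One possible way around the stored path, sketched under assumptions and not verified against the model in this issue: rewrite the handle in the saved config to a location that exists in the new environment, rebuild the model from the patched config, then restore the weights from the same file. `rewrite_hub_handle` and `load_with_new_handle` are hypothetical helper names; the sketch assumes the new handle points to the same module version that was used for training.

```python
import json


def rewrite_hub_handle(config_json, old_handle, new_handle):
    """Return the model config JSON with old_handle replaced in every top-level KerasLayer."""
    config = json.loads(config_json)
    for layer in config["config"]["layers"]:
        if (
            layer["class_name"] == "KerasLayer"
            and layer["config"].get("handle") == old_handle
        ):
            layer["config"]["handle"] = new_handle
    return json.dumps(config)


def load_with_new_handle(h5_path, old_handle, new_handle):
    """Rebuild the model with a new hub handle, then restore weights from h5_path."""
    # Heavy imports stay local so rewrite_hub_handle can be used on its own.
    import h5py
    import tensorflow as tf
    import tensorflow_hub as hub
    import tensorflow_text  # noqa: F401  (registers ops the multilingual encoder needs)

    with h5py.File(h5_path, "r") as f:
        raw = f.attrs["model_config"]
    if isinstance(raw, bytes):  # some h5py versions return bytes, others str
        raw = raw.decode("utf-8")

    patched = rewrite_hub_handle(raw, old_handle, new_handle)
    model = tf.keras.models.model_from_json(
        patched, custom_objects={"KerasLayer": hub.KerasLayer}
    )
    model.load_weights(h5_path)  # restores weights only; the model is not compiled
    return model
```

A call such as `load_with_new_handle("model.h5", "/opt/ml/code/google-use-multi-large/", new_handle)` would return an uncompiled model that can be used for prediction; it needs `compile()` again before further training.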

akhorlin commented 3 years ago

We are not experts in the SageMaker environment. In the case of Google Colab, there are a couple of workarounds: (1) You can use S3/GCS to store files and assets. For TF Hub, you can set the location where temporary files are stored using TFHUB_CACHE_DIR. (2) Another option is described here
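
A minimal sketch of the first workaround, assuming a local cache directory: set `TFHUB_CACHE_DIR` before the first module is loaded, and pass a URL handle to `hub.KerasLayer` so that the saved config no longer depends on a path that exists only in the training image. The helper names `configure_hub_cache` and `make_embedding_layer` are hypothetical, and the URL is assumed to be the tfhub.dev counterpart of the archive downloaded in the Dockerfile.

```python
import os

# Assumed tfhub.dev counterpart of the tarball fetched in the Dockerfile.
MODULE_URL = "https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3"


def configure_hub_cache(cache_dir):
    """Point TF Hub at a local cache_dir; call this before the first module is loaded."""
    os.makedirs(cache_dir, exist_ok=True)
    os.environ["TFHUB_CACHE_DIR"] = cache_dir
    return cache_dir


def make_embedding_layer(handle=MODULE_URL):
    """Create the embedding layer from a URL handle instead of a local path."""
    # Heavy imports stay local so configure_hub_cache can be used on its own.
    import tensorflow as tf
    import tensorflow_hub as hub
    import tensorflow_text  # noqa: F401  (registers ops the multilingual encoder needs)

    return hub.KerasLayer(
        handle,
        input_shape=[],
        dtype=tf.string,
        trainable=False,
        name="USE_embedding",
    )
```

With a URL handle, any environment with network access can resolve the module when the model is loaded, and the cache directory controls where the download is stored. The `os.makedirs` call only applies to local paths; a remote location such as a GCS bucket would need to be set directly in the environment variable.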