tensorflow / hub

A library for transfer learning by reusing parts of TensorFlow models.
https://tensorflow.org/hub
Apache License 2.0

Bug: How to Load a Custom Model? #857

Closed tessgadwa closed 2 years ago

tessgadwa commented 2 years ago

What happened?

I was able to train and save a custom model using the make_image_classifier tool provided at https://github.com/tensorflow/hub/tree/master/tensorflow_hub/tools/make_image_classifier.

I was then able to load this custom model into TensorFlow Serving running on localhost.

However, I could not find any way to test my custom model with new images. Because I am using a Mac with an M1 chip, deploying to TF Lite is not an option. This tutorial runs fine on my local machine, but swapping in the custom model URL does not work.

It appears that I am missing a critical step. Can you provide any guidance on what it is?

Relevant code

def build_dataset(subset):
  return tf.keras.preprocessing.image_dataset_from_directory(
      data_dir,
      validation_split=.20,
      subset=subset,
      label_mode="categorical",
      # Seed needs to be provided when using validation_split and shuffle = True.
      # A fixed seed is used so that the validation set is stable across runs.
      seed=123,
      image_size=IMAGE_SIZE,
      batch_size=1)

train_ds = build_dataset("training")
class_names = tuple(train_ds.class_names)
train_size = train_ds.cardinality().numpy()
train_ds = train_ds.unbatch().batch(BATCH_SIZE)
train_ds = train_ds.repeat()

normalization_layer = tf.keras.layers.Rescaling(1. / 255)
preprocessing_model = tf.keras.Sequential([normalization_layer])
do_data_augmentation = False #@param {type:"boolean"}
if do_data_augmentation:
  preprocessing_model.add(
      tf.keras.layers.RandomRotation(40))
  preprocessing_model.add(
      tf.keras.layers.RandomTranslation(0, 0.2))
  preprocessing_model.add(
      tf.keras.layers.RandomTranslation(0.2, 0))
  # Like the old tf.keras.preprocessing.image.ImageDataGenerator(),
  # image sizes are fixed when reading, and then a random zoom is applied.
  # If all training inputs are larger than image_size, one could also use
  # RandomCrop with a batch size of 1 and rebatch later.
  preprocessing_model.add(
      tf.keras.layers.RandomZoom(0.2, 0.2))
  preprocessing_model.add(
      tf.keras.layers.RandomFlip(mode="horizontal"))
train_ds = train_ds.map(lambda images, labels:
                        (preprocessing_model(images), labels))

val_ds = build_dataset("validation")
valid_size = val_ds.cardinality().numpy()
val_ds = val_ds.unbatch().batch(BATCH_SIZE)
val_ds = val_ds.map(lambda images, labels:
                    (normalization_layer(images), labels))

---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
Input In [5], in <cell line: 13>()
      1 def build_dataset(subset):
      2   return tf.keras.preprocessing.image_dataset_from_directory(
      3       data_dir,
      4       validation_split=.20,
   (...)
     10       image_size=IMAGE_SIZE,
     11       batch_size=1)
---> 13 train_ds = build_dataset("training")
     14 class_names = tuple(train_ds.class_names)
     15 train_size = train_ds.cardinality().numpy()

Input In [5], in build_dataset(subset)
      1 def build_dataset(subset):
----> 2   return tf.keras.preprocessing.image_dataset_from_directory(
      3       data_dir,
      4       validation_split=.20,
      5       subset=subset,
      6       label_mode="categorical",
      7       # Seed needs to provided when using validation_split and shuffle = True.
      8       # A fixed seed is used so that the validation set is stable across runs.
      9       seed=123,
     10       image_size=IMAGE_SIZE,
     11       batch_size=1)

File /opt/homebrew/Caskroom/miniforge/base/envs/mlp/lib/python3.9/site-packages/keras/utils/image_dataset.py:192, in image_dataset_from_directory(directory, labels, label_mode, class_names, color_mode, batch_size, image_size, shuffle, seed, validation_split, subset, interpolation, follow_links, crop_to_aspect_ratio, **kwargs)
    190 if seed is None:
    191   seed = np.random.randint(1e6)
--> 192 image_paths, labels, class_names = dataset_utils.index_directory(
    193     directory,
    194     labels,
    195     formats=ALLOWLIST_FORMATS,
    196     class_names=class_names,
    197     shuffle=shuffle,
    198     seed=seed,
    199     follow_links=follow_links)
    201 if label_mode == 'binary' and len(class_names) != 2:
    202   raise ValueError(
    203       f'When passing `label_mode="binary"`, there must be exactly 2 '
    204       f'class_names. Received: class_names={class_names}')

File /opt/homebrew/Caskroom/miniforge/base/envs/mlp/lib/python3.9/site-packages/keras/utils/dataset_utils.py:66, in index_directory(directory, labels, formats, class_names, shuffle, seed, follow_links)
     64 else:
     65   subdirs = []
---> 66   for subdir in sorted(tf.io.gfile.listdir(directory)):
     67     if tf.io.gfile.isdir(tf.io.gfile.join(directory, subdir)):
     68       if subdir.endswith('/'):

File /opt/homebrew/Caskroom/miniforge/base/envs/mlp/lib/python3.9/site-packages/tensorflow/python/lib/io/file_io.py:766, in list_directory_v2(path)
    751 """Returns a list of entries contained within a directory.
    752 
    753 The list is in arbitrary order. It does not contain the special entries "."
   (...)
    763   errors.NotFoundError if directory doesn't exist
    764 """
    765 if not is_directory(path):
--> 766   raise errors.NotFoundError(
    767       node_def=None,
    768       op=None,
    769       message="Could not find directory {}".format(path))
    771 # Convert each element to string, since the return values of the
    772 # vector of string should be interpreted as strings, not bytes.
    773 return [
    774     compat.as_str_any(filename)
    775     for filename in _pywrap_file_io.GetChildren(compat.path_to_bytes(path))
    776 ]

NotFoundError: Could not find directory /Users/tessgadwa/.keras/datasets/banner-images

Relevant log output

TensorFlow Serving Output:

tessgadwa@Tesss-MacBook-Air tensorflow-serving-arm % docker run -t --rm -p 8501:8501 --mount type=bind,source=/Users/tessgadwa/Dev/tensorflow-serving-arm/banner_model/,target=/models/banner_model/ -e MODEL_NAME=banner_model emacski/tensorflow-serving:latest-linux_arm64
2022-07-15 02:13:43.017737: I external/tf_serving/tensorflow_serving/model_servers/server.cc:89] Building single TensorFlow model file config:  model_name: banner_model model_base_path: /models/banner_model
2022-07-15 02:13:43.017894: I external/tf_serving/tensorflow_serving/model_servers/server_core.cc:465] Adding/updating models.
2022-07-15 02:13:43.017914: I external/tf_serving/tensorflow_serving/model_servers/server_core.cc:591]  (Re-)adding model: banner_model
2022-07-15 02:13:43.130409: I external/tf_serving/tensorflow_serving/core/basic_manager.cc:740] Successfully reserved resources to load servable {name: banner_model version: 1}
2022-07-15 02:13:43.130433: I external/tf_serving/tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: banner_model version: 1}
2022-07-15 02:13:43.130441: I external/tf_serving/tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: banner_model version: 1}
2022-07-15 02:13:43.131187: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:38] Reading SavedModel from: /models/banner_model/1
2022-07-15 02:13:43.187445: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:90] Reading meta graph with tags { serve }
2022-07-15 02:13:43.187479: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:132] Reading SavedModel debug info (if present) from: /models/banner_model/1
2022-07-15 02:13:43.190141: I external/org_tensorflow/tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2022-07-15 02:13:43.248704: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:211] Restoring SavedModel bundle.
2022-07-15 02:13:43.253234: W external/org_tensorflow/tensorflow/core/platform/profile_utils/cpu_utils.cc:87] Failed to get CPU frequency: -1
2022-07-15 02:13:43.675404: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:195] Running initialization op on SavedModel bundle at path: /models/banner_model/1
2022-07-15 02:13:43.705190: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:283] SavedModel load for tags { serve }; Status: success: OK. Took 574001 microseconds.
2022-07-15 02:13:43.709474: I external/tf_serving/tensorflow_serving/servables/tensorflow/saved_model_warmup_util.cc:59] No warmup data file found at /models/banner_model/1/assets.extra/tf_serving_warmup_requests
2022-07-15 02:13:43.713761: I external/tf_serving/tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: banner_model version: 1}
2022-07-15 02:13:43.714628: I external/tf_serving/tensorflow_serving/model_servers/server_core.cc:486] Finished adding/updating models
2022-07-15 02:13:43.714707: I external/tf_serving/tensorflow_serving/model_servers/server.cc:133] Using InsecureServerCredentials
2022-07-15 02:13:43.714740: I external/tf_serving/tensorflow_serving/model_servers/server.cc:383] Profiler service is enabled
2022-07-15 02:13:43.716046: I external/tf_serving/tensorflow_serving/model_servers/server.cc:409] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2022-07-15 02:13:43.718509: I external/tf_serving/tensorflow_serving/model_servers/server.cc:430] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 245] NET_LOG: Entering the event loop ...
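
The log above shows the REST API exported on port 8501 for the model named banner_model. One way to test the served model with new images is TensorFlow Serving's REST predict endpoint. The following is only a minimal sketch of building such a request with the Python standard library. The helper name build_predict_request, the dummy 2x2 image, and the assumption that the model expects float pixel values scaled to [0, 1] in a [batch, height, width, 3] layout are illustrative and not confirmed by this thread; the real input size and scaling should be checked against the exported model's signature.

```python
import json
import urllib.request

# Host, port, and model name are taken from the docker command and log above.
SERVER = "http://localhost:8501"
MODEL_NAME = "banner_model"


def build_predict_request(images):
    """Build (but do not send) a TensorFlow Serving REST predict request.

    `images` is a nested list shaped [batch, height, width, 3]. The values are
    ASSUMED to be floats already scaled to [0, 1]; verify this against the
    signature of the exported model.
    """
    url = f"{SERVER}/v1/models/{MODEL_NAME}:predict"
    body = json.dumps({"instances": images}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical 2x2 gray image standing in for a real, resized photo.
dummy_batch = [[[[0.5, 0.5, 0.5]] * 2] * 2]
request = build_predict_request(dummy_batch)

# To actually query the running server, uncomment:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())["predictions"]
```

The request is constructed separately from sending it so the URL and payload can be inspected before the server is involved.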

tensorflow_hub Version

0.12.0 (latest stable release)

TensorFlow Version

other (please specify)

Other libraries

I am running TF version: 2.9.2 [Drop-down above would not let me add the correct version.]

I am also using tensorflow-macos and tensorflow-metal to provide support for Apple's ARM64 M1 chip.

I converted my most recent Jupyter notebook (based on the flowers classification example at https://www.tensorflow.org/tutorials/images/classification) to a Markdown (.md) document and attached the file, in case it is helpful.

tf2_image_retraining.md

Python Version

3.x

OS

macOS

akhorlin commented 2 years ago

The error "Could not find directory /Users/tessgadwa/.keras/datasets/banner-images" is raised by the code that tries to list the images: image_paths, labels, class_names = dataset_utils.index_directory(...). It seems there is an issue with the input path for the script: the directory passed as data_dir to image_dataset_from_directory does not exist.
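
A quick way to catch this kind of problem before TensorFlow is involved is to validate the directory first. The sketch below is illustrative only: check_image_root is a hypothetical helper, not part of TensorFlow or Keras, and it assumes the usual layout of one subdirectory per class under data_dir.

```python
from pathlib import Path


def check_image_root(data_dir):
    """Return the sorted class subdirectory names found under data_dir.

    Raises FileNotFoundError if data_dir does not exist, and ValueError if it
    contains no subdirectories (assumed layout: one subdirectory per class).
    """
    root = Path(data_dir).expanduser()
    if not root.is_dir():
        raise FileNotFoundError(f"Image directory not found: {root}")
    classes = sorted(p.name for p in root.iterdir() if p.is_dir())
    if not classes:
        raise ValueError(f"No class subdirectories under {root}")
    return classes


# Example usage with the path from the traceback (raises if it is missing):
# class_names = check_image_root("/Users/tessgadwa/.keras/datasets/banner-images")
```

Calling this on data_dir before build_dataset("training") turns the deep Keras traceback into a one-line error that names the missing path.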

tessgadwa commented 2 years ago

Thank you for pointing me in the right direction.