tensorflow / tensorboard

TensorFlow's Visualization Toolkit
Apache License 2.0

example.tfrecords not displaying on What-If dashboard #1570

Open VinayTeki opened 5 years ago

VinayTeki commented 5 years ago

I've converted a Keras model into a TensorFlow SavedModel using saved_model_builder.SavedModelBuilder(export_path) so that I can use it with the What-If Tool.

I've started a TensorFlow Serving Docker container with:

docker run -p 8500:8500 --mount type=bind,source=/var/www/whatif_testing/model_name,target=/models/model_name -e MODEL_NAME=model_name -t tensorflow/serving

The JPEG image data is converted into a serialized TFRecords file using the code below.

import os
import sys

import cv2
import numpy as np
import tensorflow as tf

def convert_to_record(train_addrs, train_labels, destination, keyword="train"):
    # Path of the TFRecords file to write
    train_filename = os.path.join(destination, keyword+'.tfrecords')
    # Open the TFRecords file
    writer = tf.python_io.TFRecordWriter(train_filename)
    for i in range(len(train_addrs)):
        # Print progress every 10 images
        if not i % 10:
            sys.stdout.write(
                '{} data: {}/{}\r'.format(keyword, i, len(train_addrs)))
            sys.stdout.flush()
        # Load the image; `shape` is not defined in this snippet and is
        # presumably a (width, height) tuple set elsewhere
        img = cv2.imread(train_addrs[i])
        img = cv2.resize(img, shape, interpolation=cv2.INTER_CUBIC)
        # Note: dividing by 255.0 yields a float64 array, not uint8
        img = img.astype(np.uint8)/255.0

        label = train_labels[i]
        # Create a feature: the label as int64, the image as raw array bytes
        feature = {keyword+'/label': tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
                   keyword+'/image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[tf.compat.as_bytes(img.tobytes())])) }
        # Create an example protocol buffer
        example = tf.train.Example(features=tf.train.Features(feature=feature))

        # Serialize to string and write to the file
        writer.write(example.SerializeToString())

    writer.close()
    sys.stdout.flush()

This leads to an error from the model server:

status = StatusCode.INVALID_ARGUMENT
    details = "Expects arg[0] to be double but string is provided"
    debug_error_string = "{"created":"@1541100531.185075510",
       "description":"Error received from peer",
       "file":"src/core/lib/surface/call.cc",
        "file_line":1017,
       "grpc_message":"Expects arg[0] to be double but string is provided",
       "grpc_status":3}"

I'm trying to understand the correct format for the input image, the input label, and the model's signature_def_map.

jameswex commented 5 years ago

@VinayTeki are you able to share the converted saved model and tfrecords file with me for debugging, or is the model/data private?

One note: for the What-If Tool to recognize a bytes feature as an image that it should display, the feature must be named "image/encoded", as per the documentation at https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/interactive_inference/README.md. But this won't cause the failure you are seeing; it would just lead to the What-If Tool not displaying the image feature as an image.

VinayTeki commented 5 years ago

Sure @jameswex, here is how I converted the Keras model into a TensorFlow model. I'm guessing there is some issue with the prediction signature.

# Assumed import: the original snippet does not show where model_from_json comes from
from keras.models import model_from_json

def build_model():
    """Function returning keras model instance.

    Model can be
     - Trained here
     - Loaded with load_model
     - Loaded from keras.applications
    """

    # Load the model architecture from JSON
    with open('model.json', 'r') as json_file:
        loaded_model_json = json_file.read()
    loaded_model = model_from_json(loaded_model_json)
    # print(loaded_model.summary())
    # Load the trained weights into the model
    loaded_model.load_weights(
        "model_weights.h5")
    print("Loaded model from disk")

    return loaded_model

Resnet50model = build_model()

[master_test_tf_records.zip](https://github.com/tensorflow/tensorboard/files/2539941/master_test_tf_records.zip)

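import tensorflow as tf
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import tag_constants, signature_constants, signature_def_utils_impl

# `sess` and `export_path` are not defined in this snippet; they are assumed to be set up elsewhere.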

prediction_signature = tf.saved_model.signature_def_utils.predict_signature_def(
    {"image": Resnet50model.input}, {"prediction": Resnet50model.output})

builder = saved_model_builder.SavedModelBuilder(export_path)
legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')

init_op = tf.group(tf.global_variables_initializer(),
                   tf.local_variables_initializer())
sess.run(init_op)

builder.add_meta_graph_and_variables(
    sess, [tag_constants.SERVING],
    signature_def_map={
        signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
        prediction_signature,
    },
    legacy_init_op=legacy_init_op)
# save the graph
builder.save()

By the way, changing the feature name to "image/encoded" changes the error message to:

InvalidArgumentError (see above for traceback): Expected image (JPEG, PNG, or GIF), got unknown format starting with '\334\333\333\333\333\333\353?\234\233\233\233\233\233\353?'
     [[Node: while_11/DecodeJpeg = DecodeJpeg[acceptable_fraction=1, channels=3, dct_method="", fancy_upscaling=true, ratio=1, try_recover_truncated=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](while_11/strided_slice)]]
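
The leading bytes quoted in that message can be inspected directly. The sketch below uses only the Python standard library: it checks the bytes against the JPEG, PNG, and GIF file signatures, then reinterprets them as little-endian float64 values. The helper name `sniff_image_format` is hypothetical, and the float64 interpretation is an assumption suggested by the `/255.0` scaling in the conversion code earlier in the thread.

```python
import struct

# First 16 bytes quoted in the DecodeJpeg error (octal escapes copied verbatim).
raw = b"\334\333\333\333\333\333\353?\234\233\233\233\233\233\353?"


def sniff_image_format(data):
    """Return 'jpeg', 'png', 'gif', or None based on well-known file signatures."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    return None


print(sniff_image_format(raw))  # None

# Assumption: the bytes are raw little-endian float64 pixel values in [0, 1].
pixels = struct.unpack("<2d", raw)
print([round(p * 255) for p in pixels])  # [222, 220]
```

Values that map back to whole 0-255 intensities suggest that the feature holds scaled float64 pixels rather than an encoded image, which would be consistent with DecodeJpeg rejecting it.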

Also, I've changed the way images are loaded before converting them into TensorFlow records. This made the images visible on the WIT dashboard, but there is a new error that shows up when I select a particular image:

    status = StatusCode.INVALID_ARGUMENT
    details = "input tensor alias not found in signature: image/encoded. Inputs expected to be in the set {image}."
    debug_error_string = "{"created":"@1541105947.000411456","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"input tensor alias not found in signature: image/encoded. Inputs expected to be in the set {image}.","grpc_status":3}"

jameswex commented 5 years ago

The tfrecords file in the zip contains tf.Examples with no empty feature lists, so perhaps you sent me the wrong tfrecords file. I'm glad that, with your loading and feature name changes, the images are now displaying in WIT. Would it be possible for you to zip up both the updated tfrecords file and the TF saved model directory output by your conversion code?

I think you are right that the issue is with the model's serving signature. WIT requires the model to use the Classification, Regression, or Predict API, as per the documentation at https://github.com/tensorflow/tensorboard/tree/master/tensorboard/plugins/interactive_inference#what-do-i-need-to-use-it.

The input to the model must be a serialized tf.Example proto, which the model then deserializes to read individual features (this is how the TF Classification and Regression APIs work). It looks like your model instead expects the input to be the image contained in one of the example's features. So I think that, in order to use WIT on this model converted from Keras, the signature def will need to take in the serialized tf.Example, and the TF graph will need an operation that decodes it and passes the image feature to Resnet50model.input.

This is the first time I've seen someone try to use WIT with a model converted from keras to tf saved model, so we might have to work through the correct way to do this, but we'll make sure to document whatever needs to be done so it is simpler going forward.

I'll check with the tensorflow-serving folks to see if they have any input on converting a keras model to a TF saved model in a way that creates a model that takes in serialized tf.Examples as input.

jameswex commented 5 years ago

One thing to check out is https://www.tensorflow.org/versions/r1.12/api_docs/python/tf/contrib/saved_model/save_keras_model for saving your keras model as a saved model.

Then you can dump the input/output signatures via the saved_model_cli tool to check whether your model has a signature that accepts serialized examples as input.
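
As a rough sketch of that check, the Python snippet below wraps the `saved_model_cli show --dir ... --all` command. The function names and the `/models/model_name` path are illustrative assumptions, and the wrapper only runs the tool if it is found on PATH.

```python
import shutil
import subprocess


def saved_model_show_cmd(export_dir):
    """Build the argv for `saved_model_cli show --dir <export_dir> --all`."""
    return ["saved_model_cli", "show", "--dir", export_dir, "--all"]


def dump_signatures(export_dir):
    """Return the CLI's text output, or None if saved_model_cli is not installed."""
    if shutil.which("saved_model_cli") is None:
        return None
    result = subprocess.run(
        saved_model_show_cmd(export_dir),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Hypothetical export directory, mirroring the mount target used for serving.
print(" ".join(saved_model_show_cmd("/models/model_name")))
```

In the printed signatures, an input that accepts serialized examples would typically be listed with dtype DT_STRING, whereas an input that expects the image tensor itself would be listed with a float dtype.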