aws-neuron / aws-neuron-sdk

Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services
https://aws.amazon.com/machine-learning/neuron/

How to use a custom signature with inf1 trace? #821

Closed mostafafarzaneh closed 8 months ago

mostafafarzaneh commented 8 months ago

I'm trying to compile my model for inf1. The model input shape is (1920, 1920, 3). Here is my code:

import os
import numpy as np
import tensorflow.neuron as tfn
import cv2

import syspath
import argparse
parser = argparse.ArgumentParser(description='Export model for inference')
parser.add_argument("--input", type=str, help='path to model hdf5 file')
parser.add_argument("--output", type=str, help='export path')
args = parser.parse_args()

import tensorflow as tf
from tensorflow_addons.layers import (
    GroupNormalization,
    InstanceNormalization,
)

def load_image(image_path):
    image = cv2.imread(image_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert BGR to RGB
    image_array = image.astype(np.float32)  # Convert to float32
    image_array /= 255.0
    image_array = np.expand_dims(image_array, axis=0)
    example_input = tf.constant(image_array)
    return example_input

model = tf.keras.models.load_model(args.input, custom_objects={
    'tversky_loss': None,
    'loss_fn': None,
    'ssim': None
})

def __decode_images(images, nch):
    o = tf.vectorized_map(lambda x: tf.image.decode_jpeg(x, nch), images)
    o = tf.cast(o, dtype=tf.float16) / 255
    o = tf.reverse(o, axis=[-1])  # RGB2BGR
    return o

def __encode_images(images):
    images = tf.image.convert_image_dtype(images, tf.uint8, saturate=True)
    o = tf.vectorized_map(tf.image.encode_jpeg, images)
    return o

@tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string, name='image')])
def serving(img):
    img = __decode_images(img, 3)
    o = model(img, training=False)
    o = 1.0 - o
    o = __encode_images(o)
    return {
        'output': o
    }

with open('we_01.jpg', 'rb') as file:
    # Read the binary content of the file
    sample_input = file.read()

model_neuron = tfn.trace(serving, tf.constant([sample_input]))
model_neuron.save(args.output)

But I got the following error:

ValueError: Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor.

mostafafarzaneh commented 8 months ago

I can compile my model and then apply the signature. Here is the code:

compile:

model = tf.keras.models.load_model(args.input, custom_objects={
    'tversky_loss': None,
    'loss_fn': None,
    'ssim': None
})

sample_input = load_image("we_01.jpg")
model_neuron = tfn.trace(model, sample_input)
model_neuron.save(args.output)

applying signature after compile:


import tensorflow as tf
model = tf.keras.models.load_model(args.input, custom_objects={
    'tversky_loss': None,
    'loss_fn': None,
    'ssim': None
})

def __decode_images(images, nch):
    o = tf.vectorized_map(lambda x: tf.image.decode_jpeg(x, nch), images)
    o = tf.cast(o, dtype=tf.float16) / 255
    o = tf.reverse(o, axis=[-1])  # RGB2BGR
    return o

def __encode_images(images):
    images = tf.image.convert_image_dtype(images, tf.uint8, saturate=True)
    o = tf.vectorized_map(tf.image.encode_jpeg, images)
    return o

@tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string, name='image')])
def serving(img):
    img = __decode_images(img, 3)
    o = model(img, training=False)
    o = 1.0 - o
    o = __encode_images(o)
    return {
        'output': o
    }

tf.saved_model.save(model, export_dir=args.output, signatures=serving)

The problem is that I'm not getting any performance improvement compared to the CPU. I suspect that's because the JPEG encoding and decoding are not compiled for Inf1.

mostafafarzaneh commented 8 months ago

Currently, I'm using Elastic Inference (eia2.large), and the average inference time is 0.2 seconds. With inf1.xlarge, the average inference time is 0.5 seconds. I get the same result (0.5 seconds) on a 6-core CPU.

awsilya commented 8 months ago

@mostafafarzaneh Yes, JPEG encoding/decoding cannot run on inf1. If it accounts for a significant share of the overall execution time, you will not benefit from running on inf1.
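One rough way to check whether this applies is to time the JPEG stages and the model call separately, then estimate the best-case end-to-end gain with Amdahl's law. The sketch below is illustrative only: `time_stage` and `overall_speedup` are hypothetical helper names, and the timings and the 4x model speedup in the usage lines are made-up numbers, not measurements from this issue.

```python
import time


def time_stage(fn, *args, repeats=10):
    """Return the mean wall-clock seconds of fn(*args) over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats


def overall_speedup(accelerated_fraction, stage_speedup):
    """Amdahl's law: end-to-end speedup when only part of the work gets faster.

    accelerated_fraction: share of total time spent in the part that speeds up.
    stage_speedup: how many times faster that part becomes.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if stage_speedup <= 0:
        raise ValueError("stage_speedup must be positive")
    return 1.0 / ((1.0 - accelerated_fraction)
                  + accelerated_fraction / stage_speedup)


# Hypothetical numbers: 0.3 s in JPEG decode/encode (stays on the CPU),
# 0.2 s in the model (the only part an accelerator could speed up).
codec_seconds, model_seconds = 0.3, 0.2
model_fraction = model_seconds / (codec_seconds + model_seconds)
estimate = overall_speedup(model_fraction, stage_speedup=4.0)
print(f"best-case end-to-end speedup: {estimate:.2f}x")
```

In practice `time_stage` would be pointed at the real decode, model, and encode callables; the smaller the model's share of the total, the less any accelerator can help.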

mostafafarzaneh commented 8 months ago

@awsilya I ran another test without JPEG encoding/decoding, sending and receiving the prediction request as a raw image. I still cannot see any improvement over the CPU. Currently, with my model, Elastic Inference does better.
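For a comparison like this, it may help to discard the first few requests and look at percentiles as well as the mean, since initial calls often include one-time setup cost. The following is a minimal sketch under that assumption; `benchmark` and the stand-in `predict` callable are hypothetical and would be replaced by the real CPU, Elastic Inference, or inf1 prediction call.

```python
import statistics
import time


def benchmark(predict, sample, warmup=5, runs=50):
    """Time predict(sample) after a warm-up phase; return latency stats in seconds."""
    # Warm-up calls are executed but not measured.
    for _ in range(warmup):
        predict(sample)

    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        latencies.append(time.perf_counter() - start)

    latencies.sort()
    p99_index = min(len(latencies) - 1, int(0.99 * len(latencies)))
    return {
        "mean": statistics.mean(latencies),
        "p50": statistics.median(latencies),
        "p99": latencies[p99_index],
    }


# Stand-in for a real prediction call (hypothetical; does no inference).
def predict(sample):
    return sum(sample)


stats = benchmark(predict, list(range(1000)), warmup=2, runs=20)
print({name: f"{value * 1e6:.1f} us" for name, value in stats.items()})
```

Running the same helper against each backend with the same input gives numbers that are easier to compare than a single average.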

Another question: is there any way to compile the custom signature directly, instead of compiling the model and then applying the signature?

mostafafarzaneh commented 8 months ago

@awsilya I know the issue has been closed, but I would much appreciate it if you could answer my question.