Add option to optimize tf.math.reduce_prod for Myriad (OAK)

PINTO0309 commented 2 years ago

Issue Type

Feature Request

OS

Other

OS architecture

Other

Programming Language

Other

Framework

OpenVINO, Myriad Inference Engine

Download URL for tflite file

None

Convert Script

None

Description

Replace tf.math.reduce_prod with tf.math.multiply so that the model can be converted for Myriad (OAK). Reference: https://www.tensorflow.org/api_docs/python/tf/math/reduce_prod

https://github.com/PINTO0309/PINTO_model_zoo/tree/main/282_face_landmark_with_attention

https://github.com/iwatake2222/play_with_tflite/tree/master/pj_tflite_face_landmark_with_attention

Relevant Log Output

None

Source code for simple inference testing code

No response

PINTO0309 commented 2 years ago

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
import tensorflow as tf
import numpy as np
from pprint import pprint
np.random.seed(0)

dummy_input = np.arange(1*16*16*2, dtype=np.int32).reshape([1,16,16,2])
pprint(dummy_input)

# Create a model
i = tf.keras.layers.Input(
    shape=[
        dummy_input.shape[1],
        dummy_input.shape[2],
        dummy_input.shape[3],
    ],
    batch_size=dummy_input.shape[0],
    dtype=tf.int32,
)

o = tf.math.reduce_prod(input_tensor=i, axis=3, keepdims=True)

model = tf.keras.models.Model(inputs=[i], outputs=[o])
model.summary()
output_path = 'saved_model'
tf.saved_model.save(model, output_path)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS
]
tflite_model = converter.convert()
with open(f"{output_path}/test.tflite", "wb") as f:
    f.write(tflite_model)

# Run the converted model (int32 input and output) with the TFLite interpreter
interpreter = tf.lite.Interpreter(model_path=f"{output_path}/test.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.set_tensor(input_details[0]['index'], dummy_input)
interpreter.invoke()
ret = interpreter.get_tensor(output_details[0]['index'])
print('@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ Int32')
print(ret.shape)
pprint(ret)

input [1,16,16,2]

array([[[[  0,   1],
     [  2,   3],
     [  4,   5],
     [  6,   7],
     [  8,   9],
     [ 10,  11],
     [ 12,  13],
     [ 14,  15],
     [ 16,  17],
     [ 18,  19],
     [ 20,  21],
     [ 22,  23],
     [ 24,  25],
     [ 26,  27],
     [ 28,  29],
     [ 30,  31]],
:
    [[480, 481],
     [482, 483],
     [484, 485],
     [486, 487],
     [488, 489],
     [490, 491],
     [492, 493],
     [494, 495],
     [496, 497],
     [498, 499],
     [500, 501],
     [502, 503],
     [504, 505],
     [506, 507],
     [508, 509],
     [510, 511]]]], dtype=int32)

output [1, 16, 16, 1]

array([[[[     0],
     [     6],
     [    20],
     [    42],
     [    72],
     [   110],
     [   156],
     [   210],
     [   272],
     [   342],
     [   420],
     [   506],
     [   600],
     [   702],
     [   812],
     [   930]],
:
    [[230880],
     [232806],
     [234740],
     [236682],
     [238632],
     [240590],
     [242556],
     [244530],
     [246512],
     [248502],
     [250500],
     [252506],
     [254520],
     [256542],
     [258572],
     [260610]]]], dtype=int32)
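As a quick cross-check that does not need TensorFlow, the expected values can be reproduced with NumPy alone. This is only a sketch: `np.prod` stands in for `tf.math.reduce_prod` here, and the variable name `expected` is introduced just for illustration.

```python
import numpy as np

# Same dummy input as in the script above: values 0..511, shape [1, 16, 16, 2].
dummy_input = np.arange(1 * 16 * 16 * 2, dtype=np.int32).reshape([1, 16, 16, 2])

# Reference result: product over the last axis, keeping that axis with size 1.
# np.prod is used here as a stand-in for tf.math.reduce_prod (assumption).
expected = np.prod(dummy_input, axis=3, keepdims=True)

print(expected.shape)          # (1, 16, 16, 1)
print(expected[0, 0, :3, 0])   # first three products: 0*1, 2*3, 4*5
print(expected[0, -1, -1, 0])  # last product: 510*511
```

Each output element is simply the product of the two channel values at that position, which is why a single Multiply is enough for this particular shape.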

KenjiAsaba commented 2 years ago

This is not a generalised solution, but a simple replacement of a part of the TransformTensorBilinear function I implemented for the face_landmark_with_attention model. https://github.com/KenjiAsaba/tflite2tensorflow/commit/9519eae1946907bc9ea168ccc97fe297d4604877

PINTO0309 commented 2 years ago

@KenjiAsaba I see. :smile: You are smart.

That is indeed a non-generalizable modification, but I will try your suggestion, since the immediate goal is OAK support for the face attention model. :+1:

PINTO0309 commented 2 years ago

LGTM.

Hmm. What remains open is how to repeat Multiply along a given axis in the general case. For example, with axis=2 the Multiply would have to be repeated across all 16 slices of that axis. This was the method I came up with last night, but I was worried that it would require a lot of multiplications.

In any case, I will apply the changes to face_attention, regenerate ONNX and attempt to convert to OAK. :smile_cat:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
import tensorflow as tf
import numpy as np
from pprint import pprint
np.random.seed(0)

dummy_input = np.arange(1*16*16*2, dtype=np.int64).reshape([1,16,16,2])
pprint(dummy_input)

# Create a model
i = tf.keras.layers.Input(
    shape=[
        dummy_input.shape[1],
        dummy_input.shape[2],
        dummy_input.shape[3],
    ],
    batch_size=dummy_input.shape[0],
    dtype=tf.int64,
)

# o = tf.math.reduce_prod(input_tensor=i, axis=3, keepdims=True)
o = tf.math.multiply(i[:,:,:,0:1], i[:,:,:,1:2])

model = tf.keras.models.Model(inputs=[i], outputs=[o])
model.summary()
output_path = 'saved_model'
tf.saved_model.save(model, output_path)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS
]
tflite_model = converter.convert()
with open(f"{output_path}/test.tflite", "wb") as f:
    f.write(tflite_model)

# Run the converted model (int64 input and output) with the TFLite interpreter
interpreter = tf.lite.Interpreter(model_path=f"{output_path}/test.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.set_tensor(input_details[0]['index'], dummy_input)
interpreter.invoke()
ret = interpreter.get_tensor(output_details[0]['index'])
print(ret.shape)
pprint(ret)

input [1,16,16,2]

array([[[[  0,   1],
     [  2,   3],
     [  4,   5],
     [  6,   7],
     [  8,   9],
     [ 10,  11],
     [ 12,  13],
     [ 14,  15],
     [ 16,  17],
     [ 18,  19],
     [ 20,  21],
     [ 22,  23],
     [ 24,  25],
     [ 26,  27],
     [ 28,  29],
     [ 30,  31]],
:
    [[480, 481],
     [482, 483],
     [484, 485],
     [486, 487],
     [488, 489],
     [490, 491],
     [492, 493],
     [494, 495],
     [496, 497],
     [498, 499],
     [500, 501],
     [502, 503],
     [504, 505],
     [506, 507],
     [508, 509],
     [510, 511]]]])

output [1,16,16,1]

array([[[[     0],
     [     6],
     [    20],
     [    42],
     [    72],
     [   110],
     [   156],
     [   210],
     [   272],
     [   342],
     [   420],
     [   506],
     [   600],
     [   702],
     [   812],
     [   930]],
:
    [[230880],
     [232806],
     [234740],
     [236682],
     [238632],
     [240590],
     [242556],
     [244530],
     [246512],
     [248502],
     [250500],
     [252506],
     [254520],
     [256542],
     [258572],
     [260610]]]])
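On the question raised above about repeating Multiply along an arbitrary axis, a generalised version might look roughly like the sketch below. It is written with NumPy so it can be checked without a converter, and the helper name `reduce_prod_by_multiply` is hypothetical; it is not part of tflite2tensorflow. The assumption is that the same slice-and-multiply pattern carries over to `tf.math.multiply` with tensor slicing, which is not verified here.

```python
import numpy as np


def reduce_prod_by_multiply(x, axis, keepdims=True):
    """Product along `axis` built only from slicing and elementwise multiply.

    Hypothetical helper for illustration; not the tflite2tensorflow code.
    """
    axis = axis % x.ndim  # allow negative axes

    def take(k):
        # Slice of width 1 at position k along `axis`, e.g. x[:, :, :, k:k+1].
        index = [slice(None)] * x.ndim
        index[axis] = slice(k, k + 1)
        return x[tuple(index)]

    result = take(0)
    # An axis of length n needs n - 1 elementwise multiplies.
    for k in range(1, x.shape[axis]):
        result = np.multiply(result, take(k))
    return result if keepdims else np.squeeze(result, axis=axis)


# Case from this thread: axis=3 has length 2, so a single multiply is used.
x = np.arange(1 * 16 * 16 * 2, dtype=np.int64).reshape([1, 16, 16, 2])
out_axis3 = reduce_prod_by_multiply(x, axis=3)

# Smaller tensor for a longer reduction axis (axis=2 has length 3 here).
small = np.arange(1, 25, dtype=np.int64).reshape([1, 2, 3, 4])
out_axis2 = reduce_prod_by_multiply(small, axis=2)

print(out_axis3.shape, out_axis2.shape)
```

For the [1,16,16,2] input with axis=3 this collapses to the single Multiply used in the replacement above. For an axis of length 16 it would chain 15 multiplies, so the cost concern is real, and whether that is acceptable on Myriad would need to be measured.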

PINTO0309 commented 2 years ago

@KenjiAsaba You are a genius. It is now also executable with the OpenCV AI Kit (OAK).

python3 tflite2tensorflow.py \
--model_path face_landmark_with_attention.tflite \
--flatc_path ../flatc \
--schema_path ../schema.fbs \
--output_pb \
--optimizing_barracuda

python3 tflite2tensorflow.py \
--model_path face_landmark_with_attention.tflite \
--flatc_path ../flatc \
--schema_path ../schema.fbs \
--output_onnx \
--onnx_opset 11

$INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/mo.py \
--input_model saved_model/model_float32.onnx \
--data_type FP32 \
--output_dir saved_model/openvino/FP32 \
--model_name face_landmark_with_attention_192x192

$INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/mo.py \
--input_model saved_model/model_float32.onnx \
--data_type FP16 \
--output_dir saved_model/openvino/FP16 \
--model_name face_landmark_with_attention_192x192

mkdir -p saved_model/openvino/myriad
${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/lib/intel64/myriad_compile \
-m saved_model/openvino/FP16/face_landmark_with_attention_192x192.xml \
-ip U8 \
-VPU_NUMBER_OF_SHAVES 4 \
-VPU_NUMBER_OF_CMX_SLICES 4 \
-o saved_model/openvino/myriad/face_landmark_with_attention_192x192.blob

Inference Engine: 
    IE version ......... 2021.4.0
    Build ........... 2021.4.0-3839-cd81789d294-releases/2021/4
[Warning][VPU][Config] Deprecated option was used : VPU_MYRIAD_PLATFORM
Done

PINTO0309 commented 2 years ago

Released: v1.20.8 https://github.com/PINTO0309/tflite2tensorflow/releases/tag/v1.20.8

@KenjiAsaba Thank you so much for your cooperation!

KenjiAsaba commented 2 years ago

It's my pleasure!
