Has anyone tried to convert a trained .h5 model to TensorFlow Lite or TensorFlow Serving?

marco8chong commented 5 years ago

I ran into some difficulties because of Python functions such as 'yolo_loss' and 'yolo_head'.

BenSchl commented 5 years ago

Over the last few days I tried to convert a tiny model to TensorFlow Lite.

In order to overcome the issue with 'yolo_loss' and 'yolo_head', I used the model created by tiny_yolo_body (and loaded my weights).

Because of other issues, the input size had to be fixed (in my case to (416, 416, 3)).

Then another error occurred and forced me to install tf-nightly (without GPU) in a different virtual environment. Now I'm stuck with the error F tensorflow/lite/toco/graph_transformations/propagate_fixed_sizes.cc:117] Check failed: dim_x == dim_y (16 vs. 32)Dimensions must match. I have no idea how to fix it.

marco8chong commented 5 years ago

I tried to create the model like this:

yolo_model = YOLO(**vars(FLAGS)).yolo_model
yolo_model.summary()

And convert the model using the following code:

import os
import tensorflow as tf

def save_model_for_production(model, version, path='prod_models'):
    tf.keras.backend.set_learning_phase(1)
    if not os.path.exists(path):
        os.mkdir(path)
    export_path = os.path.join(
        tf.compat.as_bytes(path),
        tf.compat.as_bytes(version))
    builder = tf.saved_model.builder.SavedModelBuilder(export_path)

    model_input = tf.saved_model.utils.build_tensor_info(model.input)
    model_output = tf.saved_model.utils.build_tensor_info(model.output)

    prediction_signature = (
        tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'inputs': model_input},
            outputs={'output': model_output},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

    with tf.keras.backend.get_session() as sess:
        builder.add_meta_graph_and_variables(
            sess=sess, tags=[tf.saved_model.tag_constants.SERVING],
            signature_def_map={
                'predict':
                    prediction_signature,
            })

        builder.save()

However, I got the following error:

model_output = tf.saved_model.utils.build_tensor_info(model.output)

Exception has occurred: AttributeError
'list' object has no attribute 'dtype'
  File "D:\tensorflow\keras-yolo3-master\model_conversion.py", line 22, in save_model_for_production
    model_output = tf.saved_model.utils.build_tensor_info(model.output)
  File "D:\tensorflow\keras-yolo3-master\model_conversion.py", line 90, in
    save_model_for_production(yolo_model, "1", export_path)

BenSchl commented 5 years ago

The error is a result of the line model_output = tf.saved_model.utils.build_tensor_info(model.output). The YOLO models have multiple outputs and you need to handle each output separately (and adjust the signature ('output1': model_output[0], 'output2': model_output[1], ...)). I tried the modified code and faced other errors which I'm not able to fix (tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value conv2d_9/kernel [[node conv2d_9/kernel (defined at C:\venv4\lib\site-packages\keras\backend\tensorflow_backend.py:402) ]] [[batch_normalization_9/beta/_161]]).

Update:

If the model is compiled, the error changes to tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value batch_normalization_10/beta [[{{node batch_normalization_10/beta}}]] [[conv2d_6/kernel/_209]].

(Note: I'm using the tiny model).
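
The per-output handling suggested in this comment can be sketched in plain Python. This is only an illustration: build_output_map and its to_info parameter are hypothetical names. In the TF 1.x script, to_info would be tf.saved_model.utils.build_tensor_info and outputs would be model.output. Because the helper loops over whatever outputs exist, the same code would cover the tiny model (two outputs) and the full model (three).

```python
def build_output_map(outputs, to_info, prefix="output"):
    """Return {'output1': info1, 'output2': info2, ...} for a model's outputs.

    `outputs` may be a single tensor or a list of tensors; Keras returns a
    list for multi-output models such as YOLOv3.
    """
    if not isinstance(outputs, (list, tuple)):
        outputs = [outputs]
    return {f"{prefix}{i}": to_info(t) for i, t in enumerate(outputs, start=1)}


# Stand-in for build_tensor_info so the sketch runs without TensorFlow.
def fake_tensor_info(tensor):
    return f"info({tensor})"


# Output names borrowed from the tiny-model snippet later in this thread.
print(build_output_map(["conv2d_10/BiasAdd", "conv2d_13/BiasAdd"], fake_tensor_info))
```

The resulting dictionary is what would be passed as the outputs argument of build_signature_def.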

marco8chong commented 5 years ago

I tried Flask today, which is much easier to set up.

marco8chong commented 5 years ago

I found that there is no 'Attempting to use uninitialized' error if I call the following function first: sess.run(tf.global_variables_initializer())

cguentherTUChemnitz commented 5 years ago

> I found that there is no 'Attempting to use uninitialized' error if I call the following function first: sess.run(tf.global_variables_initializer())

@marco8chong So you were able to get rid of all error messages? Can you please provide the script you finally used?

I am not able to understand this:

> I tried to create the model like this:
>
> yolo_model = YOLO(**vars(FLAGS)).yolo_model
> yolo_model.summary()

What are the FLAGS, and how did you provide the already trained weights from the h5 file?

cguentherTUChemnitz commented 5 years ago

In train.py I added a model.save(log_dir + 'trained_model_final.h5') next to the model.save_weights statement, and moved the resulting file to model_data/tf.h5.

I then tried to load that model so I could convert it with your script snippet, as follows:

# https://github.com/qqwweee/keras-yolo3/issues/344

import tensorflow as tf
import os
from keras.models import load_model

#https://github.com/qqwweee/keras-yolo3/issues/48#issuecomment-457440948
from yolo3.model import yolo_head

def _main():
    path = "model_data/"

    model = load_model(os.path.join(path, 'tf.h5'), {'yolo_head': yolo_head})
    tf.keras.backend.set_learning_phase(1)
    if not os.path.exists(path):
        os.mkdir(path)
    export_path = os.path.join(tf.compat.as_bytes(path))
    builder = tf.saved_model.builder.SavedModelBuilder(export_path)

    model_input = tf.saved_model.utils.build_tensor_info(model.input)
    model_output = tf.saved_model.utils.build_tensor_info(model.output)

    prediction_signature = (
        tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'inputs': model_input},
            outputs={'output': model_output},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

    with tf.keras.backend.get_session() as sess:
        sess.run(tf.global_variables_initializer())
        builder.add_meta_graph_and_variables(
            sess=sess, tags=[tf.saved_model.tag_constants.SERVING],
            signature_def_map={
                'predict':
                    prediction_signature,
            })

        builder.save()

if __name__ == '__main__':
    _main()

But I get the error:

  File "<PathToTheProject>keras-yolo3/yolo3/model.py", line 376, in yolo_loss
    anchors[anchor_mask[l]], num_classes, input_shape, calc_loss=True)
TypeError: list indices must be integers or slices, not list

Can anyone guide me through the last steps here?
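
The TypeError suggests that, at this point, anchors is a plain Python list and anchor_mask[l] is itself a list; plain lists cannot be indexed with another list the way NumPy arrays can. A small stand-alone reproduction follows. The anchor values are the Tiny YOLOv3 ones quoted later in this thread, the mask is made up for illustration, and whether passing anchors as an array is the right fix inside yolo_loss is an assumption that has not been checked against the project.

```python
anchors = [(10, 14), (23, 27), (37, 58), (81, 82), (135, 169), (344, 319)]
anchor_mask = [[3, 4, 5], [0, 1, 2]]  # hypothetical mask, one entry per output


def select(seq, indices):
    """Pick several items from a plain list, like NumPy's seq[indices]."""
    return [seq[i] for i in indices]


try:
    anchors[anchor_mask[0]]  # a list indexed by a list raises TypeError
except TypeError as err:
    message = str(err)
    print("reproduced:", message)

print(select(anchors, anchor_mask[0]))
```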

marco8chong commented 5 years ago

I tried to convert the model for TensorFlow Serving like this. The code runs, but obviously I am still missing some parts.

import sys
import argparse
import tensorflow as tf
import os
from yolo import YOLO
from keras import backend as K
from keras.models import Model

tf.logging.set_verbosity(tf.logging.INFO)

FLAGS = None

def save_model_for_production(model, version, path='prod_models'):
    tf.keras.backend.set_learning_phase(1)
    if not os.path.exists(path):
        os.mkdir(path)
    export_path = os.path.join(
        tf.compat.as_bytes(path),
        tf.compat.as_bytes(version))
    builder = tf.saved_model.builder.SavedModelBuilder(export_path)

    model_input = tf.saved_model.utils.build_tensor_info(model.input)
    model_output0 = tf.saved_model.utils.build_tensor_info(model.output[0])
    model_output1 = tf.saved_model.utils.build_tensor_info(model.output[1])
    model_output2 = tf.saved_model.utils.build_tensor_info(model.output[2])

    prediction_signature = (
        tf.saved_model.signature_def_utils.build_signature_def(
            inputs={tf.saved_model.signature_constants.PREDICT_INPUTS: model_input},
            outputs={tf.saved_model.signature_constants.PREDICT_OUTPUTS+'0': model_output0,
                     tf.saved_model.signature_constants.PREDICT_OUTPUTS+'1': model_output1,
                     tf.saved_model.signature_constants.PREDICT_OUTPUTS+'2': model_output2},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

    with tf.keras.backend.get_session() as sess:
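        # Initialise all variables before exporting; without this the export
        # fails with 'Attempting to use uninitialized value ...'
        sess.run(tf.global_variables_initializer())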
        builder.add_meta_graph_and_variables(
            sess=sess, tags=[tf.saved_model.tag_constants.SERVING],
            signature_def_map={
                'predict':
                    prediction_signature,
            })

        builder.save()

if __name__ == '__main__':
    # class YOLO defines the default value, so suppress any default here
    parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
    '''
    Command line options
    '''
    parser.add_argument(
        '--model', type=str,
        help='path to model weight file, default ' + YOLO.get_defaults("model_path")
    )

    parser.add_argument(
        '--anchors', type=str,
        help='path to anchor definitions, default ' + YOLO.get_defaults("anchors_path")
    )

    parser.add_argument(
        '--classes', type=str,
        help='path to class definitions, default ' + YOLO.get_defaults("classes_path")
    )

    parser.add_argument(
        '--gpu_num', type=int,
        help='Number of GPU to use, default ' + str(YOLO.get_defaults("gpu_num"))
    )

    parser.add_argument(
        '--image', default=False, action="store_true",
        help='Image detection mode, will ignore all positional arguments'
    )
    '''
    Command line positional arguments -- for video detection mode
    '''
    parser.add_argument(
        "--input", nargs='?', type=str,required=False,default='./path2your_video',
        help = "Video input path"
    )

    parser.add_argument(
        "--output", nargs='?', type=str, default="",
        help = "[Optional] Video output path"
    )

    FLAGS = parser.parse_args()

    yolo = YOLO(**vars(FLAGS))
    yolo_model = yolo.yolo_model
    yolo_model.summary()

    export_path = "tf-model-serving"
    save_model_for_production(yolo_model, "1", export_path)

cguentherTUChemnitz commented 5 years ago

@marco8chong Thanks a lot for the code snippet.

That helped me see how to load the trained Keras weights as a model. I was able to run your script. With the yolo-tiny model, model.output[2] was out of range, so I removed that output. Afterwards I was able to use your script to generate the pb, data and index files. Nevertheless, I was not able to convert them easily into a tflite file.

I found this yolo3 project, which exports tflite directly: https://github.com/benjamintanweihao/YOLOv3/blob/master/tflite.py. Merging your model loading with its conversion step finally produced a tflite file :+1: I still have to clean it up a lot and check whether the generated file really does the job, but I think this could be the solution.

cguentherTUChemnitz commented 5 years ago

Here is the script snippet I used for the conversion. It must be placed as a Python file directly in the root directory of the keras-yolo3 project:

# https://github.com/qqwweee/keras-yolo3/issues/344
from yolo import YOLO

import argparse
from tensorflow.contrib.lite.python import lite

def _main():
    # class YOLO defines the default value, so suppress any default here
    parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
    '''
    Command line options
    '''
    parser.add_argument(
        '--model', type=str,
        help='path to model weight file, default ' + YOLO.get_defaults("model_path")
    )

    parser.add_argument(
        '--anchors', type=str,
        help='path to anchor definitions, default ' + YOLO.get_defaults("anchors_path")
    )

    parser.add_argument(
        '--classes', type=str,
        help='path to class definitions, default ' + YOLO.get_defaults("classes_path")
    )

    parser.add_argument(
        '--gpu_num', type=int,
        help='Number of GPU to use, default ' + str(YOLO.get_defaults("gpu_num"))
    )

    parser.add_argument(
        '--image', default=False, action="store_true",
        help='Image detection mode, will ignore all positional arguments'
    )
    '''
    Command line positional arguments -- for video detection mode
    '''
    parser.add_argument(
        "--input", nargs='?', type=str, required=False, default='./path2your_video',
        help="Video input path"
    )

    parser.add_argument(
        "--output", nargs='?', type=str, default="",
        help="[Optional] Video output path"
    )

    FLAGS = parser.parse_args()

    yolo = YOLO(**vars(FLAGS))
    yolo_model = yolo.yolo_model
    yolo_model.summary()

    keras_model_path = "model_data/tf.h5"
    yolo_model.save(keras_model_path)
    converter = lite.TFLiteConverter.from_keras_model_file(
        keras_model_path, input_shapes={'input_1': [1, YOLO.get_defaults("model_image_size")[0],
                                                    YOLO.get_defaults("model_image_size")[1], 3]})

    converter.post_training_quantize = True
    tflite_model = converter.convert()
    # Use a context manager so the file is flushed and closed
    with open("model_data/tf.tflite", "wb") as f:
        f.write(tflite_model)

if __name__ == '__main__':
    _main()

marco8chong commented 5 years ago

Now my problem is the missing Yolo Head

cguentherTUChemnitz commented 5 years ago

> Now my problem is the missing Yolo Head

This might help you: https://github.com/qqwweee/keras-yolo3/issues/349#issuecomment-486345613

Nevertheless, the script I posted did not need any further imports when placed next to yolo.py.

marco8chong commented 5 years ago

Actually, my problem is that the outputs of the converted model are 3 convolution layers, and I have no idea how to derive the detected boxes, scores and classes from them.

BenSchl commented 5 years ago

The output is described in the two following papers: https://pjreddie.com/media/files/papers/YOLO9000.pdf https://pjreddie.com/media/files/papers/YOLOv3.pdf

The yolo_head function in model.py is implemented as explained in the papers. Here is an example of an implementation in Kotlin.

Here is my implementation in Java (for Tiny YOLO, so there are only 2 outputs; for the full model you need 3):

    float[][][][] output1 = new float[1][outputSizes[0]][outputSizes[0]][(numberOfClasses + 5) * 3];
    float[][][][] output2 = new float[1][outputSizes[1]][outputSizes[1]][(numberOfClasses + 5) * 3];

    outputs.put(0, output1);
    outputs.put(1, output2);

    List<Detection> results = new LinkedList<>();

    tf.runForMultipleInputsOutputs(inputs, outputs);

    float[][][][] active = output1;

    int offset;

    float confidence, x, y, w, h,
            maxClass;

    int objectId;

    float[] confidenceClasses = new float[numberOfClasses];

    /*
     * Based on: J. Redmon and A. Farhadi. Yolov3: An incremental improvement. arXiv, 2018. 4
     */
    for(int i=0;i<2;++i) {
        for(int j=0;j<outputSizes[i];++j)
            for(int k=0;k<outputSizes[i];++k)
                for(int l=0;l<3;++l) {
                    offset = (numberOfClasses + 5) * l;

                    confidence = MathFunctions.sigmoid(active[0][j][k][offset + 4]);

                    if(confidence < threshold)
                        continue;

                    if(numberOfClasses == 1) {
                        maxClass = 1;

                        objectId = 0;
                    } else {
                        for(int m=0;m<numberOfClasses;++m)
                            confidenceClasses[m] = active[0][j][k][offset + 5 + m];

                        objectId = determineClass(confidenceClasses);

                        maxClass = confidenceClasses[objectId];
                    }

                    confidence *= maxClass;

                    if(confidence > threshold) {
                        x = (k + MathFunctions.sigmoid(active[0][j][k][offset])) * (float)size / outputSizes[i];
                        y = (j + MathFunctions.sigmoid(active[0][j][k][offset + 1])) * (float)size / outputSizes[i];

                        w = (float)Math.exp(active[0][j][k][offset + 2]) * anchors[(i * 3 * 2) + (l * 2)];
                        h = (float)Math.exp(active[0][j][k][offset + 3]) * anchors[(i * 3 * 2) + (l * 2) + 1];

                        results.add(new Detection(objectId, confidence,
                                new RectF(Math.max(0, x - w / 2), Math.max(0, y - h / 2), Math.min(size - 1, x + w / 2), Math.min(size - 1, y + h / 2))));
                    }
                }

        active = output2;
    }

    return pooling.execute(results);

The innermost loop iterates over the anchors. pooling is a custom lambda function that executes some kind of ROI pooling. determineClass returns the index with the highest confidence; maybe softmax should also be applied to the confidences.
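
Two helpers are referenced in the Java snippet but not shown. Below is a minimal Python sketch of what they might look like, assuming pooling is something like greedy non-maximum suppression (a guess; the post only says "some kind of ROI pooling") and that determineClass is an argmax. All names and the sample boxes are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def determine_class(scores):
    """Index of the highest score (argmax)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def iou(a, b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS over (class_id, confidence, box) tuples, applied per class."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep a box unless a stronger box of the same class overlaps it too much.
        if all(det[0] != k[0] or iou(det[2], k[2]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


dets = [
    (0, 0.9, (10, 10, 110, 110)),
    (0, 0.6, (20, 20, 120, 120)),    # overlaps the first box heavily
    (0, 0.8, (200, 200, 260, 260)),  # far away, should survive
]
print(nms(dets))
```

Applying sigmoid to the selected class score before multiplying it with the objectness would be in line with the YOLOv3 paper, which treats class predictions as independent logistic outputs rather than a softmax.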

dkashkin commented 5 years ago

This discussion is very helpful! Thank you for posting these scripts. Can you please confirm if you were able to get the TFLite version of YoloV3 fully working on Android? I am a bit confused by the code snippets because I see that the conversion script creates a Quantized model (8 bit) but the Java inference code uses the Float datatype.... Also, I am very curious what kind of inference timing you get on an actual mobile phone?

marco8chong commented 5 years ago

@dkashkin I was not able to get YOLOv3 working on Android; using other object detection models from Google is more straightforward.

BenSchl commented 5 years ago

> This discussion is very helpful! Thank you for posting these scripts. Can you please confirm if you were able to get the TFLite version of YoloV3 fully working on Android? I am a bit confused by the code snippets because I see that the conversion script creates a Quantized model (8 bit) but the Java inference code uses the Float datatype.... Also, I am very curious what kind of inference timing you get on an actual mobile phone?

I'm sorry for the confusion. The code snippets were supposed to serve as an example of extracting the bounding boxes from the output tensors; I did not look at the Python snippets and missed the quantization.

My model is a non-quantized Tiny YOLOv3 tflite model, which I did not obtain through the scripts here. I found a way to convert a Tiny YOLOv3 Darknet model to tflite, since training in Darknet is much more comfortable for me: convert cfg + weights to h5 with https://github.com/xiaochus/YOLOv3 -> convert h5 to pb -> convert pb to tflite. The inference time is about 800 ms (I forgot the exact time) on an Asus Zenfone 3 Zoom S (the Unihertz Atom needs a little more time).

dkashkin commented 5 years ago

Thanks @BenSchl. I've been testing the performance of quantized vs non-quantized models on Android, with pretty surprising results (https://stackoverflow.com/questions/55958129/tensorflow-lite-quantization-fails-to-improve-inference-latency). Anyway, my next goal is to test Tiny YOLOv3. I already converted the Darknet model to TensorFlow Lite (using DW2TF), but I'm struggling to figure out how to post-process the inference results. Can you please share the link or code snippet that you use to handle the output tensor?

BenSchl commented 5 years ago

The code I use is the code snippet I posted above (a link to the YOLOv3 paper is there, also).

The output tensors have the size N x N x [A * (4 + 1 + C)], where N x N is the size of the "grid" (the dimension clusters), A is the number of anchors for that tensor and C is the number of classes. Within each anchor's block, the first 4 values are the bounding box prediction and the following value is the object confidence (hence the 4 + 1); the remaining C values are the class scores.
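
To make the layout concrete, here is a small Python sketch that splits the channel vector of one grid cell into per-anchor records. The function name and the dummy values are made up for illustration; only the A * (4 + 1 + C) layout comes from the description above.

```python
def split_cell(channels, num_anchors, num_classes):
    """Split one grid cell's channel vector into per-anchor records.

    Each record is (box, objectness, class_scores), where box holds the four
    raw values (tx, ty, tw, th) before any sigmoid/exp is applied.
    """
    stride = 4 + 1 + num_classes
    if len(channels) != num_anchors * stride:
        raise ValueError("expected %d channels, got %d"
                         % (num_anchors * stride, len(channels)))
    records = []
    for a in range(num_anchors):
        offset = a * stride
        box = channels[offset:offset + 4]
        objectness = channels[offset + 4]
        class_scores = channels[offset + 5:offset + stride]
        records.append((box, objectness, class_scores))
    return records


# With 80 classes and 3 anchors, each cell carries 3 * 85 = 255 channels.
cell = list(range(255))
records = split_cell(cell, num_anchors=3, num_classes=80)
print(len(records), records[1][0], records[1][1])
```

The offset computed here corresponds to offset = (numberOfClasses + 5) * l in the Java snippet.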

In Tiny YOLOv3, the clusters normally have the dimensions 13x13 and 26x26 (Tiny YOLOv3 has two output tensors). The anchors for the 13x13 "grid" are often (81,82), (135,169), (344,319), and those for the 26x26 "grid" are (10,14), (23,27), (37,58). That is why the code snippet has the inner loop for(int l=0;l<3;++l); the outermost loop runs over the two output tensors.
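
The box decoding from the Java snippet can be written as a Python sketch for a single anchor in a single cell. It assumes a 416x416 input, the anchor values listed above, and an anchor ordering with the 13x13 anchors first (which is what the Java indexing anchors[(i * 3 * 2) + (l * 2)] implies). decode_box and the constants are hypothetical names.

```python
import math

INPUT_SIZE = 416
GRID_SIZES = [13, 26]
# Anchors per output tensor, in pixels: coarse grid first, as in the Java code.
ANCHORS = [
    [(81, 82), (135, 169), (344, 319)],  # 13x13 grid
    [(10, 14), (23, 27), (37, 58)],      # 26x26 grid
]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(raw, output_idx, row, col, anchor_idx):
    """Turn raw (tx, ty, tw, th) into (left, top, right, bottom) in pixels."""
    tx, ty, tw, th = raw
    grid = GRID_SIZES[output_idx]
    anchor_w, anchor_h = ANCHORS[output_idx][anchor_idx]
    # Centre: cell index plus a sigmoid offset, scaled from grid units to pixels.
    cx = (col + sigmoid(tx)) * INPUT_SIZE / grid
    cy = (row + sigmoid(ty)) * INPUT_SIZE / grid
    # Size: the anchor dimensions scaled by exp of the raw prediction.
    w = math.exp(tw) * anchor_w
    h = math.exp(th) * anchor_h
    return (max(0.0, cx - w / 2), max(0.0, cy - h / 2),
            min(INPUT_SIZE - 1.0, cx + w / 2), min(INPUT_SIZE - 1.0, cy + h / 2))


# Zero offsets put the centre in the middle of cell (row 6, col 6) of the 13x13 grid.
print(decode_box((0.0, 0.0, 0.0, 0.0), output_idx=0, row=6, col=6, anchor_idx=0))
```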

To answer your previous question: to convert Tiny YOLOv3 to tflite (using https://github.com/xiaochus/YOLOv3) I used something like

import tensorflow as tf
from tensorflow.python.framework import graph_util
from tensorflow.python.framework import graph_io

with tf.Session() as sess:
    tf.keras.backend.set_learning_phase(0)
    model = tf.keras.models.load_model('yolo.h5')
    constant_graph = graph_util.convert_variables_to_constants(
        sess,
        sess.graph.as_graph_def(),
        ["conv2d_10/BiasAdd", "conv2d_13/BiasAdd"])

graph_io.write_graph(constant_graph, "saved_model_pb", 
                    "yolo.pb", as_text=False)

to convert the h5 model to pb and something like (but with the cpu version of tensorflow)

import tensorflow as tf
from tensorflow.python.platform import gfile
from tensorflow import lite

with tf.Session() as sess:
    tf.keras.backend.set_learning_phase(0)
    with gfile.FastGFile('saved_model_pb/yolo.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        g_in = tf.import_graph_def(graph_def)
        converter = lite.TFLiteConverter.from_session(sess, [sess.graph.get_tensor_by_name('import/input_1:0')], [sess.graph.get_tensor_by_name('import/conv2d_10/BiasAdd:0'), sess.graph.get_tensor_by_name('import/conv2d_13/BiasAdd:0')])
        tflite_model = converter.convert()
        # Use a context manager so the file is flushed and closed
        with open("yolo.tflite", "wb") as out:
            out.write(tflite_model)

to convert the pb model to tflite.

dkashkin commented 5 years ago

Thank you very much @BenSchl I was able to convert Tiny Yolo V3 to TFLite and test it out on a few android phones. I am getting 770ms inference on Pixel2 (using float 416x416 model). Needless to say, this is too slow to support yolo's original "real time" claim :) This detector is much slower than SSD-based googlenet v2 although a bit higher resolution. Anyways I hugely appreciate your help and I owe you a large amount of beer :)

peace195 commented 5 years ago

I finished it here, please try: https://github.com/peace195/tensorflow-lite-yolo-v3

AadeIT commented 4 years ago

@marco8chong How did you adapt this to Flask (web)? I need your help, thanks.

marco8chong commented 4 years ago

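I can't find my old code any more. With Flask you don't need to convert the model format; just treat it as if you were running the PC version. The approach is roughly as follows: https://blog.techbridge.cc/2018/11/01/python-flask-keras-image-predict-api/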

AadeIT commented 4 years ago

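@marco8chong Did you run into graph problems when you used Flask? I'm stuck on exactly this point. I integrated this project into Flask, and Flask's threading mechanism conflicts with TensorFlow, which causes graph errors.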

marco8chong commented 4 years ago

No. TensorFlow can cope with the threading issue; if you carefully follow the examples online, it should work.

https://blog.victormeunier.com/posts/keras_multithread/

AadeIT commented 4 years ago

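@marco8chong Thank you for your answer. My problem is exactly this. The article you gave me says: "This will load your model with the default graph and session from Tensorflow. If you try to do that in multiple threads, you'll have an error." The reason is that loading different graphs in different threads easily leads to errors, so in every thread you need to create thread_graph = Graph() and set it as the default graph. The same problem appears when you use Flask + Keras: the first request works, but the second one reports a graph error, so you need to call graph.as_default(). However, the prediction function used in this project is not predict, so I'm frustrated: I don't know where to put graph.as_default().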
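
The behaviour described here is consistent with the default graph being a per-thread setting in TensorFlow 1.x. The toy model below is not TensorFlow code; it only mimics such a per-thread default with threading.local, to show why a request-handling thread does not see the graph that was current when the model was loaded, and why re-entering it with an as_default()-style context manager helps. All names are invented for the illustration.

```python
import threading
from contextlib import contextmanager

_state = threading.local()  # each thread sees its own `current` value


def get_default():
    """Return this thread's default, or a fresh placeholder if none was set."""
    return getattr(_state, "current", "fresh-empty-graph")


@contextmanager
def as_default(graph):
    """Make `graph` the default for the calling thread inside the with-block."""
    previous = get_default()
    _state.current = graph
    try:
        yield graph
    finally:
        _state.current = previous


def worker(results, graph=None):
    if graph is None:
        results.append(get_default())      # what a request thread sees by default
    else:
        with as_default(graph):
            results.append(get_default())  # after explicitly re-entering the graph


seen = []
with as_default("model-graph"):            # "load the model" in the main thread
    for g in (None, "model-graph"):
        t = threading.Thread(target=worker, args=(seen, g))
        t.start()
        t.join()

print(seen)
```

In a Flask app the equivalent would presumably be to keep a reference to the graph captured at load time and wrap whatever call runs the network (not necessarily predict) in with graph.as_default():.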

phanxuanduc1996 commented 4 years ago

You can take a look here. I tried it and it worked. https://github.com/phanxuanduc1996/convert_yolo_weights

wuwou commented 4 years ago

> The error is a result of the line model_output = tf.saved_model.utils.build_tensor_info(model.output). The YOLO models have multiple outputs and you need to handle each output separately (and adjust the signature ('output1': model_output[0], 'output2': model_output[1], ...)). I tried the modified code and faced other errors which I'm not able to fix (tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value conv2d_9/kernel [[node conv2d_9/kernel (defined at C:\venv4\lib\site-packages\keras\backend\tensorflow_backend.py:402) ]] [[batch_normalization_9/beta/_161]]).
>
> Update:
>
> If the model is compiled, the error changes to tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value batch_normalization_10/beta [[{{node batch_normalization_10/beta}}]] [[conv2d_6/kernel/_209]].
>
> (Note: I'm using the tiny model).

@BenSchl Hello, have you solved your problem? I now have the same problem as you, and I don't know how to fix it.

LahiRumesh commented 3 years ago

I converted Keras h5 format to TensorFlow serving. Please check this out. https://github.com/LahiRumesh/YOLOv3-weights-converter
