PINTO0309 / openvino2tensorflow

This script converts the ONNX/OpenVINO IR model to Tensorflow's saved_model, tflite, h5, tfjs, tftrt(TensorRT), CoreML, EdgeTPU, ONNX and pb. PyTorch (NCHW) -> ONNX (NCHW) -> OpenVINO (NCHW) -> openvino2tensorflow -> Tensorflow/Keras (NHWC/NCHW) -> TFLite (NHWC/NCHW). And the conversion from .pb to saved_model and from saved_model to .pb and from .pb to .tflite and saved_model to .tflite and saved_model to onnx. Support for building environments with Docker. It is possible to directly access the host PC GUI and the camera to verify the operation. NVIDIA GPU (dGPU) support. Intel iHD GPU (iGPU) support.
MIT License
338 stars, 40 forks

Convert yolov3 to custom yolov3 for edge NPU #81

Closed Valdiolus closed 2 years ago

Valdiolus commented 2 years ago

Hi! I have an issue: to run yolov3 (I think the same applies to v4/v5) on a custom NPU, I need to convert some layers (FusedBatchNormV3, ResizeNearestNeighbor, LeakyRelu) into simpler forms and then do further, more specialized compilation based on tflite.

Here is what we get by default in yolov3 when converting .weights to .pb:

[image: yolov3_common]

And here is an example of what I need (I have the .pb as an example, but I don't have the source weights):

[image: yolov3_custom]

I found your repos (openvino2tensorflow, OpenVINO-YoloV3), but I think my case is "special".

Can you suggest how I can do this conversion?
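As an illustration of the "simpler forms" mentioned above, here is a minimal NumPy sketch of how LeakyRelu and a 2x ResizeNearestNeighbor can be written with more primitive operations. The helper names `leaky_relu_primitive` and `upsample_nearest_2x` are hypothetical, the slope 0.1 is the value Darknet uses for its leaky activation, and whether a given NPU toolchain accepts these primitives is an assumption that has to be checked against its documentation.

```python
import numpy as np

def leaky_relu_primitive(x, alpha=0.1):
    # For 0 < alpha < 1, LeakyReLU(x) equals max(x, alpha * x),
    # i.e. one multiplication followed by one element-wise maximum.
    return np.maximum(x, alpha * x)

def upsample_nearest_2x(x):
    # x is an NHWC tensor. Repeating every pixel twice along the height
    # and width axes gives a 2x nearest-neighbor upsample.
    return np.repeat(np.repeat(x, 2, axis=1), 2, axis=2)

x = np.array([-2.0, -0.5, 0.0, 3.0], dtype=np.float32)
y = leaky_relu_primitive(x)

feature_map = np.arange(4, dtype=np.float32).reshape(1, 2, 2, 1)
up = upsample_nearest_2x(feature_map)
print(y)
print(up[0, :, :, 0])
```

This only demonstrates the arithmetic equivalence. Rewriting the nodes of an actual graph is a separate step that depends on the conversion tool.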

PINTO0309 commented 2 years ago

Too little information is available to give advice.

  1. Where did you get the .weights and protocolbuffer file that you own?
  2. What was the first framework you trained your model on? Keras? TensorFlow? Darknet? PyTorch?

Whatever the situation, you need to freeze the BatchNormalization first. It is therefore important to know whether freezing is performed at the stage where the weights are extracted from the original model. If the model is not frozen, any transformation procedure will produce invalid inference results.

Here are links to the source repository and to my working notes from converting a model in a similar situation:
https://github.com/Ascend-Research/HeadPoseEstimation-WHENet.git
https://zenn.dev/pinto0309/scraps/1849b6909db13b

import tensorflow as tf  # written against the TensorFlow 1.x session/graph APIs
from keras.models import load_model
from keras import backend as K
from tensorflow.python.framework import graph_io
from tensorflow.python.framework import graph_util

def freeze_graph(session, output, save_pb_dir='.', save_pb_name='frozen_model.pb', save_pb_as_text=False):
    graph = session.graph
    with graph.as_default():
        # Drop nodes that are only needed for training
        graphdef_inf = tf.graph_util.remove_training_nodes(graph.as_graph_def())
        # Bake the variable values into the graph as constants
        graphdef_frozen = graph_util.convert_variables_to_constants(session, graphdef_inf, output)
        graph_io.write_graph(graphdef_frozen, save_pb_dir, save_pb_name, as_text=save_pb_as_text)
        return graphdef_frozen

K.clear_session()
# Learning phase 0 = inference mode; must be set before the model is loaded
K.set_learning_phase(0)
model = load_model('model.h5')
session = K.get_session()
freeze_graph(session, [out.op.name for out in model.outputs], save_pb_dir='.')

If the model is frozen, I am sure that everything you are expecting is feasible. FusedBatchNormV3 is the only problem. If you do not provide any information beyond what you initially provided, I will close this issue, as there is nothing I can do.
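To make the point about frozen BatchNormalization concrete, here is a minimal NumPy sketch of why a frozen batch norm can be replaced by primitive operations: with fixed statistics it is just a per-channel multiply and add, which can also be folded into the preceding convolution. The function names and the `eps` value are hypothetical choices for illustration, not part of any converter's API.

```python
import numpy as np

def batch_norm_inference(x, gamma, beta, mean, var, eps=1e-3):
    # Reference: batch normalization with fixed (frozen) statistics.
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def batch_norm_as_mul_add(gamma, beta, mean, var, eps=1e-3):
    # The same computation rewritten as y = x * scale + shift.
    scale = gamma / np.sqrt(var + eps)
    shift = beta - mean * scale
    return scale, shift

def fold_into_conv(kernel, scale, shift):
    # kernel is HWIO for a bias-free convolution; scale broadcasts over the
    # output-channel axis, and shift becomes the bias of the folded conv.
    return kernel * scale, shift

rng = np.random.default_rng(0)
channels = 4
x = rng.normal(size=(1, 8, 8, channels))
gamma, beta, mean = (rng.normal(size=channels) for _ in range(3))
var = rng.uniform(0.5, 2.0, size=channels)

scale, shift = batch_norm_as_mul_add(gamma, beta, mean, var)
reference = batch_norm_inference(x, gamma, beta, mean, var)
decomposed = x * scale + shift
print(np.allclose(reference, decomposed))
```

This equivalence holds only when the statistics are frozen. In training mode the mean and variance come from the current batch, so the rewrite does not apply.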

Valdiolus commented 2 years ago

Thank you for the quick response!

  1. My main goal is to have source weights that I can retrain on a custom dataset and then compile to a custom format (.sim) for use on the custom NPU. I have a frozen yolov3.pb file as an example, which compiles successfully to the .sim format, but I can't retrain it. I got yolov3.weights from the https://pjreddie.com/ website and used https://github.com/mystic123/tensorflow-yolo-v3 to get a frozen .pb file with FusedBatchNormV3, but I can't compile it to .sim because of the differences between those .pb files (I guess the NPU can't use some layers).

Here is the code that loads and converts the pre-trained Darknet .weights. I see BatchNorm being used here, and I think there should be a way to "expand" FusedBatchNormV3 into more primitive layers that are compatible with the NPU (as is done in the example).

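import numpy as np
import tensorflow as tf  # TensorFlow 1.x (tf.assign)

def load_weights(var_list, weights_file):
        with open(weights_file, "rb") as fp:
            # skip the file header (5 int32 values)
            _ = np.fromfile(fp, dtype=np.int32, count=5)

            # the rest of the file is one flat float32 array
            weights = np.fromfile(fp, dtype=np.float32)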

        ptr = 0
        i = 0
        assign_ops = []
        while i < len(var_list) - 1:
            var1 = var_list[i]
            var2 = var_list[i + 1]
            # do something only if we process conv layer
            if 'Conv' in var1.name.split('/')[-2]:
                # check type of next layer
                if 'BatchNorm' in var2.name.split('/')[-2]:
                    # load batch norm params
                    gamma, beta, mean, var = var_list[i + 1:i + 5]
                    batch_norm_vars = [beta, gamma, mean, var]
                    for var in batch_norm_vars:
                        shape = var.shape.as_list()
                        num_params = np.prod(shape)
                        var_weights = weights[ptr:ptr + num_params].reshape(shape)
                        ptr += num_params
                        assign_ops.append(
                            tf.assign(var, var_weights, validate_shape=True))

                    # we move the pointer by 4, because we loaded 4 variables
                    i += 4
                elif 'Conv' in var2.name.split('/')[-2]:
                    # load biases
                    bias = var2
                    bias_shape = bias.shape.as_list()
                    bias_params = np.prod(bias_shape)
                    bias_weights = weights[ptr:ptr +
                                           bias_params].reshape(bias_shape)
                    ptr += bias_params
                    assign_ops.append(
                        tf.assign(bias, bias_weights, validate_shape=True))

                    # we loaded 1 variable
                    i += 1
                # we can load weights of conv layer
                shape = var1.shape.as_list()
                num_params = np.prod(shape)

                # the file stores kernels as (out, in, height, width)
                var_weights = weights[ptr:ptr + num_params].reshape(
                    (shape[3], shape[2], shape[0], shape[1]))
                # transpose to TensorFlow's (height, width, in, out) layout
                var_weights = np.transpose(var_weights, (2, 3, 1, 0))
                ptr += num_params
                assign_ops.append(
                    tf.assign(var1, var_weights, validate_shape=True))
                i += 1

        return assign_ops
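The reshape and transpose step at the end of the loop above can be checked in isolation. This is a minimal sketch with a hypothetical helper name (`darknet_to_tf_kernel`) and a made-up kernel shape; it only shows how a flat buffer laid out as (out, in, height, width) maps to TensorFlow's (height, width, in, out) kernel layout.

```python
import numpy as np

def darknet_to_tf_kernel(flat, tf_shape):
    # tf_shape is the TensorFlow kernel shape (height, width, in, out).
    h, w, c_in, c_out = tf_shape
    # Interpret the flat buffer in the stored order (out, in, height, width).
    darknet = flat.reshape((c_out, c_in, h, w))
    # Reorder the axes to (height, width, in, out).
    return np.transpose(darknet, (2, 3, 1, 0))

tf_shape = (3, 3, 2, 4)  # made-up example: 3x3 kernel, 2 inputs, 4 outputs
flat = np.arange(np.prod(tf_shape), dtype=np.float32)
kernel = darknet_to_tf_kernel(flat, tf_shape)

# Element (kh=1, kw=2, in=1, out=3) comes from flat offset
# ((out * c_in + in) * h + kh) * w + kw.
print(kernel.shape, kernel[1, 2, 1, 3])
```

Reshaping the flat buffer directly to the TensorFlow shape would give an array of the right size but with scrambled values, which is why the intermediate shape matters.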

Maybe I am wrong, and I need to create a custom model without FusedBatchNormV3 and train it from scratch. But using the existing pretrained weights would be awesome! If you have done something similar, please share a link!

  2. The preferred framework is Keras, but Darknet is OK too. The main issue is having a way to convert the retrained model to a frozen .pb and then compile it.

Thanks!

PINTO0309 commented 2 years ago

Not that it matters, but first of all, I have no idea what an NPU is.

First, please follow the tutorial below to convert YOLOv3 Darknet weights to ONNX. The rest of the instructions will only be available after that process is complete.

https://github.com/linghu8812/tensorrt_inference/tree/master/Yolov4 or https://github.com/Tianxiaomo/pytorch-YOLOv4

PINTO0309 commented 2 years ago

Closed due to lack of progress.

Valdiolus commented 2 years ago

Thank you, I found a way to compile the model for the custom NPU (the neural processing unit in the SoC). The answer was to find a yolo-tensorflow1 repo on GitHub with a .weights -> .pb conversion script and use it with the same TensorFlow 1.14 that I need to use during the later compilation.