Closed: larrywal-express closed this issue 2 years ago
I'll start investigating. First, looking only at the structure of the model, a question: is this YOLOv5-Lite? https://github.com/PINTO0309/PINTO_model_zoo/tree/main/180_YOLOv5-Lite
@PINTO0309 Yes, but it is modified.
Thanks for your help. It would be better if you could include the architecture of the model if possible, so that other engineers can easily find the issue when they search for it.
Either way, I'll get to work. Please wait a moment.
@PINTO0309 Okay. Thanks
Since all Gather slice positions were being processed as NCHW, resulting in an error, I created a JSON that adjusts the axis values assuming NHWC. This requires analyzing what is displayed in the error message to see from which layer the slice is misaligned.
The other point is that the transformation shapes of the last 5D Reshape and 5D Transpose are broken and need to be adjusted.
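To make the NCHW → NHWC axis adjustment concrete, here is a minimal sketch (my own illustration, not part of the tool): the `remap_axis` helper and the mapping table are hypothetical, but they express the same remapping that the `replace.json` below applies by hand to each Gather/slice axis.

```python
# Hypothetical helper illustrating the NCHW -> NHWC axis remapping that the
# replace.json below applies by hand to Gather/slice axes.
# NCHW axes: 0=N, 1=C, 2=H, 3=W ; NHWC axes: 0=N, 1=H, 2=W, 3=C
NCHW_TO_NHWC = {0: 0, 1: 3, 2: 1, 3: 2}

def remap_axis(nchw_axis: int) -> int:
    """Return the NHWC axis corresponding to an NCHW axis."""
    return NCHW_TO_NHWC[nchw_axis]

# e.g. a Gather over the channel axis (1 in NCHW) must use axis 3 in NHWC
assert remap_axis(1) == 3
```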
replace.json
```json
{
  "format_version": 2,
  "layers": [
    {"layer_id": "316", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "320", "type": "Const", "replace_mode": "direct", "values": [3]},
    {"layer_id": "321", "type": "Const", "replace_mode": "direct", "values": [0]},
    {"layer_id": "322", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "377", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "383", "type": "Const", "replace_mode": "direct", "values": [1]},
    {"layer_id": "384", "type": "Const", "replace_mode": "direct", "values": [0]},
    {"layer_id": "385", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "389", "type": "Const", "replace_mode": "direct", "values": [2]},
    {"layer_id": "390", "type": "Const", "replace_mode": "direct", "values": [0]},
    {"layer_id": "391", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "394", "type": "Const", "replace_mode": "direct", "values": [1, 16, 16, 3, 9]},
    {"layer_id": "396", "type": "Const", "replace_mode": "direct", "values": [0, 3, 1, 2, 4]},
    {"layer_id": "432", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "438", "type": "Const", "replace_mode": "direct", "values": [1]},
    {"layer_id": "439", "type": "Const", "replace_mode": "direct", "values": [0]},
    {"layer_id": "440", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "444", "type": "Const", "replace_mode": "direct", "values": [2]},
    {"layer_id": "445", "type": "Const", "replace_mode": "direct", "values": [0]},
    {"layer_id": "446", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "449", "type": "Const", "replace_mode": "direct", "values": [1, 32, 32, 3, 9]},
    {"layer_id": "451", "type": "Const", "replace_mode": "direct", "values": [0, 3, 1, 2, 4]}
  ]
}
```
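As a side note, before running the conversion it can be convenient to sanity-check a `replace.json` locally. The sketch below is a hypothetical check of my own (the tool itself defines the real schema); it only validates the fields used in the config above.

```python
import json

# Hypothetical sanity check for a replace.json before passing it to
# openvino2tensorflow via --weight_replacement_config. It only validates
# the fields used above; the tool itself defines the real schema.
def check_replace_config(path: str) -> None:
    with open(path) as f:
        cfg = json.load(f)
    assert cfg["format_version"] == 2
    for layer in cfg["layers"]:
        assert layer["replace_mode"] in ("direct", "insert_after")
        assert isinstance(layer["values"], list)
        int(layer["layer_id"])  # layer ids are numeric strings
```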
Duplicate Issues https://github.com/PINTO0309/openvino2tensorflow/issues/86 https://github.com/PINTO0309/openvino2tensorflow/issues/82 https://github.com/PINTO0309/openvino2tensorflow/issues/77
tflite model_float32.tflite.zip
I have not checked that the model works correctly at all. Please check for yourself and if there are any problems with the structure you can look into it yourself.
@PINTO0309 Wonderful. Thanks for your support. I noted that shape="" of layer ids 375, 376, 430, 431, 455, 480 was not replaced. Also, how do I determine whether the values should be 0, 1, 2, etc., with reference to the JSON below? And can you recommend software to view the .bin file?
```json
{"layer_id": "320", "type": "Const", "replace_mode": "direct", "values": [3]},
{"layer_id": "321", "type": "Const", "replace_mode": "direct", "values": [0]}
```
I did not understand what you were expecting. All layers between Convolution and Transpose are garbage. In other words, ShapeOf, Gather, Concat, and Unsqueeze were all determined to be unnecessary by the optimizer. In my experience, layers that are used only for shape estimation degrade performance during inference, so the behavior of these optimizers makes sense.
"320" and "321" have also been deleted for the same reason. ShapeOf
, Gather
, Concat
, and Unsqueeze
are not needed during inference.
These layers are only needed if the input image is undefined in terms of height and width, and are essentially unnecessary if the input geometry is statically determined.
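To illustrate why those layers become removable (my own sketch, not output of the tool, with example shapes assumed): with a static input, the ShapeOf → Gather → Concat → Unsqueeze chain that computes a Reshape target at runtime folds into a plain constant.

```python
import numpy as np

# Illustration: with a static input shape, the runtime shape computation
# (ShapeOf -> Gather -> Concat -> Unsqueeze) collapses to a constant,
# which is why the optimizer can delete those layers.
x = np.ones((1, 16, 16, 27), dtype=np.float32)  # assumed example shape

# Dynamic variant: derive the target shape from the tensor at runtime.
n, h, w, c = x.shape
dynamic_target = (n, h, w, 3, c // 3)

# Static variant: the same numbers, known ahead of time.
static_target = (1, 16, 16, 3, 9)

assert dynamic_target == static_target
assert x.reshape(static_target).shape == (1, 16, 16, 3, 9)
```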
Thanks. I noticed that the tested ONNX output differs from the tflite output, as shown below. For this reason, the tflite bounding boxes were inaccurate, with wrong detections.
ONNX output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 15.64478874206543ms
shape: (1, 3840, 9)
[array([[[5.2931595e+00, 9.6953697e+00, 2.4912457e+01, ...,
1.5526780e-01, 2.2650501e-01, 4.0879369e-01],
[2.0134949e+01, 6.2680893e+00, 3.3342621e+01, ...,
2.1189997e-01, 2.2379622e-01, 4.7713119e-01],
[3.8248707e+01, 1.7845745e+00, 3.6460529e+01, ...,
1.0526398e-01, 1.7205426e-01, 8.1319189e-01],
...,
[4.3373840e+02, 4.8147830e+02, 1.3926814e+02, ...,
1.4624476e-02, 1.3875341e-01, 2.2544542e-01],
[4.6488284e+02, 4.8298419e+02, 1.3351683e+02, ...,
1.9375145e-02, 2.0011839e-01, 2.2959533e-01],
[4.8577332e+02, 4.8264972e+02, 1.1711755e+02, ...,
2.0683646e-02, 2.8468835e-01, 3.3594516e-01]]], dtype=float32)]
tflite output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 141.2348747253418ms
shape: (1, 3840, 9)
array([[[ 3.55738926e+00, 3.09487915e+00, 1.86278858e+01, ...,
7.89284706e-03, 9.33742523e-03, 9.98364687e-01],
[ 1.97514572e+01, 2.04518318e+00, 2.98684349e+01, ...,
1.28795207e-02, 6.61897659e-03, 9.98675346e-01],
[ 3.79077911e+01, -1.81230640e+00, 3.11474991e+01, ...,
1.60465279e-05, 8.62568617e-04, 9.99998212e-01],
...,
[ 4.29759338e+02, 4.83508667e+02, 1.08239449e+02, ...,
1.31325126e-02, 4.26258028e-01, 3.09236467e-01],
[ 4.56263580e+02, 4.84719513e+02, 1.05472176e+02, ...,
1.98948979e-02, 5.13177693e-01, 4.79941279e-01],
[ 4.89616119e+02, 4.88561218e+02, 1.09522301e+02, ...,
6.32282197e-02, 2.74803936e-01, 6.55043006e-01]]],
dtype=float32)
```bash
docker run --gpus all -it --rm \
  -v `pwd`:/home/user/workdir \
  ghcr.io/pinto0309/openvino2tensorflow:latest

python3 -m onnxsim lite.onnx lite.onnx

$INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/mo.py \
  --input_model lite.onnx \
  --data_type FP32
```
```json
{
  "format_version": 2,
  "layers": [
    {"layer_id": "358", "type": "Const", "replace_mode": "direct", "values": [0, 3, 1, 2, 4]},
    {"layer_id": "393", "type": "Const", "replace_mode": "direct", "values": [0, 3, 1, 2, 4]}
  ]
}
```
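The `values` in these Const entries are a transpose permutation, applied the same way as `np.transpose`. As an example (the [1,16,16,3,9] input shape here is an assumption for illustration, matching the Reshape target used earlier):

```python
import numpy as np

# The permutation [0, 3, 1, 2, 4] moves the anchor axis forward,
# exactly as np.transpose would with the same axis order.
x = np.zeros((1, 16, 16, 3, 9), dtype=np.float32)  # assumed example shape
y = x.transpose(0, 3, 1, 2, 4)
assert y.shape == (1, 3, 16, 16, 9)
```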
```bash
openvino2tensorflow \
  --model_path lite.xml \
  --output_saved_model \
  --output_pb \
  --output_no_quant_float32_tflite \
  --weight_replacement_config replace.json
```
```python
import onnxruntime
import tensorflow as tf
import time
import numpy as np
from pprint import pprint

H = 512
W = 512
MODEL = 'model_float32'

############################################################
onnx_session = onnxruntime.InferenceSession('lite.onnx')
input_name = onnx_session.get_inputs()[0].name
output_name = onnx_session.get_outputs()[0].name

roop = 1
e = 0.0
result = None
inp = np.ones((1, 3, H, W), dtype=np.float32)
for _ in range(roop):
    s = time.time()
    result = onnx_session.run(
        [output_name],
        {input_name: inp}
    )
    e += (time.time() - s)
print('ONNX output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@')
print(f'elapsed time: {e/roop*1000}ms')
print(f'shape: {result[0].shape}')
pprint(result)

############################################################
interpreter = tf.lite.Interpreter(model_path=f'{MODEL}.tflite', num_threads=4)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

roop = 1
e = 0.0
result = None
inp = np.ones((1, H, W, 3), dtype=np.float32)
for _ in range(roop):
    s = time.time()
    interpreter.set_tensor(input_details[0]['index'], inp)
    interpreter.invoke()
    result = interpreter.get_tensor(output_details[1]['index'])
    e += (time.time() - s)
print('tflite output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@')
print(f'elapsed time: {e/roop*1000}ms')
print(f'shape: {result.shape}')
pprint(result)
```
```bash
python3 onnx_tflite_test.py
```
ONNX output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 6.359338760375977ms
shape: (1, 3840, 9)
[array([[[5.2931595e+00, 9.6953697e+00, 2.4912457e+01, ...,
1.5526780e-01, 2.2650501e-01, 4.0879369e-01],
[2.0134949e+01, 6.2680893e+00, 3.3342621e+01, ...,
2.1189997e-01, 2.2379622e-01, 4.7713119e-01],
[3.8248707e+01, 1.7845745e+00, 3.6460529e+01, ...,
1.0526398e-01, 1.7205426e-01, 8.1319189e-01],
...,
[4.3373840e+02, 4.8147830e+02, 1.3926814e+02, ...,
1.4624476e-02, 1.3875341e-01, 2.2544542e-01],
[4.6488284e+02, 4.8298419e+02, 1.3351683e+02, ...,
1.9375145e-02, 2.0011839e-01, 2.2959533e-01],
[4.8577332e+02, 4.8264972e+02, 1.1711755e+02, ...,
2.0683646e-02, 2.8468835e-01, 3.3594516e-01]]], dtype=float32)]
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
tflite output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 23.181676864624023ms
shape: (1, 3840, 9)
array([[[5.29315758e+00, 9.69537735e+00, 2.49124527e+01, ...,
1.55268013e-01, 2.26505280e-01, 4.08792853e-01],
[2.01349468e+01, 6.26809692e+00, 3.33426208e+01, ...,
2.11900204e-01, 2.23796397e-01, 4.77130651e-01],
[3.82487106e+01, 1.78458500e+00, 3.64605331e+01, ...,
1.05264165e-01, 1.72054380e-01, 8.13191533e-01],
...,
[4.33738434e+02, 4.81478302e+02, 1.39268143e+02, ...,
1.46242362e-02, 1.38754994e-01, 2.25445867e-01],
[4.64882843e+02, 4.82984192e+02, 1.33516907e+02, ...,
1.93749964e-02, 2.00120524e-01, 2.29595125e-01],
[4.85773315e+02, 4.82649719e+02, 1.17117645e+02, ...,
2.06834618e-02, 2.84691095e-01, 3.35944831e-01]]], dtype=float32)
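Instead of eyeballing the printed arrays, the two runs can also be compared numerically. A minimal sketch (the `onnx_result` / `tflite_result` arrays below are hypothetical stand-ins for the outputs of the test script, truncated to a few values from the printouts above):

```python
import numpy as np

# Stand-ins for the first few values of the ONNX and tflite outputs.
onnx_result = np.array([5.2931595, 9.6953697, 24.912457], dtype=np.float32)
tflite_result = np.array([5.29315758, 9.69537735, 24.9124527], dtype=np.float32)

# Small float32 discrepancies are expected after conversion; a tolerance
# around 1e-4 is a reasonable bar for "matching" here.
assert np.allclose(onnx_result, tflite_result, atol=1e-4)
```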
@PINTO0309 The conversion succeeded and the outputs now match, but the detection result obtained from detection.py is still wrong, similar to my earlier image post.
Please attach the code you use for inference testing; otherwise this exchange will be inefficient.
https://github.com/zldrobit/yolov5/blob/tf-android/detect.py
I am testing with the script above. I am also trying to run it on Android.
Can you provide me with one still image for testing? I have no good way of knowing what your model is inferring. And is the model you used Float32? Is it INT8? Float16? Please describe the information in detail.
I used Float32.
The model was trained on four classes.
Do you have any ONNX test code with which you were able to run inference successfully?
The figure below shows the ONNX file you provided. So output2 and output3 are 5D.
Only output1 was in 3D. Is there a problem?
What I don't understand is whether it is a problem with my conversion tool or with the structure of ONNX.
The 3D output1 is okay, but the 5D output2 and output3 do not align with the original yolov5.tflite. For example, my output2 is [1,16,16,3,9], while the corresponding yolov5 output is [1,256,3,9]. The [16,16] and [32,32] splits in my tflite appear multiplied out as [256] and [1024] in yolov5.tflite.
lite.tflite yolov5.tflite
I am just thinking that if there is a way to convert the 5D outputs to 4D, the detection might work fine.
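For what it's worth, the 5D and 4D layouts described above differ only by whether the spatial grid is flattened: 16·16 = 256 and 32·32 = 1024. A small sketch with dummy tensors:

```python
import numpy as np

# [1,16,16,3,9] vs. [1,256,3,9]: the grid axes merge into one.
out2 = np.zeros((1, 16, 16, 3, 9), dtype=np.float32)
assert out2.reshape(1, 16 * 16, 3, 9).shape == (1, 256, 3, 9)

# [1,32,32,3,9] vs. [1,1024,3,9]: same flattening, 32*32 = 1024.
out3 = np.zeros((1, 32, 32, 3, 9), dtype=np.float32)
assert out3.reshape(1, -1, 3, 9).shape == (1, 1024, 3, 9)
```

Whether flattening alone is sufficient depends on the element ordering the post-processing expects, so this is only a shape-level observation.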
I'll say it again. The .onnx that I extracted by unzipping the zip file you attached in your first comment is already 5D. You keep pointing to the .tflite file, but it's not the .tflite that's the problem. The problem is with your model.
Check the ONNX file first.
Alright... is there a way to solve this problem? The tested ONNX works fine compared to the tflite...
First, I have not seen the structure of the YOLOv5-Lite model. So I do not know if any final structure is correct. However, you said this early in the issue.
Yes, but is modified
However, somewhere in the comments on this issue, it doesn't say how you modified the model. It is obvious that you are already in 5D when you export from PyTorch to ONNX, and all I can say at this point is that you made some mistake when you modified the PyTorch model.
Your lite.onnx
If 4D is correct, please provide the 4D formatted ONNX first, because I don't know the correct structure after Transpose. Note that a discussion of how to correctly export YOLOv5-Lite models from PyTorch to ONNX is beyond the scope of this repository.
Above is the yolov5 ONNX, also in 5D.
I converted the .pt to ONNX using the yolov5 repo; that is the ONNX I sent you. I followed all your instructions up to the final tflite stage.
The converted model has a 3D output. What could be the problem?
Yes, correct. The conversion was successful. I am also confused about the tested detection results.
I assure you. It's not a problem with my tools.
The ONNX that you generated from PyTorch has three outputs. Does the official model also have three outputs? If the output of the official model is one instead of three, then there is some mistake at the time you first generate ONNX.
lite.onnx
Yes, correct.
I'm afraid that's probably not correct.
Shouldn't there be three input layers feeding the [1,N,9] output in the first place? Your ONNX lite.onnx has only two.
First, please consult the experts in the yolov5-lite repository. Then, when you are able to generate ONNX with the correct structure, please come back to this repository. https://github.com/ppogg/YOLOv5-Lite
Above is the model generated from a customized dataset using the original YOLOv5-Lite. I actually used two detection scales instead of the three used by the original YOLOv5.
lite4D.zip @PINTO0309 I have succeeded in converting the 5D ONNX to 4D, as attached above. Now I want to convert it to tflite, but I am still facing some errors. Can you please help me out?
I apologize for the inconvenience, but can you issue a separate issue so that the various discussions don't get mixed up?
Issue Type
Support, Others
OS
Windows
OS architecture
x86_64
Programming Language
Python
Framework
PyTorch
Download URL for ONNX / OpenVINO IR
model.zip
Convert Script
Description
@PINTO0309 Thanks for your good work. I got the error output log when trying to convert from .xml to saved_model, pb, and tflite. I also tried using replace.json for layer_id 325, but to no avail. How do I solve this problem?
Relevant Log Output
Source code for simple inference testing code
model.zip