Closed: larrywal-express closed this issue 2 years ago
I'll start investigating. First, looking only at the structure of the model, a question: is this YOLOv5-Lite? https://github.com/PINTO0309/PINTO_model_zoo/tree/main/180_YOLOv5-Lite
@PINTO0309 Yes, but it is modified.
Thanks for your help. It would be better if you could include the architecture of the model if possible, so that other engineers can easily find the issue when they search for it.
Either way, I'll get to work. Please wait a moment.
@PINTO0309 Okay. Thanks
Since all Gather slice positions were being processed as NCHW, resulting in an error, I created a JSON that adjusts the axis values assuming NHWC. This requires analyzing what is displayed in the error message to see from which layer the slice is misaligned.
The other point is that the transformation shapes of the last 5D Reshape and 5D Transpose are broken and need to be adjusted.
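To make the NCHW → NHWC axis adjustment concrete, here is a minimal sketch (my own illustration, not part of the tool): the `remap_axis` helper and the mapping table are hypothetical, but they express the same remapping that the `replace.json` below applies by hand to each Gather/slice axis.

```python
# Hypothetical helper illustrating the NCHW -> NHWC axis remapping that the
# replace.json below applies by hand to Gather/slice axes.
# NCHW axes: 0=N, 1=C, 2=H, 3=W ; NHWC axes: 0=N, 1=H, 2=W, 3=C
NCHW_TO_NHWC = {0: 0, 1: 3, 2: 1, 3: 2}

def remap_axis(nchw_axis: int) -> int:
    """Return the NHWC axis corresponding to an NCHW axis."""
    return NCHW_TO_NHWC[nchw_axis]

# e.g. a Gather over the channel axis (1 in NCHW) must use axis 3 in NHWC
assert remap_axis(1) == 3
```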
replace.json
```json
{
  "format_version": 2,
  "layers": [
    {"layer_id": "316", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "320", "type": "Const", "replace_mode": "direct", "values": [3]},
    {"layer_id": "321", "type": "Const", "replace_mode": "direct", "values": [0]},
    {"layer_id": "322", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "377", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "383", "type": "Const", "replace_mode": "direct", "values": [1]},
    {"layer_id": "384", "type": "Const", "replace_mode": "direct", "values": [0]},
    {"layer_id": "385", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "389", "type": "Const", "replace_mode": "direct", "values": [2]},
    {"layer_id": "390", "type": "Const", "replace_mode": "direct", "values": [0]},
    {"layer_id": "391", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "394", "type": "Const", "replace_mode": "direct", "values": [1, 16, 16, 3, 9]},
    {"layer_id": "396", "type": "Const", "replace_mode": "direct", "values": [0, 3, 1, 2, 4]},
    {"layer_id": "432", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "438", "type": "Const", "replace_mode": "direct", "values": [1]},
    {"layer_id": "439", "type": "Const", "replace_mode": "direct", "values": [0]},
    {"layer_id": "440", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "444", "type": "Const", "replace_mode": "direct", "values": [2]},
    {"layer_id": "445", "type": "Const", "replace_mode": "direct", "values": [0]},
    {"layer_id": "446", "type": "Squeeze", "replace_mode": "insert_after", "values": [0]},
    {"layer_id": "449", "type": "Const", "replace_mode": "direct", "values": [1, 32, 32, 3, 9]},
    {"layer_id": "451", "type": "Const", "replace_mode": "direct", "values": [0, 3, 1, 2, 4]}
  ]
}
```
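As a side note, before running the conversion it can be convenient to sanity-check a `replace.json` locally. The sketch below is a hypothetical check of my own (the tool itself defines the real schema); it only validates the fields used in the config above.

```python
import json

# Hypothetical sanity check for a replace.json before passing it to
# openvino2tensorflow via --weight_replacement_config. It only validates
# the fields used above; the tool itself defines the real schema.
def check_replace_config(path: str) -> None:
    with open(path) as f:
        cfg = json.load(f)
    assert cfg["format_version"] == 2
    for layer in cfg["layers"]:
        assert layer["replace_mode"] in ("direct", "insert_after")
        assert isinstance(layer["values"], list)
        int(layer["layer_id"])  # layer ids are numeric strings
```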
Duplicate Issues https://github.com/PINTO0309/openvino2tensorflow/issues/86 https://github.com/PINTO0309/openvino2tensorflow/issues/82 https://github.com/PINTO0309/openvino2tensorflow/issues/77
tflite model_float32.tflite.zip
I have not checked that the model works correctly at all. Please check for yourself and if there are any problems with the structure you can look into it yourself.
@PINTO0309 Wonderful. Thanks for your support. I noted that shape="" of layer ids 375, 376, 430, 431, 455, 480 was not replaced. Also, how do I determine whether the values should be 0, 1, 2, etc., with reference to the JSON below? And can you recommend software to view the .bin file?
```json
{"layer_id": "320", "type": "Const", "replace_mode": "direct", "values": [3]},
{"layer_id": "321", "type": "Const", "replace_mode": "direct", "values": [0]}
```
I did not understand what you were expecting. All layers between Convolution and Transpose are garbage. In other words, ShapeOf, Gather, Concat, and Unsqueeze were all determined to be unnecessary by the optimizer. In my experience, layers that are used only for shape estimation degrade performance during inference, so the behavior of these optimizers makes sense.
"320" and "321" have also been deleted for the same reason. ShapeOf
, Gather
, Concat
, and Unsqueeze
are not needed during inference.
These layers are only needed if the input image is undefined in terms of height and width, and are essentially unnecessary if the input geometry is statically determined.
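To illustrate why those layers become removable (my own sketch, not output of the tool, with example shapes assumed): with a static input, the ShapeOf → Gather → Concat → Unsqueeze chain that computes a Reshape target at runtime folds into a plain constant.

```python
import numpy as np

# Illustration: with a static input shape, the runtime shape computation
# (ShapeOf -> Gather -> Concat -> Unsqueeze) collapses to a constant,
# which is why the optimizer can delete those layers.
x = np.ones((1, 16, 16, 27), dtype=np.float32)  # assumed example shape

# Dynamic variant: derive the target shape from the tensor at runtime.
n, h, w, c = x.shape
dynamic_target = (n, h, w, 3, c // 3)

# Static variant: the same numbers, known ahead of time.
static_target = (1, 16, 16, 3, 9)

assert dynamic_target == static_target
assert x.reshape(static_target).shape == (1, 16, 16, 3, 9)
```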
Thanks. I noticed that the tested ONNX output differs from the tflite output, as shown below. For this reason, the tflite bounding boxes were inaccurate, with wrong detections.
ONNX output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 15.64478874206543ms
shape: (1, 3840, 9)
[array([[[5.2931595e+00, 9.6953697e+00, 2.4912457e+01, ...,
1.5526780e-01, 2.2650501e-01, 4.0879369e-01],
[2.0134949e+01, 6.2680893e+00, 3.3342621e+01, ...,
2.1189997e-01, 2.2379622e-01, 4.7713119e-01],
[3.8248707e+01, 1.7845745e+00, 3.6460529e+01, ...,
1.0526398e-01, 1.7205426e-01, 8.1319189e-01],
...,
[4.3373840e+02, 4.8147830e+02, 1.3926814e+02, ...,
1.4624476e-02, 1.3875341e-01, 2.2544542e-01],
[4.6488284e+02, 4.8298419e+02, 1.3351683e+02, ...,
1.9375145e-02, 2.0011839e-01, 2.2959533e-01],
[4.8577332e+02, 4.8264972e+02, 1.1711755e+02, ...,
2.0683646e-02, 2.8468835e-01, 3.3594516e-01]]], dtype=float32)]
tflite output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 141.2348747253418ms
shape: (1, 3840, 9)
array([[[ 3.55738926e+00, 3.09487915e+00, 1.86278858e+01, ...,
7.89284706e-03, 9.33742523e-03, 9.98364687e-01],
[ 1.97514572e+01, 2.04518318e+00, 2.98684349e+01, ...,
1.28795207e-02, 6.61897659e-03, 9.98675346e-01],
[ 3.79077911e+01, -1.81230640e+00, 3.11474991e+01, ...,
1.60465279e-05, 8.62568617e-04, 9.99998212e-01],
...,
[ 4.29759338e+02, 4.83508667e+02, 1.08239449e+02, ...,
1.31325126e-02, 4.26258028e-01, 3.09236467e-01],
[ 4.56263580e+02, 4.84719513e+02, 1.05472176e+02, ...,
1.98948979e-02, 5.13177693e-01, 4.79941279e-01],
[ 4.89616119e+02, 4.88561218e+02, 1.09522301e+02, ...,
6.32282197e-02, 2.74803936e-01, 6.55043006e-01]]],
dtype=float32)
```bash
docker run --gpus all -it --rm \
  -v `pwd`:/home/user/workdir \
  ghcr.io/pinto0309/openvino2tensorflow:latest

python3 -m onnxsim lite.onnx lite.onnx

$INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/mo.py \
  --input_model lite.onnx \
  --data_type FP32
```
```json
{
  "format_version": 2,
  "layers": [
    {"layer_id": "358", "type": "Const", "replace_mode": "direct", "values": [0, 3, 1, 2, 4]},
    {"layer_id": "393", "type": "Const", "replace_mode": "direct", "values": [0, 3, 1, 2, 4]}
  ]
}
```
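The `values` in these Const entries are a transpose permutation, applied the same way as `np.transpose`. As an example (the [1,16,16,3,9] input shape here is an assumption for illustration, matching the Reshape target used earlier):

```python
import numpy as np

# The permutation [0, 3, 1, 2, 4] moves the anchor axis forward,
# exactly as np.transpose would with the same axis order.
x = np.zeros((1, 16, 16, 3, 9), dtype=np.float32)  # assumed example shape
y = x.transpose(0, 3, 1, 2, 4)
assert y.shape == (1, 3, 16, 16, 9)
```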
```bash
openvino2tensorflow \
  --model_path lite.xml \
  --output_saved_model \
  --output_pb \
  --output_no_quant_float32_tflite \
  --weight_replacement_config replace.json
```
```python
import onnxruntime
import tensorflow as tf
import time
import numpy as np
from pprint import pprint

H = 512
W = 512
MODEL = 'model_float32'

############################################################
onnx_session = onnxruntime.InferenceSession('lite.onnx')
input_name = onnx_session.get_inputs()[0].name
output_name = onnx_session.get_outputs()[0].name

roop = 1
e = 0.0
result = None
inp = np.ones((1, 3, H, W), dtype=np.float32)
for _ in range(roop):
    s = time.time()
    result = onnx_session.run(
        [output_name],
        {input_name: inp}
    )
    e += (time.time() - s)
print('ONNX output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@')
print(f'elapsed time: {e/roop*1000}ms')
print(f'shape: {result[0].shape}')
pprint(result)

############################################################
interpreter = tf.lite.Interpreter(model_path=f'{MODEL}.tflite', num_threads=4)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

roop = 1
e = 0.0
result = None
inp = np.ones((1, H, W, 3), dtype=np.float32)
for _ in range(roop):
    s = time.time()
    interpreter.set_tensor(input_details[0]['index'], inp)
    interpreter.invoke()
    result = interpreter.get_tensor(output_details[1]['index'])
    e += (time.time() - s)
print('tflite output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@')
print(f'elapsed time: {e/roop*1000}ms')
print(f'shape: {result.shape}')
pprint(result)
```
```bash
python3 onnx_tflite_test.py
```
ONNX output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 6.359338760375977ms
shape: (1, 3840, 9)
[array([[[5.2931595e+00, 9.6953697e+00, 2.4912457e+01, ...,
1.5526780e-01, 2.2650501e-01, 4.0879369e-01],
[2.0134949e+01, 6.2680893e+00, 3.3342621e+01, ...,
2.1189997e-01, 2.2379622e-01, 4.7713119e-01],
[3.8248707e+01, 1.7845745e+00, 3.6460529e+01, ...,
1.0526398e-01, 1.7205426e-01, 8.1319189e-01],
...,
[4.3373840e+02, 4.8147830e+02, 1.3926814e+02, ...,
1.4624476e-02, 1.3875341e-01, 2.2544542e-01],
[4.6488284e+02, 4.8298419e+02, 1.3351683e+02, ...,
1.9375145e-02, 2.0011839e-01, 2.2959533e-01],
[4.8577332e+02, 4.8264972e+02, 1.1711755e+02, ...,
2.0683646e-02, 2.8468835e-01, 3.3594516e-01]]], dtype=float32)]
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
tflite output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 23.181676864624023ms
shape: (1, 3840, 9)
array([[[5.29315758e+00, 9.69537735e+00, 2.49124527e+01, ...,
1.55268013e-01, 2.26505280e-01, 4.08792853e-01],
[2.01349468e+01, 6.26809692e+00, 3.33426208e+01, ...,
2.11900204e-01, 2.23796397e-01, 4.77130651e-01],
[3.82487106e+01, 1.78458500e+00, 3.64605331e+01, ...,
1.05264165e-01, 1.72054380e-01, 8.13191533e-01],
...,
[4.33738434e+02, 4.81478302e+02, 1.39268143e+02, ...,
1.46242362e-02, 1.38754994e-01, 2.25445867e-01],
[4.64882843e+02, 4.82984192e+02, 1.33516907e+02, ...,
1.93749964e-02, 2.00120524e-01, 2.29595125e-01],
[4.85773315e+02, 4.82649719e+02, 1.17117645e+02, ...,
2.06834618e-02, 2.84691095e-01, 3.35944831e-01]]], dtype=float32)
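Instead of eyeballing the printed arrays, the two runs can also be compared numerically. A minimal sketch (the `onnx_result` / `tflite_result` arrays below are hypothetical stand-ins for the outputs of the test script, truncated to a few values from the printouts above):

```python
import numpy as np

# Stand-ins for the first few values of the ONNX and tflite outputs.
onnx_result = np.array([5.2931595, 9.6953697, 24.912457], dtype=np.float32)
tflite_result = np.array([5.29315758, 9.69537735, 24.9124527], dtype=np.float32)

# Small float32 discrepancies are expected after conversion; a tolerance
# around 1e-4 is a reasonable bar for "matching" here.
assert np.allclose(onnx_result, tflite_result, atol=1e-4)
```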
@PINTO0309 The conversion succeeded and the outputs now match, but the detection result obtained from detection.py is still wrong, similar to my earlier image post.
Please attach the code you use for inference testing; otherwise this exchange will be inefficient.
https://github.com/zldrobit/yolov5/blob/tf-android/detect.py
I am testing with the script above. I am also trying to run it on Android.
Can you provide me with one still image for testing? I have no good way of knowing what your model is inferring. And is the model you used Float32? Is it INT8? Float16? Please describe the information in detail.
I used Float32.
The model was trained on four classes.
Do you have any ONNX test code with which you were able to run inference successfully?
The figure below shows the ONNX file you provided. So output2 and output3 are 5D.
Only output1 was in 3D. Is there a problem?
What I don't understand is whether it is a problem with my conversion tool or with the structure of ONNX.
The 3D output1 is okay, but the 5D output2 and output3 do not align with the original yolov5.tflite. For example, my output2 is [1,16,16,3,9], while the corresponding yolov5 output is [1,256,3,9]. The [16,16] and [32,32] splits in my tflite appear multiplied out as [256] and [1024] in yolov5.tflite.
lite.tflite yolov5.tflite
I am just thinking that if there is a way to convert the 5D outputs to 4D, the detection might work fine.
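For what it's worth, the 5D and 4D layouts described above differ only by whether the spatial grid is flattened: 16·16 = 256 and 32·32 = 1024. A small sketch with dummy tensors:

```python
import numpy as np

# [1,16,16,3,9] vs. [1,256,3,9]: the grid axes merge into one.
out2 = np.zeros((1, 16, 16, 3, 9), dtype=np.float32)
assert out2.reshape(1, 16 * 16, 3, 9).shape == (1, 256, 3, 9)

# [1,32,32,3,9] vs. [1,1024,3,9]: same flattening, 32*32 = 1024.
out3 = np.zeros((1, 32, 32, 3, 9), dtype=np.float32)
assert out3.reshape(1, -1, 3, 9).shape == (1, 1024, 3, 9)
```

Whether flattening alone is sufficient depends on the element ordering the post-processing expects, so this is only a shape-level observation.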
I'll say it again. The .onnx that I extracted by unzipping the zip file you attached in your first comment is already 5D. You keep pointing to the .tflite file, but it's not the .tflite that's the problem. The problem is with your model.
Check the ONNX file first.
Alright... is there a way to solve this problem? The tested ONNX works fine compared to the tflite...
First, I have not seen the structure of the YOLOv5-Lite model. So I do not know if any final structure is correct. However, you said this early in the issue.
Yes, but is modified
However, somewhere in the comments on this issue, it doesn't say how you modified the model. It is obvious that you are already in 5D when you export from PyTorch to ONNX, and all I can say at this point is that you made some mistake when you modified the PyTorch model.
Your lite.onnx
If 4D is correct, please provide the 4D formatted ONNX first, because I don't know the correct structure after Transpose. Note that a discussion of how to correctly export YOLOv5-Lite models from PyTorch to ONNX is beyond the scope of this repository.
Above is the yolov5 ONNX, also in 5D.
I converted the .pt to ONNX using the yolov5 repo; that is the ONNX I sent you. I followed all your instructions up to the final tflite stage.
The converted model has a 3D output. What could be the problem?
Yes, correct. The conversion was successful. I am also confused about the tested detection results.
I assure you. It's not a problem with my tools.
The ONNX that you generated from PyTorch has three outputs. Does the official model also have three outputs? If the output of the official model is one instead of three, then there is some mistake at the time you first generate ONNX.
lite.onnx
Yes, correct.
I'm afraid that's probably not correct.
Shouldn't there be three input layers feeding the [1,N,9] output in the first place? Your ONNX lite.onnx has only two.
First, please consult the experts in the yolov5-lite repository. Then, when you are able to generate ONNX with the correct structure, please come back to this repository. https://github.com/ppogg/YOLOv5-Lite
Above is the model generated from a customized dataset using the original YOLOv5-Lite. I actually used two detection scales instead of the three used by the original YOLOv5.
lite4D.zip @PINTO0309 I have succeeded in converting the 5D ONNX to 4D, as attached above. Now I want to convert it to tflite, but I am still facing some errors. Can you please help me out?
I apologize for the inconvenience, but can you issue a separate issue so that the various discussions don't get mixed up?
Issue Type
Support, Others
OS
Windows
OS architecture
x86_64
Programming Language
Python
Framework
PyTorch
Download URL for ONNX / OpenVINO IR
model.zip
Convert Script
Description
@PINTO0309 Thanks for your good work. I got the error output log when trying to convert from .xml to saved_model, pb, and tflite. I also tried using replace.json for layer_id 325, but to no avail. How do I solve this problem?
Relevant Log Output
Source code for simple inference testing code
model.zip