PINTO0309 / PINTO_model_zoo

A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8), EdgeTPU, CoreML.
https://qiita.com/PINTO
MIT License

How to increase batch size when using post_process_gen_tools #413

Closed: ozayr closed this issue 3 weeks ago

ozayr commented 3 weeks ago

Issue Type

Support

OS

Mac OS

OS architecture

aarch64

Programming Language

Other

Framework

ONNX

Model name and Weights/Checkpoints URL

https://github.com/PINTO0309/PINTO_model_zoo/tree/main/449_YOLOX-WholeBody12

Description

Hi, thank you for your work.

When using post_process_gen_tools, how do I increase the batch size parameter to create a model for batch processing?

  1. I place one of the downloaded models, one whose file name does not contain "post", in the tool's folder.
  2. I then edit the batch parameter (BATCHES), e.g. to 30.
  3. I run convert_script.sh.

The model is generated, but its output shape is not changed, even though the log shows the new batch size correctly:


 Layer (type)                Output Shape                 Param #   Connected to                  
==================================================================================================
 input_1 (InputLayer)        [(30, 12, 6300)]             0         []                            

 input_2 (InputLayer)        [(None, 3)]                  0         []                            

 tf.compat.v1.gather_nd (TF  (None,)                      0         ['input_1[0][0]',             
 OpLambda)                                                           'input_2[0][0]']             

 tf.__operators__.getitem (  (None, 1)                    0         ['tf.compat.v1.gather_nd[0][0]
 SlicingOpLambda)                                                   ']                            

==================================================================================================

However, when I run the model and check its input shape, it still shows [1,3,480,640] rather than [30,3,480,640].
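One way to narrow this down is to print the declared inputs of each ONNX file involved in the merge. The sketch below is not part of the repository's tooling: `print_onnx_inputs` is a hypothetical helper name, and it assumes `python3` with the `onnx` package is installed.

```bash
#!/bin/bash
# Hypothetical helper: print the name and shape of every graph input
# of an ONNX file. Assumes python3 and the `onnx` package are available.
print_onnx_inputs() {
    local model_path="$1"
    # Fail early if the file does not exist.
    if [ ! -f "$model_path" ]; then
        echo "not found: $model_path" >&2
        return 1
    fi
    # Run an inline Python script; the model path arrives as sys.argv[1].
    python3 - "$model_path" <<'EOF'
import sys
import onnx

model = onnx.load(sys.argv[1])
for inp in model.graph.input:
    # A dimension is either a fixed integer or a symbolic name such as "N".
    dims = [d.dim_value if d.HasField("dim_value") else d.dim_param
            for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)
EOF
}
```

For example, `print_onnx_inputs yolox_s_wholebody12_0190_30x3x480x640.onnx` inspects the backbone file named in the log. The merged model's image input comes from that file, so a batch dimension of 1 there would likely carry through to the final model regardless of the `BATCHES` setting.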

Relevant Log Output

INFO: MODEL_INDX=1: 02_boxes_scores_6300.onnx, prefix="02"
INFO: MODEL_INDX=2: 03_cxcywh_y1x1y2x2_6300.onnx, prefix="03"
INFO: Finish!
INFO: MODEL_INDX=1: 01_grid_6300.onnx, prefix="None"
INFO: MODEL_INDX=2: 04_boxes_x1y1x2y2_y1x1y2x2_scores_6300.onnx, prefix="None"
INFO: Finish!
INFO: The model is checked!
INFO: The model is checked!
INFO: The model is checked!
INFO: The model is checked!
INFO: MODEL_INDX=1: 05_Constant_max_output_boxes_per_class.onnx, prefix="None"
INFO: MODEL_INDX=2: 08_NonMaxSuppression11.onnx, prefix="None"
INFO: Finish!
INFO: MODEL_INDX=1: 06_Constant_iou_threshold.onnx, prefix="None"
INFO: MODEL_INDX=2: 08_NonMaxSuppression11.onnx, prefix="None"
INFO: Finish!
INFO: MODEL_INDX=1: 07_Constant_score_threshold.onnx, prefix="None"
INFO: MODEL_INDX=2: 08_NonMaxSuppression11.onnx, prefix="None"
INFO: Finish!
INFO: MODEL_INDX=1: 04_boxes_x1y1x2y2_y1x1y2x2_scores_6300.onnx, prefix="None"
INFO: MODEL_INDX=2: 08_NonMaxSuppression11.onnx, prefix="None"
INFO: Finish!
INFO: The model is checked!
INFO: The model is checked!
INFO: MODEL_INDX=1: 11_Constant_workaround_mul.onnx, prefix="None"
INFO: MODEL_INDX=2: 10_Mul11_workaround.onnx, prefix="None"
INFO: Finish!
INFO: MODEL_INDX=1: 09_nms_yolox_6300.onnx, prefix="None"
INFO: MODEL_INDX=2: 11_Constant_workaround_mul.onnx, prefix="None"
INFO: Finish!
Model: "model"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
==================================================================================================
 input_1 (InputLayer)        [(30, 12, 6300)]             0         []                            

 input_2 (InputLayer)        [(None, 3)]                  0         []                            

 tf.compat.v1.gather_nd (TF  (None,)                      0         ['input_1[0][0]',             
 OpLambda)                                                           'input_2[0][0]']             

 tf.__operators__.getitem (  (None, 1)                    0         ['tf.compat.v1.gather_nd[0][0]
 SlicingOpLambda)                                                   ']                            

==================================================================================================
Total params: 0 (0.00 Byte)
Trainable params: 0 (0.00 Byte)
Non-trainable params: 0 (0.00 Byte)
__________________________________________________________________________________________________
2024-06-23 14:04:12.636667: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:378] Ignored output_format.
2024-06-23 14:04:12.636689: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:381] Ignored drop_control_dependency.
2024-06-23 14:04:12.637382: I tensorflow/cc/saved_model/reader.cc:83] Reading SavedModel from: /var/folders/1c/jwsk4s1j0nlfqzycgqdw_3fr0000gn/T/tmph7zot1sx
2024-06-23 14:04:12.637658: I tensorflow/cc/saved_model/reader.cc:51] Reading meta graph with tags { serve }
2024-06-23 14:04:12.637668: I tensorflow/cc/saved_model/reader.cc:146] Reading SavedModel debug info (if present) from: /var/folders/1c/jwsk4s1j0nlfqzycgqdw_3fr0000gn/T/tmph7zot1sx
2024-06-23 14:04:12.638570: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:382] MLIR V1 optimization pass is not enabled
2024-06-23 14:04:12.638918: I tensorflow/cc/saved_model/loader.cc:233] Restoring SavedModel bundle.
2024-06-23 14:04:12.667281: I tensorflow/cc/saved_model/loader.cc:217] Running initialization op on SavedModel bundle at path: /var/folders/1c/jwsk4s1j0nlfqzycgqdw_3fr0000gn/T/tmph7zot1sx
2024-06-23 14:04:12.671103: I tensorflow/cc/saved_model/loader.cc:316] SavedModel load for tags { serve }; Status: success: OK. Took 33722 microseconds.
2024-06-23 14:04:12.759996: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-06-23 14:04:12.943485: I tensorflow/compiler/mlir/lite/flatbuffer_export.cc:2245] Estimated count of arithmetic ops: 0  ops, equivalently 0  MACs
/Users/ghost/mambaforge/envs/pinto/lib/python3.10/runpy.py:126: RuntimeWarning: 'tf2onnx.convert' found in sys.modules after import of package 'tf2onnx', but prior to execution of 'tf2onnx.convert'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
2024-06-23 14:04:16,200 - WARNING - ***IMPORTANT*** Installed protobuf is not cpp accelerated. Conversion will be extremely slow. See https://github.com/onnx/tensorflow-onnx/issues/1557
2024-06-23 14:04:16,201 - INFO - Using tensorflow=2.14.0, onnx=1.16.1, tf2onnx=1.16.1/15c810
2024-06-23 14:04:16,201 - INFO - Using opset <onnx, 11>
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
2024-06-23 14:04:16,223 - INFO - Optimizing ONNX model
2024-06-23 14:04:16,244 - INFO - After optimization: Cast -1 (1->0), Const -2 (6->4), Identity -1 (1->0)
2024-06-23 14:04:16,246 - INFO - 
2024-06-23 14:04:16,246 - INFO - Successfully converted TensorFlow model saved_model_postprocess/nms_score_gather_nd.tflite to ONNX
2024-06-23 14:04:16,246 - INFO - Model inputs: ['serving_default_input_1:0', 'serving_default_input_2:0']
2024-06-23 14:04:16,246 - INFO - Model outputs: ['PartitionedCall:0']
2024-06-23 14:04:16,246 - INFO - ONNX model is saved at 12_nms_score_gather_nd.onnx
INFO: Finish!
INFO: Finish!
INFO: Finish!
INFO: Finish!
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃            ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Constant   │ 4              │ 4                │
│ GatherND   │ 1              │ 1                │
│ Reshape    │ 1              │ 1                │
│ Slice      │ 1              │ 1                │
│ Model Size │ 908.0B         │ 907.0B           │
└────────────┴────────────────┴──────────────────┘
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃            ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Constant   │ 4              │ 4                │
│ GatherND   │ 1              │ 1                │
│ Reshape    │ 1              │ 1                │
│ Slice      │ 1              │ 1                │
│ Model Size │ 907.0B         │ 907.0B           │
└────────────┴────────────────┴──────────────────┘
INFO: MODEL_INDX=1: 09_nms_yolox_6300.onnx, prefix="None"
INFO: MODEL_INDX=2: 12_nms_score_gather_nd.onnx, prefix="None"
INFO: Finish!
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                   ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add               │ 3              │ 3                │
│ Concat            │ 2              │ 2                │
│ Constant          │ 19             │ 19               │
│ Div               │ 2              │ 2                │
│ Exp               │ 1              │ 1                │
│ GatherND          │ 1              │ 1                │
│ Mul               │ 3              │ 3                │
│ NonMaxSuppression │ 1              │ 1                │
│ Reshape           │ 1              │ 1                │
│ ScatterND         │ 2              │ 2                │
│ Slice             │ 10             │ 10               │
│ Sub               │ 2              │ 2                │
│ Transpose         │ 1              │ 1                │
│ Model Size        │ 669.6KiB       │ 669.6KiB         │
└───────────────────┴────────────────┴──────────────────┘
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                   ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add               │ 3              │ 3                │
│ Concat            │ 2              │ 2                │
│ Constant          │ 19             │ 19               │
│ Div               │ 2              │ 2                │
│ Exp               │ 1              │ 1                │
│ GatherND          │ 1              │ 1                │
│ Mul               │ 3              │ 3                │
│ NonMaxSuppression │ 1              │ 1                │
│ Reshape           │ 1              │ 1                │
│ ScatterND         │ 2              │ 2                │
│ Slice             │ 10             │ 10               │
│ Sub               │ 2              │ 2                │
│ Transpose         │ 1              │ 1                │
│ Model Size        │ 669.6KiB       │ 669.6KiB         │
└───────────────────┴────────────────┴──────────────────┘
Model: "model"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
==================================================================================================
 input_1 (InputLayer)        [(1, 5040, 4)]               0         []                            

 input_2 (InputLayer)        [(None, 2)]                  0         []                            

 tf.compat.v1.gather_nd (TF  (None, 4)                    0         ['input_1[0][0]',             
 OpLambda)                                                           'input_2[0][0]']             

 tf.cast (TFOpLambda)        (None, 4)                    0         ['tf.compat.v1.gather_nd[0][0]
                                                                    ']                            

==================================================================================================
Total params: 0 (0.00 Byte)
Trainable params: 0 (0.00 Byte)
Non-trainable params: 0 (0.00 Byte)
__________________________________________________________________________________________________
2024-06-23 14:04:26.038981: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:378] Ignored output_format.
2024-06-23 14:04:26.039002: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:381] Ignored drop_control_dependency.
2024-06-23 14:04:26.039617: I tensorflow/cc/saved_model/reader.cc:83] Reading SavedModel from: /var/folders/1c/jwsk4s1j0nlfqzycgqdw_3fr0000gn/T/tmp10bw6nau
2024-06-23 14:04:26.039865: I tensorflow/cc/saved_model/reader.cc:51] Reading meta graph with tags { serve }
2024-06-23 14:04:26.039873: I tensorflow/cc/saved_model/reader.cc:146] Reading SavedModel debug info (if present) from: /var/folders/1c/jwsk4s1j0nlfqzycgqdw_3fr0000gn/T/tmp10bw6nau
2024-06-23 14:04:26.040368: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:382] MLIR V1 optimization pass is not enabled
2024-06-23 14:04:26.040563: I tensorflow/cc/saved_model/loader.cc:233] Restoring SavedModel bundle.
2024-06-23 14:04:26.045629: I tensorflow/cc/saved_model/loader.cc:217] Running initialization op on SavedModel bundle at path: /var/folders/1c/jwsk4s1j0nlfqzycgqdw_3fr0000gn/T/tmp10bw6nau
2024-06-23 14:04:26.049429: I tensorflow/cc/saved_model/loader.cc:316] SavedModel load for tags { serve }; Status: success: OK. Took 9812 microseconds.
2024-06-23 14:04:26.054724: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-06-23 14:04:26.064253: I tensorflow/compiler/mlir/lite/flatbuffer_export.cc:2245] Estimated count of arithmetic ops: 0  ops, equivalently 0  MACs
/Users/ghost/mambaforge/envs/pinto/lib/python3.10/runpy.py:126: RuntimeWarning: 'tf2onnx.convert' found in sys.modules after import of package 'tf2onnx', but prior to execution of 'tf2onnx.convert'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
2024-06-23 14:04:29,341 - WARNING - ***IMPORTANT*** Installed protobuf is not cpp accelerated. Conversion will be extremely slow. See https://github.com/onnx/tensorflow-onnx/issues/1557
2024-06-23 14:04:29,342 - INFO - Using tensorflow=2.14.0, onnx=1.16.1, tf2onnx=1.16.1/15c810
2024-06-23 14:04:29,342 - INFO - Using opset <onnx, 11>
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
2024-06-23 14:04:29,349 - INFO - Optimizing ONNX model
2024-06-23 14:04:29,356 - INFO - After optimization: Identity -1 (1->0)
2024-06-23 14:04:29,358 - INFO - 
2024-06-23 14:04:29,358 - INFO - Successfully converted TensorFlow model saved_model_postprocess/nms_box_gather_nd.tflite to ONNX
2024-06-23 14:04:29,358 - INFO - Model inputs: ['serving_default_input_1:0', 'serving_default_input_2:0']
2024-06-23 14:04:29,358 - INFO - Model outputs: ['PartitionedCall:0']
2024-06-23 14:04:29,358 - INFO - ONNX model is saved at 14_nms_box_gather_nd.onnx
INFO: Finish!
INFO: Finish!
INFO: Finish!
INFO: Finish!
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃            ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Constant   │ 0              │ 0                │
│ GatherND   │ 1              │ 1                │
│ Model Size │ 270.0B         │ 270.0B           │
└────────────┴────────────────┴──────────────────┘
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃            ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Constant   │ 0              │ 0                │
│ GatherND   │ 1              │ 1                │
│ Model Size │ 270.0B         │ 270.0B           │
└────────────┴────────────────┴──────────────────┘
INFO: MODEL_INDX=1: 09_nms_yolox_6300_nd.onnx, prefix="main01"
INFO: MODEL_INDX=2: 13_nms_final_batch_nums_final_class_nums_final_box_nums.onnx, prefix="sub01"
INFO: Finish!
INFO: MODEL_INDX=1: 15_nms_yolox_6300_split.onnx, prefix="None"
INFO: MODEL_INDX=2: 14_nms_box_gather_nd.onnx, prefix="None"
INFO: Finish!
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                   ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add               │ 3              │ 3                │
│ Cast              │ 2              │ 2                │
│ Concat            │ 2              │ 2                │
│ Constant          │ 20             │ 20               │
│ Div               │ 2              │ 2                │
│ Exp               │ 1              │ 1                │
│ Gather            │ 1              │ 1                │
│ GatherND          │ 2              │ 2                │
│ Mul               │ 3              │ 3                │
│ NonMaxSuppression │ 1              │ 1                │
│ Reshape           │ 1              │ 1                │
│ ScatterND         │ 2              │ 2                │
│ Slice             │ 12             │ 12               │
│ Sub               │ 2              │ 2                │
│ Transpose         │ 1              │ 1                │
│ Model Size        │ 671.9KiB       │ 671.9KiB         │
└───────────────────┴────────────────┴──────────────────┘
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                   ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add               │ 3              │ 3                │
│ Cast              │ 2              │ 2                │
│ Concat            │ 2              │ 2                │
│ Constant          │ 20             │ 20               │
│ Div               │ 2              │ 2                │
│ Exp               │ 1              │ 1                │
│ Gather            │ 1              │ 1                │
│ GatherND          │ 2              │ 2                │
│ Mul               │ 3              │ 3                │
│ NonMaxSuppression │ 1              │ 1                │
│ Reshape           │ 1              │ 1                │
│ ScatterND         │ 2              │ 2                │
│ Slice             │ 12             │ 12               │
│ Sub               │ 2              │ 2                │
│ Transpose         │ 1              │ 1                │
│ Model Size        │ 671.9KiB       │ 671.9KiB         │
└───────────────────┴────────────────┴──────────────────┘
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃            ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Concat     │ 1              │ 1                │
│ Constant   │ 0              │ 0                │
│ Model Size │ 301.0B         │ 301.0B           │
└────────────┴────────────────┴──────────────────┘
INFO: MODEL_INDX=1: 16_nms_yolox_6300_merged.onnx, prefix="None"
INFO: MODEL_INDX=2: 17_nms_batchno_classid_x1y1x2y2_cat.onnx, prefix="None"
INFO: Finish!
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                   ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add               │ 3              │ 3                │
│ Cast              │ 2              │ 2                │
│ Concat            │ 3              │ 3                │
│ Constant          │ 20             │ 20               │
│ Div               │ 2              │ 2                │
│ Exp               │ 1              │ 1                │
│ Gather            │ 1              │ 1                │
│ GatherND          │ 2              │ 2                │
│ Mul               │ 3              │ 3                │
│ NonMaxSuppression │ 1              │ 1                │
│ Reshape           │ 1              │ 1                │
│ ScatterND         │ 2              │ 2                │
│ Slice             │ 12             │ 12               │
│ Sub               │ 2              │ 2                │
│ Transpose         │ 1              │ 1                │
│ Model Size        │ 672.0KiB       │ 672.0KiB         │
└───────────────────┴────────────────┴──────────────────┘
INFO: MODEL_INDX=1: yolox_s_wholebody12_0190_30x3x480x640.onnx, prefix="None"
INFO: MODEL_INDX=2: 18_nms_yolox_6300.onnx, prefix="None"
INFO: Finish!
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                   ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add               │ 10             │ 10               │
│ Cast              │ 2              │ 2                │
│ Concat            │ 21             │ 21               │
│ Constant          │ 194            │ 194              │
│ Conv              │ 83             │ 83               │
│ Div               │ 2              │ 2                │
│ Exp               │ 1              │ 1                │
│ Gather            │ 1              │ 1                │
│ GatherND          │ 2              │ 2                │
│ MaxPool           │ 3              │ 3                │
│ Mul               │ 77             │ 77               │
│ NonMaxSuppression │ 1              │ 1                │
│ Reshape           │ 4              │ 4                │
│ Resize            │ 2              │ 2                │
│ ScatterND         │ 2              │ 2                │
│ Sigmoid           │ 80             │ 80               │
│ Slice             │ 16             │ 16               │
│ Sub               │ 2              │ 2                │
│ Transpose         │ 2              │ 2                │
│ Model Size        │ 34.8MiB        │ 34.8MiB          │
└───────────────────┴────────────────┴──────────────────┘
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                   ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add               │ 10             │ 10               │
│ Cast              │ 2              │ 2                │
│ Concat            │ 21             │ 21               │
│ Constant          │ 194            │ 194              │
│ Conv              │ 83             │ 83               │
│ Div               │ 2              │ 2                │
│ Exp               │ 1              │ 1                │
│ Gather            │ 1              │ 1                │
│ GatherND          │ 2              │ 2                │
│ MaxPool           │ 3              │ 3                │
│ Mul               │ 77             │ 77               │
│ NonMaxSuppression │ 1              │ 1                │
│ Reshape           │ 4              │ 4                │
│ Resize            │ 2              │ 2                │
│ ScatterND         │ 2              │ 2                │
│ Sigmoid           │ 80             │ 80               │
│ Slice             │ 16             │ 16               │
│ Sub               │ 2              │ 2                │
│ Transpose         │ 2              │ 2                │
│ Model Size        │ 34.8MiB        │ 34.8MiB          │
└───────────────────┴────────────────┴──────────────────┘

URL or source code for simple inference testing code

The script

I changed

LOWEROP=${OP,,}
to
LOWEROP=$(echo "$OP" | tr '[:upper:]' '[:lower:]')

because bash rejected the ${OP,,} expansion.
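The `${OP,,}` case-modification expansion was introduced in Bash 4, and the `/bin/bash` shipped with macOS is version 3.2, which would explain the complaint. A minimal sketch of the same substitution wrapped in a reusable function (`to_lower` is a hypothetical name, not something the repository defines):

```bash
#!/bin/bash
# Lowercase a string without relying on the Bash 4+ ${var,,} expansion.
to_lower() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

OP=NonMaxSuppression
LOWEROP=$(to_lower "$OP")
echo "$LOWEROP"   # prints: nonmaxsuppression
```

Using `printf '%s'` instead of `echo` avoids surprises if the value ever starts with a dash or contains backslashes.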

#!/bin/bash

#pip install -U pip && pip install onnxsim && pip install -U simple-onnx-processing-tools && pip install -U onnx && python3 -m pip install -U onnx_graphsurgeon --index-url https://pypi.ngc.nvidia.com  && pip install tensorflow==2.14.0

MODEL_NAME=yolox_s_wholebody12_0190
SUFFIX="30x3x"

OPSET=11
BATCHES=30
CLASSES=12

RESOLUTIONS=(

    "480 640 6300"

)

for ((i=0; i<${#RESOLUTIONS[@]}; i++))
do
    # Split the "H W BOXES" entry into its three fields
    read -r -a RESOLUTION <<< "${RESOLUTIONS[i]}"
    H=${RESOLUTION[0]}
    W=${RESOLUTION[1]}
    BOXES=${RESOLUTION[2]}

    ################################################### Grids
    python make_grids.py -o ${OPSET} -x ${BOXES} -c ${CLASSES} -ih ${H} -iw ${W}

    ################################################### Boxes + Scores
    python make_boxes_scores.py -o ${OPSET} -b ${BATCHES} -x ${BOXES} -c ${CLASSES}
    python make_cxcywh_y1x1y2x2.py -o ${OPSET} -b ${BATCHES} -x ${BOXES}

    snc4onnx \
    --input_onnx_file_paths 02_boxes_scores_${BOXES}.onnx 03_cxcywh_y1x1y2x2_${BOXES}.onnx \
    --srcop_destop boxes_cxcywh cxcywh \
    --op_prefixes_after_merging 02 03 \
    --output_onnx_file_path 04_boxes_x1y1x2y2_y1x1y2x2_scores_${BOXES}.onnx

    snc4onnx \
    --input_onnx_file_paths 01_grid_${BOXES}.onnx 04_boxes_x1y1x2y2_y1x1y2x2_scores_${BOXES}.onnx \
    --srcop_destop grid_output boxes_scores_input \
    --output_onnx_file_path 04_boxes_x1y1x2y2_y1x1y2x2_scores_${BOXES}.onnx

    ################################################### NonMaxSuppression
    sog4onnx \
    --op_type Constant \
    --opset ${OPSET} \
    --op_name max_output_boxes_per_class_const \
    --output_variables max_output_boxes_per_class int64 [1] \
    --attributes value int64 [20] \
    --output_onnx_file_path 05_Constant_max_output_boxes_per_class.onnx

    sog4onnx \
    --op_type Constant \
    --opset ${OPSET} \
    --op_name iou_threshold_const \
    --output_variables iou_threshold float32 [1] \
    --attributes value float32 [0.40] \
    --output_onnx_file_path 06_Constant_iou_threshold.onnx

    sog4onnx \
    --op_type Constant \
    --opset ${OPSET} \
    --op_name score_threshold_const \
    --output_variables score_threshold float32 [1] \
    --attributes value float32 [0.25] \
    --output_onnx_file_path 07_Constant_score_threshold.onnx

    OP=NonMaxSuppression
    LOWEROP=$(echo "$OP" | tr '[:upper:]' '[:lower:]')
    sog4onnx \
    --op_type ${OP} \
    --opset ${OPSET} \
    --op_name ${LOWEROP}${OPSET} \
    --input_variables boxes_var float32 [${BATCHES},${BOXES},4] \
    --input_variables scores_var float32 [${BATCHES},${CLASSES},${BOXES}] \
    --input_variables max_output_boxes_per_class_var int64 [1] \
    --input_variables iou_threshold_var float32 [1] \
    --input_variables score_threshold_var float32 [1] \
    --output_variables selected_indices int64 [\'N\',3] \
    --attributes center_point_box int64 0 \
    --output_onnx_file_path 08_${OP}${OPSET}.onnx

    snc4onnx \
    --input_onnx_file_paths 05_Constant_max_output_boxes_per_class.onnx 08_${OP}${OPSET}.onnx \
    --srcop_destop max_output_boxes_per_class max_output_boxes_per_class_var \
    --output_onnx_file_path 08_${OP}${OPSET}.onnx

    snc4onnx \
    --input_onnx_file_paths 06_Constant_iou_threshold.onnx 08_${OP}${OPSET}.onnx \
    --srcop_destop iou_threshold iou_threshold_var \
    --output_onnx_file_path 08_${OP}${OPSET}.onnx

    snc4onnx \
    --input_onnx_file_paths 07_Constant_score_threshold.onnx 08_${OP}${OPSET}.onnx \
    --srcop_destop score_threshold score_threshold_var \
    --output_onnx_file_path 08_${OP}${OPSET}.onnx

    # soc4onnx \
    # --input_onnx_file_path 08_${OP}${OPSET}.onnx \
    # --output_onnx_file_path 08_${OP}${OPSET}.onnx \
    # --opset ${OPSET}

    ################################################### Boxes + Scores + NonMaxSuppression
    snc4onnx \
    --input_onnx_file_paths 04_boxes_x1y1x2y2_y1x1y2x2_scores_${BOXES}.onnx 08_${OP}${OPSET}.onnx \
    --srcop_destop scores scores_var y1x1y2x2 boxes_var \
    --output_onnx_file_path 09_nms_yolox_${BOXES}.onnx

    ################################################### Myriad workaround Mul
    OP=Mul
    LOWEROP=$(echo "$OP" | tr '[:upper:]' '[:lower:]')
    sog4onnx \
    --op_type ${OP} \
    --opset ${OPSET} \
    --op_name ${LOWEROP}${OPSET} \
    --input_variables workaround_mul_a int64 [\'N\',3] \
    --input_variables workaround_mul_b int64 [1] \
    --output_variables workaround_mul_out int64 [\'N\',3] \
    --output_onnx_file_path 10_${OP}${OPSET}_workaround.onnx

    ############ Myriad workaround Constant
    sog4onnx \
    --op_type Constant \
    --opset ${OPSET} \
    --op_name workaround_mul_const_op \
    --output_variables workaround_mul_const int64 [1] \
    --attributes value int64 [1] \
    --output_onnx_file_path 11_Constant_workaround_mul.onnx

    ############ Myriad workaround Mul + Myriad workaround Constant
    snc4onnx \
    --input_onnx_file_paths 11_Constant_workaround_mul.onnx 10_${OP}${OPSET}_workaround.onnx \
    --srcop_destop workaround_mul_const workaround_mul_b \
    --output_onnx_file_path 11_Constant_workaround_mul.onnx

    ################################################### NonMaxSuppression + Myriad workaround Mul
    snc4onnx \
    --input_onnx_file_paths 09_nms_yolox_${BOXES}.onnx 11_Constant_workaround_mul.onnx \
    --srcop_destop selected_indices workaround_mul_a \
    --output_onnx_file_path 09_nms_yolox_${BOXES}.onnx

    ################################################### Score GatherND
    python make_score_gather_nd.py -b ${BATCHES} -x ${BOXES} -c ${CLASSES}

    python -m tf2onnx.convert \
    --opset ${OPSET} \
    --tflite saved_model_postprocess/nms_score_gather_nd.tflite \
    --output 12_nms_score_gather_nd.onnx

    sor4onnx \
    --input_onnx_file_path 12_nms_score_gather_nd.onnx \
    --old_new ":0" "" \
    --search_mode "suffix_match" \
    --output_onnx_file_path 12_nms_score_gather_nd.onnx

    sor4onnx \
    --input_onnx_file_path 12_nms_score_gather_nd.onnx \
    --old_new "serving_default_input_1" "gn_scores" \
    --output_onnx_file_path 12_nms_score_gather_nd.onnx \
    --mode inputs

    sor4onnx \
    --input_onnx_file_path 12_nms_score_gather_nd.onnx \
    --old_new "serving_default_input_2" "gn_selected_indices" \
    --output_onnx_file_path 12_nms_score_gather_nd.onnx \
    --mode inputs

    sor4onnx \
    --input_onnx_file_path 12_nms_score_gather_nd.onnx \
    --old_new "PartitionedCall" "final_scores" \
    --output_onnx_file_path 12_nms_score_gather_nd.onnx \
    --mode outputs

    python make_input_output_shape_update.py \
    --input_onnx_file_path 12_nms_score_gather_nd.onnx \
    --output_onnx_file_path 12_nms_score_gather_nd.onnx \
    --input_names gn_scores \
    --input_names gn_selected_indices \
    --input_shapes ${BATCHES} ${CLASSES} ${BOXES} \
    --input_shapes N 3 \
    --output_names final_scores \
    --output_shapes N 1

    onnxsim 12_nms_score_gather_nd.onnx 12_nms_score_gather_nd.onnx
    onnxsim 12_nms_score_gather_nd.onnx 12_nms_score_gather_nd.onnx

    ################################################### NonMaxSuppression + Score GatherND
    snc4onnx \
    --input_onnx_file_paths 09_nms_yolox_${BOXES}.onnx 12_nms_score_gather_nd.onnx \
    --srcop_destop scores gn_scores workaround_mul_out gn_selected_indices \
    --output_onnx_file_path 09_nms_yolox_${BOXES}_nd.onnx

    onnxsim 09_nms_yolox_${BOXES}_nd.onnx 09_nms_yolox_${BOXES}_nd.onnx
    onnxsim 09_nms_yolox_${BOXES}_nd.onnx 09_nms_yolox_${BOXES}_nd.onnx

    ################################################### Final Batch Nums
    python make_final_batch_nums_final_class_nums_final_box_nums.py

    ################################################### Boxes GatherND
    python make_box_gather_nd.py

    python -m tf2onnx.convert \
    --opset ${OPSET} \
    --tflite saved_model_postprocess/nms_box_gather_nd.tflite \
    --output 14_nms_box_gather_nd.onnx

    sor4onnx \
    --input_onnx_file_path 14_nms_box_gather_nd.onnx \
    --old_new ":0" "" \
    --search_mode "suffix_match" \
    --output_onnx_file_path 14_nms_box_gather_nd.onnx

    sor4onnx \
    --input_onnx_file_path 14_nms_box_gather_nd.onnx \
    --old_new "serving_default_input_1" "gn_boxes" \
    --output_onnx_file_path 14_nms_box_gather_nd.onnx \
    --mode inputs

    sor4onnx \
    --input_onnx_file_path 14_nms_box_gather_nd.onnx \
    --old_new "serving_default_input_2" "gn_box_selected_indices" \
    --output_onnx_file_path 14_nms_box_gather_nd.onnx \
    --mode inputs

    sor4onnx \
    --input_onnx_file_path 14_nms_box_gather_nd.onnx \
    --old_new "PartitionedCall" "final_boxes" \
    --output_onnx_file_path 14_nms_box_gather_nd.onnx \
    --mode outputs

    python make_input_output_shape_update.py \
    --input_onnx_file_path 14_nms_box_gather_nd.onnx \
    --output_onnx_file_path 14_nms_box_gather_nd.onnx \
    --input_names gn_boxes \
    --input_names gn_box_selected_indices \
    --input_shapes ${BATCHES} ${BOXES} 4 \
    --input_shapes N 2 \
    --output_names final_boxes \
    --output_shapes N 4

    onnxsim 14_nms_box_gather_nd.onnx 14_nms_box_gather_nd.onnx
    onnxsim 14_nms_box_gather_nd.onnx 14_nms_box_gather_nd.onnx

    ################################################### nms_yolox_xxx_nd + nms_final_batch_nums_final_class_nums_final_box_nums
    snc4onnx \
    --input_onnx_file_paths 09_nms_yolox_${BOXES}_nd.onnx 13_nms_final_batch_nums_final_class_nums_final_box_nums.onnx \
    --srcop_destop workaround_mul_out bc_input \
    --op_prefixes_after_merging main01 sub01 \
    --output_onnx_file_path 15_nms_yolox_${BOXES}_split.onnx

    ################################################### nms_yolox_${BOXES}_split + nms_box_gather_nd
    snc4onnx \
    --input_onnx_file_paths 15_nms_yolox_${BOXES}_split.onnx 14_nms_box_gather_nd.onnx \
    --srcop_destop x1y1x2y2 gn_boxes final_box_nums gn_box_selected_indices \
    --output_onnx_file_path 16_nms_yolox_${BOXES}_merged.onnx

    onnxsim 16_nms_yolox_${BOXES}_merged.onnx 16_nms_yolox_${BOXES}_merged.onnx
    onnxsim 16_nms_yolox_${BOXES}_merged.onnx 16_nms_yolox_${BOXES}_merged.onnx

    ################################################### nms output merge
    python make_nms_outputs_merge.py

    onnxsim 17_nms_batchno_classid_x1y1x2y2_cat.onnx 17_nms_batchno_classid_x1y1x2y2_cat.onnx

    ################################################### merge
    snc4onnx \
    --input_onnx_file_paths 16_nms_yolox_${BOXES}_merged.onnx 17_nms_batchno_classid_x1y1x2y2_cat.onnx \
    --srcop_destop final_batch_nums cat_batch final_class_nums cat_classid final_scores cat_score final_boxes cat_x1y1x2y2 \
    --output_onnx_file_path 18_nms_yolox_${BOXES}.onnx

    onnxsim 18_nms_yolox_${BOXES}.onnx 18_nms_yolox_${BOXES}.onnx

    ################################################### yolox + Post-Process
    snc4onnx \
    --input_onnx_file_paths ${MODEL_NAME}_${SUFFIX}${H}x${W}.onnx 18_nms_yolox_${BOXES}.onnx \
    --srcop_destop output predictions \
    --output_onnx_file_path ${MODEL_NAME}_post_${SUFFIX}${H}x${W}.onnx
    onnxsim ${MODEL_NAME}_post_${SUFFIX}${H}x${W}.onnx ${MODEL_NAME}_post_${SUFFIX}${H}x${W}.onnx
    onnxsim ${MODEL_NAME}_post_${SUFFIX}${H}x${W}.onnx ${MODEL_NAME}_post_${SUFFIX}${H}x${W}.onnx

    ################################################### cleaning
    rm 0*_*.onnx
    rm 1*_*.onnx
done
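
For anyone following the script above, the two GatherND stages can be pictured with a small pure-Python sketch. The shapes mirror the ones passed to make_input_output_shape_update.py (scores as [BATCHES, CLASSES, BOXES] indexed by [N, 3], boxes as [BATCHES, BOXES, 4] indexed by [N, 2]). The function names, the toy data, and the assumption that the index columns are ordered (batch, class, box) and (batch, box) are illustrative only and are not taken from the repository.

```python
def gather_scores(scores, selected_indices):
    """scores: nested list [batches][classes][boxes].
    selected_indices: rows of (batch, class, box), i.e. shape [N, 3].
    Returns a nested list of shape [N, 1], like final_scores."""
    return [[scores[b][c][i]] for b, c, i in selected_indices]


def gather_boxes(boxes, box_selected_indices):
    """boxes: nested list [batches][boxes][4].
    box_selected_indices: rows of (batch, box), i.e. shape [N, 2].
    Returns a nested list of shape [N, 4], like final_boxes."""
    return [list(boxes[b][i]) for b, i in box_selected_indices]


# Toy data (assumption): 2 batches, 2 classes, 3 boxes.
scores = [
    [[0.1, 0.9, 0.3], [0.2, 0.4, 0.8]],
    [[0.7, 0.5, 0.6], [0.05, 0.95, 0.15]],
]
boxes = [
    [[0, 0, 10, 10], [5, 5, 20, 20], [1, 2, 3, 4]],
    [[2, 2, 8, 8], [4, 4, 9, 9], [6, 6, 12, 12]],
]

# Rows that an NMS stage might have selected: (batch, class, box).
selected_indices = [(0, 0, 1), (0, 1, 2), (1, 1, 1)]

final_scores = gather_scores(scores, selected_indices)
# Drop the class column to get (batch, box) pairs for the box lookup.
final_boxes = gather_boxes(boxes, [(b, i) for b, _, i in selected_indices])

print(final_scores)
print(final_boxes)
```

Both results have N rows, one per selected detection, which is why the merged output ends up flat rather than grouped per image.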

PINTO0309 commented 3 weeks ago

Fix: https://github.com/PINTO0309/PINTO_model_zoo/pull/414

18_nms_yolox_6300.onnx.zip

sit4onnx -if 18_nms_yolox_6300.onnx -oep cpu

INFO: file: 18_nms_yolox_6300.onnx
INFO: providers: ['CPUExecutionProvider']
INFO: input_name.1: predictions shape: [30, 6300, 17] dtype: float32
INFO: test_loop_count: 10
INFO: total elapsed time:  420.7580089569092 ms
INFO: avg elapsed time per pred:  42.07580089569092 ms
INFO: output_name.1: batchno_classid_score_x1y1x2y2 shape: [7200, 7] dtype: float32
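
A rough check on where the 7200 rows may come from, assuming the ONNX NonMaxSuppression upper bound of batches x classes x max_output_boxes_per_class applies here. The batch size of 30 comes from the log above and the 12 classes from the model name. The per-class cap is not stated in this thread, so the value derived below is only what the reported numbers would imply if NMS was saturated during the benchmark run.

```python
def max_nms_rows(batches, classes, max_boxes_per_class):
    """Upper bound on rows NMS can select: one cap per (batch, class) pair."""
    return batches * classes * max_boxes_per_class


batches = 30          # from the sit4onnx input shape [30, 6300, 17]
classes = 12          # from the model name (WholeBody12)
reported_rows = 7200  # from the sit4onnx output shape [7200, 7]

# Implied per-class cap (an inference, not a value quoted from the repository).
implied_cap = reported_rows // (batches * classes)
print(implied_cap)
print(max_nms_rows(batches, classes, implied_cap) == reported_rows)
```

The row count scales linearly with the batch size, so a batch of 30 has up to 30 times as many output rows to process as a single image.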

ozayr commented 3 weeks ago

I am getting this error when loading the generated model:

InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from yolox_s_wholebody12_0190_post_30x3x480x640.onnx failed:Protobuf parsing failed.

PINTO0309 commented 3 weeks ago

Your model body simply hasn't been converted to a batch size of 30.

onnxsim yolox_s_wholebody12_Nx3xHxW.onnx yolox_s_wholebody12_30x3x480x640.onnx \
--overwrite-input-shape "input:30,3,480,640"
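
If the same fix is needed for several batch sizes, the command above can be assembled programmatically. This is a minimal sketch: the helper name is hypothetical, and the input tensor name "input" and the 3-channel layout are taken from the command in this comment rather than verified against every model in the zoo.

```python
import shlex


def onnxsim_fix_batch_cmd(src, dst, batch, height, width,
                          input_name="input", channels=3):
    """Build the onnxsim argv that pins the model input to a static shape.

    Hypothetical helper: it only assembles the arguments and does not run onnxsim.
    """
    shape = f"{input_name}:{batch},{channels},{height},{width}"
    return ["onnxsim", src, dst, "--overwrite-input-shape", shape]


cmd = onnxsim_fix_batch_cmd(
    "yolox_s_wholebody12_Nx3xHxW.onnx",
    "yolox_s_wholebody12_30x3x480x640.onnx",
    batch=30, height=480, width=640,
)

# Print a copy-pasteable command line; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Keeping the arguments as a list avoids shell quoting problems if a file name ever contains spaces.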

ozayr commented 3 weeks ago

Correct, it works. I appreciate it very much.

The output comes out as Nx7.

Should it not be 30xNx7, i.e. a set of detections for each image?

Any idea why a batch runs significantly slower than a single image? I had assumed it would be much faster.

PINTO0309 commented 3 weeks ago

Should it not be 30xNx7, i.e. a set of detections for each image?

No. The results for all batches are included in a single output; the first column is the batch number.

output: batchno_classid_score_x1y1x2y2 float32[N,7]
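
To get per-image detections back from the flat [N,7] output, the rows can be grouped on the first column. This is a minimal sketch assuming the column order implied by the output name (batchno, classid, score, x1, y1, x2, y2). The function name and the sample rows are made up for illustration.

```python
def split_by_image(rows, batch_size):
    """Group flat [N, 7] detection rows into one list per image.

    Each row is (batchno, classid, score, x1, y1, x2, y2). The batch number
    is dropped from the grouped rows because the list position encodes it.
    Works on nested lists and on a NumPy array iterated row by row.
    """
    per_image = [[] for _ in range(batch_size)]
    for row in rows:
        batch_no = int(row[0])
        per_image[batch_no].append(list(row[1:]))
    return per_image


# Made-up rows for a batch of 3 images; image 1 has no detections.
rows = [
    [0.0, 3.0, 0.91, 10.0, 20.0, 110.0, 220.0],
    [2.0, 0.0, 0.85, 5.0, 5.0, 50.0, 60.0],
    [0.0, 7.0, 0.40, 30.0, 40.0, 90.0, 95.0],
]

per_image = split_by_image(rows, batch_size=3)
print([len(dets) for dets in per_image])
```

A flat output handles images with different numbers of detections without padding, which a fixed 30xNx7 tensor could not do.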

Any idea why a batch runs significantly slower than a single image?

Seriously, read the README. If you don't like slow processing speed, use EfficientNMS-TRT. To begin with, there are too many boxes for output targets.

https://github.com/PINTO0309/PINTO_model_zoo/blob/main/449_YOLOX-WholeBody12/README.md#3-test