Closed · ozayr closed this issue 3 weeks ago
Fix: https://github.com/PINTO0309/PINTO_model_zoo/pull/414
sit4onnx -if 18_nms_yolox_6300.onnx -oep cpu
INFO: file: 18_nms_yolox_6300.onnx
INFO: providers: ['CPUExecutionProvider']
INFO: input_name.1: predictions shape: [30, 6300, 17] dtype: float32
INFO: test_loop_count: 10
INFO: total elapsed time: 420.7580089569092 ms
INFO: avg elapsed time per pred: 42.07580089569092 ms
INFO: output_name.1: batchno_classid_score_x1y1x2y2 shape: [7200, 7] dtype: float32
I'm getting:
InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from yolox_s_wholebody12_0190_post_30x3x480x640.onnx failed:Protobuf parsing failed.
Your model body just hasn't been converted to 30 batches.
onnxsim yolox_s_wholebody12_Nx3xHxW.onnx yolox_s_wholebody12_30x3x480x640.onnx \
--overwrite-input-shape "input:30,3,480,640"
Correct, it works. Much appreciated.
The output comes out as Nx7.
Should it not be 30xNx7, i.e. a set of detections relating to each image?
Any idea why the batch runs significantly slower than running a single image? I just assumed it would be much faster.
> Should it not be 30xNx7, i.e. a set of detections relating to each image?
No. All batch processing results are included.
output: batchno_classid_score_x1y1x2y2 float32[N,7]
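Since the model flattens all batch results into one [N,7] tensor, the per-image detections can be recovered by grouping on the first column (batchno). A minimal sketch in plain Python; the detection values below are made up purely for illustration:

```python
# Group a flattened [N, 7] detection tensor back into per-image lists.
# Each row is [batchno, classid, score, x1, y1, x2, y2].
def split_by_batch(detections, batch_size):
    per_image = [[] for _ in range(batch_size)]
    for row in detections:
        batchno = int(row[0])
        per_image[batchno].append(row)
    return per_image

# Dummy output rows for a batch of 3 images (values are illustrative only).
dets = [
    [0, 0, 0.91, 10, 10, 50, 80],
    [0, 7, 0.33, 12, 14, 30, 40],
    [2, 0, 0.88, 100, 60, 180, 220],
]
grouped = split_by_batch(dets, batch_size=3)
print([len(g) for g in grouped])  # -> [2, 0, 1]
```

Images with no detections simply contribute no rows, which is why a fixed 30xNx7 shape is not used.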
> Any idea why the batch runs significantly slower than running a single image?
Seriously, read the README. If you don't like the slow processing speed, use EfficientNMS-TRT. To begin with, there are too many boxes among the output targets.
https://github.com/PINTO0309/PINTO_model_zoo/blob/main/449_YOLOX-WholeBody12/README.md#3-test
Post-Process
Because I add my own post-processing, which can be inferred by TensorRT, CUDA, and CPU, to the end of the model, the benchmarked inference speed is the end-to-end processing speed including all pre-processing and post-processing. EfficientNMS in TensorRT is very slow and should be offloaded to the CPU.
| param | value | note |
|---|---|---|
| max_output_boxes_per_class | 20 | Maximum number of outputs per class of one type. 20 indicates that the maximum number of people detected is 20, the maximum number of heads detected is 20, and the maximum number of hands detected is 20. The larger the number, the more people can be detected, but the inference speed slows down slightly due to the larger overhead of NMS processing by the CPU. In addition, as the number of elements in the final output tensor increases, the amount of information transferred between hardware increases, resulting in higher transfer costs on the hardware circuit. Therefore, it is desirable to set this value to the minimum necessary. |
| iou_threshold | 0.40 | A value indicating the percentage of occlusion allowed between multiple bounding boxes of the same class. With 0.40, a box is excluded from the detection results if, for example, two bounding boxes overlap in more than 41% of their area. The larger the value, the more occlusion is tolerated, but over-detection may increase. |
| score_threshold | 0.25 | Bounding box confidence threshold. Specify in the range of 0.00 to 1.00. The larger the value, the stricter the filtering and the lower the NMS processing load, but in exchange, everything except bounding boxes with high confidence values is excluded from detection. This parameter has a very large percentage impact on NMS overhead. |
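How the three parameters interact can be sketched with a minimal single-class NMS in plain Python. This is a didactic sketch, not the model's actual NMS implementation, and all box/score values are illustrative:

```python
# Minimal single-class NMS sketch. Boxes are [x1, y1, x2, y2].

def iou(a, b):
    # Intersection-over-union of two axis-aligned boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, score_threshold=0.25, iou_threshold=0.40,
        max_output_boxes_per_class=20):
    # score_threshold: dropping low-confidence boxes first shrinks the
    # candidate set, which is why it dominates NMS overhead.
    cand = [(s, b) for s, b in zip(scores, boxes) if s >= score_threshold]
    cand.sort(key=lambda sb: sb[0], reverse=True)
    kept = []
    for s, b in cand:
        if len(kept) >= max_output_boxes_per_class:
            break  # caps the output tensor size (and transfer cost)
        # iou_threshold: discard boxes overlapping a kept box too much.
        if all(iou(b, kb) <= iou_threshold for _, kb in kept):
            kept.append((s, b))
    return kept

boxes = [[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.1]
# Box 2 is suppressed by IoU (0.81 > 0.40), box 3 by score (0.1 < 0.25).
print(len(nms(boxes, scores)))  # -> 1
```

With a batch of 30 images, this O(kept x candidates) loop runs over 30 images' worth of candidate boxes on the CPU, which is consistent with the batch run being slower than a single image.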
Issue Type
Support
OS
Mac OS
OS architecture
aarch64
Programming Language
Other
Framework
ONNX
Model name and Weights/Checkpoints URL
https://github.com/PINTO0309/PINTO_model_zoo/tree/main/449_YOLOX-WholeBody12
Description
Hi, thank you for your work.
When using the post-process gen tools, how do I increase the batch size parameter to create a model for batch processing?
I see the model is generated and the log shows the shapes correctly, but the output shape is not changed;
when running the model and checking the input shapes it still shows [1,3,480,640] and not [30,3,480,640].
Relevant Log Output
URL or source code for simple inference testing code
The script, which I changed as bash was complaining.