microsoft / onnxruntime-extensions

onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime
MIT License

Error when using ppp code on yolov8n model #626

Closed: arseniymerkulov closed this issue 8 months ago

arseniymerkulov commented 9 months ago

I have an issue similar to this one. I am trying to add pre/post-processing steps to a finetuned yolov8n; the ONNX model is attached in the archive. I am using the code from here. It works with the default pretrained yolov8n, but fails on the inference test with the finetuned yolo with this error:

onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 
6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Split node. 
Name:'post_process_14' Status Message: D:\a\_work\1\s\onnxruntime\core\framework\op_kernel.cc:83 onnxruntime::OpKernelContext::OutputMLValue status.IsOK() was false. 
Tensor shape cannot contain any negative value

What can I do to resolve this? yolov8n_sku110k_10_epoch.zip

wenbingl commented 9 months ago

@skottmckay, can you help with that?

skottmckay commented 9 months ago

The code was tested with the default yolo v8 and expects there to be 80 classes. Does the 'finetuned' version only emit 1 class, resulting in the output being {1, 5, 8400} instead of the expected {1, 84, 8400}?

If so, you could try adding num_classes=1 to the call to yolo_detection here.
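
For reference, a minimal sketch of that change, assuming the yolo_detection helper used by the example (the module path, argument names, and argument order are assumptions and may differ in the copy of the code you are running):

```python
from pathlib import Path

# Assumed import path for the example's helper; adjust to wherever the
# yolo_detection function lives in the code you are using.
from onnxruntime_extensions.tools import add_pre_post_processing_to_model as add_ppp

add_ppp.yolo_detection(
    Path("yolov8n_sku110k_10_epoch.onnx"),           # finetuned model (your file)
    Path("yolov8n_sku110k_10_epoch.with_ppp.onnx"),  # output model with pre/post-processing added
    num_classes=1,  # finetuned model emits a single class -> output is {1, 5, 8400}
)
```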

arseniymerkulov commented 9 months ago

Thank you for your answer. I converted the model to ONNX and added pre/post-processing steps to it. In the attached images you can see the results of the original model (many bboxes) and the results of the model with the ppp steps (1 bbox). Are the boxes filtered inside the model somehow? If so, how can I manage the confidence threshold, etc.?

[Attached images: 2880799, 2880799_output]

arseniymerkulov commented 9 months ago

Another question: is there an API to add another output channel to a model, with standard bboxes, scores, and classes?

skottmckay commented 8 months ago

These are questions for the model author. I don't know what the 'finetuned' version is. It seems to only produce a single result, but how it determines that is internal to the model. i.e. the onnxruntime post-processing is not involved in selecting the single result.

Same with adding new outputs to the model.

Basically we're not changing the original model - we're adding pre-processing and post-processing to the original model as-is.

arseniymerkulov commented 8 months ago

As far as I understand, the results (the number of results) of the model before and after adding the ppp steps should not differ, correct? Provided that, in the first case, preprocessing was performed by the native ultralytics framework, and the preprocessing steps added to the model correspond to it.

In my example you can see that the model's results before and after adding the ppp steps are different, so it seems the added steps must be the source of the difference.

skottmckay commented 8 months ago

The SelectBestBoundingBoxes post-processing step has a score_threshold argument you can set for the minimum threshold. You could try setting that to a lower value than the (arbitrary) default of 0.67.

Set the value here by adding an argument to the call (e.g. SelectBestBoundingBoxesByNMS(score_threshold=0.1)). You can modify the Python file in the onnxruntime_extensions package directly.

There are other configurable values for the ONNX NonMaxSuppression as well. That is used to select the best bounding boxes from the results by removing overlapping ones or ones with low scores.
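
A minimal sketch of the adjusted step, assuming the post-processing pipeline from the yolo example; score_threshold and its 0.67 default are confirmed above, while the import path and any other NMS-related parameters are assumptions about this step:

```python
# Assumed import path; the step class is defined in
# onnxruntime_extensions/tools/pre_post_processing/steps/vision.py.
from onnxruntime_extensions.tools.pre_post_processing import SelectBestBoundingBoxesByNMS

# Keep lower-confidence boxes by lowering the minimum score from the
# default of 0.67. Other NMS-related values (e.g. the IoU threshold)
# can be tuned in the same call if your version exposes them.
nms_step = SelectBestBoundingBoxesByNMS(score_threshold=0.1)
```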

arseniymerkulov commented 8 months ago

Everything is resolved with the thresholds, thank you. Regarding the model outputs: I removed lines 274-280 from here to get the post-processed coordinates of the bboxes instead of a bitmap image. It seems the coords don't quite match the original image:

[Attached image: 2903744_output]

Can you guide me towards what step is missing to match the coords with the original image?

skottmckay commented 8 months ago

Are you using the correct coordinate format to draw the bounding boxes? There are a few different ones, as mentioned here:

https://github.com/microsoft/onnxruntime-extensions/blob/4f6e12579cfd190e5810d4c8ba901ca5c767be7a/onnxruntime_extensions/tools/pre_post_processing/steps/vision.py#L629-L640

As you can see from the code you removed, the format is CENTER_XYWH for YOLO.
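
For anyone hitting the same mismatch, a small standalone helper (not part of the library) that converts center-based boxes to corner coordinates before drawing; it assumes each row of the box array is laid out as (center_x, center_y, width, height, ...):

```python
import numpy as np

def center_xywh_to_corners(boxes: np.ndarray) -> np.ndarray:
    """Convert rows of (cx, cy, w, h, ...) to (x1, y1, x2, y2, ...)."""
    out = boxes.copy()
    out[:, 0] = boxes[:, 0] - boxes[:, 2] / 2  # x1 = cx - w/2
    out[:, 1] = boxes[:, 1] - boxes[:, 3] / 2  # y1 = cy - h/2
    out[:, 2] = boxes[:, 0] + boxes[:, 2] / 2  # x2 = cx + w/2
    out[:, 3] = boxes[:, 1] + boxes[:, 3] / 2  # y2 = cy + h/2
    return out
```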

arseniymerkulov commented 8 months ago

I was drawing it from the corner; now the problem is resolved. Thank you.