triton-inference-server / onnxruntime_backend

The Triton backend for the ONNX Runtime.
BSD 3-Clause "New" or "Revised" License

Error while Loading YOLOv8 Model with EfficientNMS_TRT Plugin in Triton #210

Open whitewalker11 opened 1 year ago

whitewalker11 commented 1 year ago

Issue Description:

I am encountering an error while trying to load a YOLOv8 model with the EfficientNMS_TRT plugin in Triton. The specific error message is:


UNAVAILABLE: Internal: onnx runtime error 1: Load model from /models/yolov8_onnx/1/model.onnx failed: Fatal error: TRT:EfficientNMS_TRT(-1) is not a registered function/op

Steps to Reproduce:

Export the YOLOv8 model with the EfficientNMS_TRT plugin.
Attempt to load the exported model into Triton.

Expected Behavior:

The YOLOv8 model with the EfficientNMS_TRT plugin should load into Triton without errors.

Actual Behavior:

The aforementioned error occurs when loading the model into Triton.

Additional Information:

The YOLOv8 model was exported with the EfficientNMS_TRT plugin.
The error appears to be caused by the EfficientNMS_TRT op not being registered with ONNX Runtime.
Triton version: 23.05-py3-sdk

YOLOv8 Model Export Code:

    import torch
    from ultralytics import YOLO  # assumes the ultralytics package is installed

    input_shape = [1, 3, 640, 640]
    device = 'cpu'
    weights = 'path_to_yolov8_weights.pt'
    topk = 100

    YOLOv8 = YOLO(weights)
    model = YOLOv8.model.fuse().eval()

    # `optim` is a helper from the export script (not shown here) that
    # patches modules for export
    for m in model.modules():
        optim(m)
        m.to(device)

    model.to(device)
    fake_input = torch.randn(input_shape).to(device)

    # warm-up forward pass before export
    model(fake_input)

    save_path = weights.replace('.pt', '.onnx')

    # torch.onnx.export saves to disk and returns None, so its result
    # is not captured
    torch.onnx.export(
        model,
        fake_input,
        save_path,
        input_names=['images'],
        output_names=['num_dets', 'bboxes', 'scores', 'labels'])

    print(f'ONNX export success, saved as {save_path}')
Triton Model Configuration (config.pbtxt):

    platform: "onnxruntime_onnx"
    max_batch_size: 0
    input [
      {
        name: "images"
        data_type: TYPE_FP32
        dims: [ 1, 3, 640, 640 ]
      }
    ]
    output [
      {
        name: "output0"
        data_type: TYPE_FP32
        dims: [ -1, -1, -1 ]
      }
    ]
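Note that the export code names four outputs (`num_dets`, `bboxes`, `scores`, `labels`) while the config declares a single `output0`; once the op-registration error is resolved, the config will likely need to list all four. A sketch, with dtypes and dims assumed from the usual EfficientNMS_TRT output shapes and the script's topk = 100:

    output [
      {
        name: "num_dets"
        data_type: TYPE_INT32
        dims: [ 1, 1 ]
      },
      {
        name: "bboxes"
        data_type: TYPE_FP32
        dims: [ 1, 100, 4 ]
      },
      {
        name: "scores"
        data_type: TYPE_FP32
        dims: [ 1, 100 ]
      },
      {
        name: "labels"
        data_type: TYPE_INT32
        dims: [ 1, 100 ]
      }
    ]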

Possible Solutions Attempted:

Verified that the EfficientNMS_TRT plugin is correctly included during model export.
Checked for any compatibility issues between the TRITON version and the ONNX Runtime version.

Request for Assistance:

I'm seeking guidance on how to properly load a YOLOv8 model with the EfficientNMS_TRT plugin in Triton. Any insights, suggestions, or steps to resolve this issue would be greatly appreciated. Thank you!

kthui commented 1 year ago

@tanmayv25 @oandreeva-nv Do you have some insights into loading a YOLOv8 model with the EfficientNMS_TRT plugin in Triton?

tanmayv25 commented 1 year ago

@whitewalker11 Did you try registering the custom op library as described here? https://github.com/triton-inference-server/server/blob/main/docs/user_guide/custom_operations.md#onnx
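For reference, that doc has the ONNX Runtime backend load a shared library that registers the custom op, via a `model_operations` block in `config.pbtxt`. A sketch (the library path here is a placeholder, not a real file):

    model_operations
    {
      op_library_filename: "/path/to/libcustom_op_library.so"
    }

Note that EfficientNMS_TRT is a TensorRT plugin rather than a standalone ONNX Runtime custom op, so it may additionally require the TensorRT execution provider to be enabled for the model.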