ultralytics / ultralytics

Ultralytics YOLO11 πŸš€
https://docs.ultralytics.com
GNU Affero General Public License v3.0

OBB model on Triton server causes NMS error (probably not evaluated with task='obb' but task='detect') #11757

Closed wojciechpolchlopek closed 6 months ago

wojciechpolchlopek commented 6 months ago

### Search before asking

### YOLOv8 Component

Integrations

### Bug

The local evaluation of the TorchScript model is correct, but on Triton it seems to evaluate with task 'detect' instead of 'obb'. I trained the model with task='obb' and load it as:

```python
model = YOLOOBBWrapper('http://localhost:8000/yolo_obb_1', task='obb')
```

where:

```python
class YOLOOBBWrapper:
    def __init__(self, model_url):
        self.model = YOLO(model_url, task='obb')

    def predict(self, image_path):
        return self.model(image_path)
```
Raw evaluation with `triton_client.async_infer` and:

```python
ultralytics.utils.ops.non_max_suppression(
    data,
    conf_thres=0.5,
    iou_thres=0.4,
    nc=2,
    rotated=True,
    agnostic=True,
)
```

returns only one rotated box, with a low score.

### Environment

Ultralytics YOLOv8.2.1 πŸš€ Python-3.9.19 torch-2.3.0+cu121 CPU (Intel Core(TM) i7-10870H 2.20GHz)
Setup complete βœ… (16 CPUs, 31.2 GB RAM, 691.3/697.5 GB disk)

OS                  Linux-5.15.0-105-generic-x86_64-with-glibc2.31
Environment         Linux
Python              3.9.19
Install             git
RAM                 31.16 GB
CPU                 Intel Core(TM) i7-10870H 2.20GHz
CUDA                None

matplotlib          βœ… 3.8.4>=3.3.0
opencv-python       βœ… 4.7.0.72>=4.6.0
pillow              βœ… 9.2.0>=7.1.2
pyyaml              βœ… 6.0.1>=5.3.1
requests            βœ… 2.31.0>=2.23.0
scipy               βœ… 1.13.0>=1.4.1
torch               βœ… 2.3.0>=1.8.0
torchvision         βœ… 0.18.0>=0.9.0
tqdm                βœ… 4.66.2>=4.64.0
psutil              βœ… 5.9.8
py-cpuinfo          βœ… 9.0.0
thop                βœ… 0.1.1-2209072238>=0.1.1
pandas              βœ… 2.2.2>=1.1.4
seaborn             βœ… 0.13.2>=0.11.0

Triton server config:
```shell
docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v/home/wojpol/models:/models nvcr.io/nvidia/tritonserver:23.09-py3 tritonserver --model-repository=/models
```
config.pbtxt:
```
platform: "pytorch_libtorch"
max_batch_size: 10
input [
  {
    name: "inputs__0"
    data_type: TYPE_FP32
    dims: [ 3, 640, 640 ]
  }
]
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [-1]
  }
]
```
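The config above expects one FP32 tensor of shape (3, 640, 640) per image. As a rough illustration of getting an image into that layout (a minimal numpy sketch with a plain nearest-neighbour resize; the real Ultralytics pipeline letterboxes to preserve aspect ratio, so this is only an approximation):

```python
import numpy as np

def preprocess(img: np.ndarray, size: int = 640) -> np.ndarray:
    """HWC uint8 image -> (1, 3, size, size) float32 in [0, 1].

    Nearest-neighbour resize only; treat this as a sketch, not the
    library's letterboxing preprocessing.
    """
    h, w = img.shape[:2]
    ys = np.arange(size) * h // size   # row indices into the source
    xs = np.arange(size) * w // size   # column indices into the source
    resized = img[ys][:, xs]           # (size, size, 3)
    chw = resized.transpose(2, 0, 1).astype(np.float32) / 255.0
    return chw[None]                   # add the batch dimension

batch = preprocess(np.zeros((480, 704, 3), dtype=np.uint8))
print(batch.shape, batch.dtype)  # (1, 3, 640, 640) float32
```

The resulting array matches the `inputs__0` shape declared in config.pbtxt (max_batch_size handles the leading batch axis).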

### Minimal Reproducible Example

```shell
yolo export model=best-obb.pt format='torchscript' task='obb' imgsz=640
```

```python
model = YOLOOBBWrapper('http://localhost:8000/yolo_obb_1', task='obb')
```

where:

```python
class YOLOOBBWrapper:
    def __init__(self, model_url):
        self.model = YOLO(model_url, task='obb')

    def predict(self, image_path):
        return self.model(image_path)
```


### Additional

_No response_

### Are you willing to submit a PR?

- [ ] Yes I'd like to help by submitting a PR!

glenn-jocher commented 6 months ago

Hi! Thanks for reaching out with the details on your issue integrating the OBB model with the Triton server.

Based on your description, it sounds like there might be a mismatch in tensor dimensions or configuration between the TorchScript model and how Triton is set up to handle it, especially considering the negative dimension error.

Here's a slightly adjusted snippet for instantiating your model, with `task` passed through the wrapper:

```python
class YOLOOBBWrapper:
    def __init__(self, model_url, task='obb'):
        self.model = YOLO(model_url, task=task)

    def predict(self, image_path):
        return self.model(image_path)
```

We're happy to help further if this adjustment doesn't resolve the issue! 🌟

wojciechpolchlopek commented 6 months ago

Hi, I have found the cause in https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/yolo/obb/predict.py#L38: `nc=len(self.model.names)` is a large number, and correcting it to the real nc value (e.g. 2) leads to correct results. My question is: where should model.names be set so that it has the correct length?
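To illustrate the failure mode (a toy numpy sketch, not the library code): an OBB output has 4 box channels, nc class-score channels, and one trailing angle channel. If nc comes from an oversized model.names, the score slice overruns the real scores and treats the angle channel as a class score:

```python
import numpy as np

n_preds = 5
pred = np.zeros((7, n_preds), dtype=np.float32)  # 4 box + 2 scores + 1 angle
pred[4:6] = 0.9   # the two real class scores
pred[6] = 1.2     # angle in radians, not a confidence

nc_real, nc_inflated = 2, 80          # 80 = e.g. a stale COCO-sized names dict
scores_ok = pred[4:4 + nc_real]       # (2, n_preds): correct scores only
scores_bad = pred[4:4 + nc_inflated]  # numpy truncates the slice: (3, n_preds)

print(scores_bad.shape)   # the angle row is now treated as a "score"
print(scores_bad.max())   # > 1, a nonsense confidence value
```

With real tensors the mismatch also shifts where the angle is read from, which is consistent with the single low-score rotated box seen above.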

glenn-jocher commented 6 months ago

Hi! Great job diving into the code and identifying the workaround! 🌟

The model.names should reflect the class names found in your dataset. Typically, this is set when you load your model using the dataset's YAML file, where the number of classes (nc) and their respective names are specified.

If you are directly loading a model without associating it with a specific data YAML, you can manually adjust model.names after loading your model as follows:

```python
from ultralytics import YOLO

# Load your model
model = YOLO('your_model.pt')

# Set the correct class names
model.names = ['class1', 'class2']  # Update this list with your actual class names
```

Ensure the names are consistent with the nc value and your class labels. That should align everything correctly! Let us know if this helps or if you need any more details!

wojciechpolchlopek commented 6 months ago

Thanks for your help. As a workaround, I finally used `nc = detection.shape[1] - 5` for the NMS algorithm, and my custom Triton evaluation code with `triton_client.async_infer` works fine :) The issue can be closed, but please consider this a potential bug, because the exported model has the proper model.names in its inner config.txt, e.g. for nc=2:

```json
{"description": "Ultralytics YOLOv8m-obb model trained on config.yaml", "author": "Ultralytics", "date": "2024-05-10T07:38:22.845204", "version": "8.2.1", "license": "AGPL-3.0 License (https://ultralytics.com/license)", "docs": "https://docs.ultralytics.com", "stride": 32, "task": "obb", "batch": 1, "imgsz": [640, 640], "names": {"0": "var", "1": "dontcare"}}
```
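The workaround generalizes: for an OBB head the channel layout is 4 box coordinates + nc class scores + 1 angle, so the class count can be recovered from the raw output shape. A small helper sketch (`obb_nc_from_channels` is a hypothetical name, not an Ultralytics API):

```python
def obb_nc_from_channels(channels: int) -> int:
    """Class count for an OBB output with `channels` values per prediction.

    Layout assumption: cx, cy, w, h (4) + nc class scores + angle (1).
    """
    nc = channels - 5
    if nc < 1:
        raise ValueError(f"invalid OBB channel count: {channels}")
    return nc

# A 2-class OBB model exports a (batch, 7, anchors) tensor: 4 + 2 + 1 = 7.
print(obb_nc_from_channels(7))  # -> 2
```

This value can then be passed as `nc=` to `non_max_suppression(..., rotated=True)` instead of relying on `len(model.names)`.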

glenn-jocher commented 6 months ago

Hi there! πŸ‘‹ Great to hear you've found a workaround that suits your needs for now. We appreciate you sharing it!

Thanks also for pointing out this discrepancy in how model.names is handled for your use case. That's definitely something we'll investigate further to ensure consistency and correctness in exported model configurations. I'll pass your feedback to our team.

For now, don't hesitate to reach out if you encounter any other issues or have further suggestions. Thank you for contributing to the YOLOv8 community by sharing these insights! πŸš€