roboflow / inference

A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
https://inference.roboflow.com

Same structure for API and library results #60

Closed nengelmann closed 10 months ago

nengelmann commented 1 year ago

Search before asking

Description

The API (HTTP) inference and the library (pip install) inference return results with different structures. It would be neat if both returned results in the structure of the API inference. That way the two approaches would be interchangeable and, more importantly, the library inference results could be used with supervision just like the API inference results.

Use case

Let's take the quickstart example as the use case. If you run the examples (ready-to-run scripts further down below), the following results are returned.

The API inference result:

{
   "time":0.2851218069990864,
   "image":{
      "width":398,
      "height":224
   },
   "predictions":[
      {
         "x":5.5,
         "y":152.0,
         "width":11.0,
         "height":30.0,
         "confidence":0.9074968099594116,
         "class":"player",
         "class_id":1
      },
      {
         "x":145.0,
         "y":96.0,
         "width":14.0,
         "height":24.0,
         "confidence":0.8891444206237793,
         "class":"player",
         "class_id":1
      },
      ...
   ]
}

The library inference result:

[[
[0.0, 137.0, 11.0, 167.0, 0.9075440764427185, 0.8860160708427429, 1.0], 
[138.0, 84.0, 152.0, 108.0, 0.8891727924346924, 0.9700851440429688, 1.0], 
[17.0, 77.0, 33.0, 102.0, 0.8874990940093994, 0.9652542471885681, 1.0], 
[305.0, 162.0, 321.0, 194.0, 0.8796935081481934, 0.9643577337265015, 1.0], 
...
]]

The API result can easily be processed with supervision, while the library result needs different logic. For bounding boxes that logic is fairly simple, but for polygons (instance segmentation) it is not necessarily. It would also be a lot more convenient to have the library results integrated with supervision, as the API results are.

Please let me know if the library results can be converted into a supervision-compatible format; I just couldn't find anything. From the documentation, it was not clear to me how to reformat the instance segmentation results for use with supervision.
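
For the detection case, a manual conversion like the sketch below seems feasible; it is the segmentation case I couldn't work out. Note that the column layout [x_min, y_min, x_max, y_max, confidence, class_confidence, class_id] is my assumption, inferred from comparing the example values above with the API confidences:

import numpy as np
import supervision as sv

# Assumed column layout per row (inferred from the example output above):
# [x_min, y_min, x_max, y_max, confidence, class_confidence, class_id]
raw = np.array([
    [0.0, 137.0, 11.0, 167.0, 0.9075440764427185, 0.8860160708427429, 1.0],
    [138.0, 84.0, 152.0, 108.0, 0.8891727924346924, 0.9700851440429688, 1.0],
])

detections = sv.Detections(
    xyxy=raw[:, 0:4],                # corner-format bounding boxes
    confidence=raw[:, 4],            # lines up with the API confidences
    class_id=raw[:, 6].astype(int),
)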

Thanks for open-sourcing 'inference'! If I can help with the implementation, I'm happy to, but I might need some guidance/feedback.

Additional

Examples ready to run

API Example

import requests
import cv2
import supervision as sv
import urllib.request
import numpy as np

ROBOFLOW_API_KEY = "YOUR_API_KEY"

# Roboflow quickstart example

dataset_id = "soccer-players-5fuqs"
version_id = "1"
image_url = (
    "https://source.roboflow.com/pwYAXv9BTpqLyFfgQoPZ/u48G0UpWfk8giSw7wrU8/original.jpg"
)

api_key = ROBOFLOW_API_KEY
confidence = 0.5

url = f"http://localhost:9001/{dataset_id}/{version_id}"

params = {
    "api_key": api_key,
    "confidence": confidence,
    "image": image_url,
}

res = requests.post(url, params=params)
print(res.json())

# Roboflow visualization with supervision

req = urllib.request.urlopen(image_url)
arr = np.asarray(bytearray(req.read()), dtype=np.uint8)
img = cv2.imdecode(arr, -1)

annotator = sv.BoxAnnotator()

class_list = ["player", "referee", "football"]
detections = sv.Detections.from_roboflow(res.json(), class_list)
labels = [
    f"{class_list[class_id]} {confidence_value:0.2f}"
    for _, _, confidence_value, class_id, _ in detections
]
annotated_frame = annotator.annotate(
    scene=img.copy(), detections=detections, labels=labels
)
cv2.imwrite("frame.jpg", img)
cv2.imwrite("annotated_frame.jpg", annotated_frame)

Library Example

from inference.models.utils import get_roboflow_model

ROBOFLOW_API_KEY = "YOUR_API_KEY"

# Roboflow quickstart example

model = get_roboflow_model(
    model_id="soccer-players-5fuqs/1",
    # Replace ROBOFLOW_API_KEY with your Roboflow API Key
    api_key=ROBOFLOW_API_KEY,
)

results = model.infer(
    image="https://source.roboflow.com/pwYAXv9BTpqLyFfgQoPZ/u48G0UpWfk8giSw7wrU8/original.jpg",
    confidence=0.5,
    iou_threshold=0.5,
)

print(results)

# Roboflow visualization with supervision

# Custom logic is needed here to visualize the results.
# Bounding boxes are fairly easy and their format can be inferred from the
# result itself, but that does not necessarily hold for polygons / instance
# segmentation. A sketch of such logic follows below.
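
A minimal sketch of that custom bounding-box logic, under the same assumed column layout [x_min, y_min, x_max, y_max, confidence, class_confidence, class_id]:

import cv2
import numpy as np
import urllib.request

req = urllib.request.urlopen(
    "https://source.roboflow.com/pwYAXv9BTpqLyFfgQoPZ/u48G0UpWfk8giSw7wrU8/original.jpg"
)
img = cv2.imdecode(np.asarray(bytearray(req.read()), dtype=np.uint8), -1)

# Draw each detection of the first (and only) image in the batch
for x_min, y_min, x_max, y_max, conf, _, class_id in results[0]:
    cv2.rectangle(
        img, (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 255, 0), 1
    )
    cv2.putText(
        img,
        f"{int(class_id)} {conf:0.2f}",
        (int(x_min), max(int(y_min) - 4, 0)),
        cv2.FONT_HERSHEY_SIMPLEX,
        0.4,
        (0, 255, 0),
        1,
    )

cv2.imwrite("annotated_frame_library.jpg", img)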

Are you willing to submit a PR?

yeldarby commented 1 year ago

Definitely agree with this, @nengelmann! I'm taking the first step towards it with the new streaming interface, which will return detections consumable by supervision (hopefully getting released today), and I will follow up shortly by updating the detections that come back from our other model APIs.

paulguerrie commented 10 months ago

Resolved by https://github.com/roboflow/inference/pull/147. Model infer() methods now return InferenceModelResponse objects, similar in structure to the JSON returned by the Roboflow inference API.
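
If so, downstream usage could look like the following sketch, continuing the library example above (assuming the response object is a pydantic-style model exposing .dict(); that detail is not confirmed in this thread):

import supervision as sv

# Hypothetical follow-up to the library example above; the .dict() call and
# its keyword arguments assume a pydantic-style response model.
response = model.infer(
    image="https://source.roboflow.com/pwYAXv9BTpqLyFfgQoPZ/u48G0UpWfk8giSw7wrU8/original.jpg",
    confidence=0.5,
    iou_threshold=0.5,
)
detections = sv.Detections.from_roboflow(
    response.dict(by_alias=True, exclude_none=True),
    ["player", "referee", "football"],
)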