Open AlexPasqua opened 4 months ago
I'd like to submit a PR, so I would discuss here how to implement this feature.
From what I've seen, in YOLOv5, the output of model.predict(...) was an object of type Detections, while in YOLOv8 it's a simple PyTorch Tensor (torch.Tensor). Correct me if I'm wrong.
Since we cannot simply implement a new pandas() method in the torch.Tensor class, as we would do with a custom class like Detections, I'd propose two alternatives.
Alternative 1: Create a new method in YOLO(Model) that takes the model's output and returns what pandas().xyxy[0] would return.
model = YOLO('yolov8n.pt')
results = model.predict(image)
# extract the coordinates from the results as YOLOv5's results.pandas().xyxy[0]
coords = model.extract_coords(results)
Alternative 2: Make the model return a custom object as output instead of a torch.Tensor.
It would be similar to having a Detections class again, so we would have the freedom to implement our custom methods directly in the class representing the model's output.
@AlexPasqua thank you for your willingness to contribute and for outlining your proposed solutions! Your initiative is greatly appreciated by the YOLO community and the Ultralytics team. Let's delve into your suggestions:
Alternative 1: a new method in YOLO(Model)
Creating a new method in the YOLO class to extract coordinates from the model's output is a practical approach. This method could transform the torch.Tensor output into a more user-friendly format, similar to pandas().xyxy[0] in YOLOv5. Here's a concise example of how this could be implemented:
class YOLO:
    # Existing methods...
    def extract_coords(self, results):
        coords = []
        for box, score, cls in zip(results.boxes, results.scores, results.classes):
            coords.append({
                'x1': float(box[0]),
                'y1': float(box[1]),
                'x2': float(box[2]),
                'y2': float(box[3]),
                'confidence': float(score),
                'class': int(cls)
            })
        return coords
# Usage
model = YOLO('yolov8n.pt')
results = model.predict(image)
coords = model.extract_coords(results)
Returning a custom object instead of a torch.Tensor would indeed provide more flexibility. This approach aligns with the design of YOLOv5's Detections class, allowing for the implementation of custom methods directly within the output class. This could look something like:
class CustomDetections:
    def __init__(self, boxes, scores, classes):
        self.boxes = boxes
        self.scores = scores
        self.classes = classes

    def pandas(self):
        import pandas as pd
        data = {
            'x1': self.boxes[:, 0],
            'y1': self.boxes[:, 1],
            'x2': self.boxes[:, 2],
            'y2': self.boxes[:, 3],
            'confidence': self.scores,
            'class': self.classes
        }
        return pd.DataFrame(data)

class YOLO:
    # Existing methods...
    def predict(self, image):
        # Perform prediction...
        boxes, scores, classes = self.model(image)
        return CustomDetections(boxes, scores, classes)
# Usage
model = YOLO('yolov8n.pt')
results = model.predict(image)
coords_df = results.pandas()  # pandas() as defined above already returns the DataFrame
Both alternatives have their merits. The first is simpler and less intrusive, while the second offers greater flexibility and aligns more closely with the design philosophy of YOLOv5.
Feel free to choose the approach that best fits your vision and submit a PR. The community and the Ultralytics team will be happy to review and provide feedback. If you need any further assistance or have more questions, don't hesitate to ask!
Thank you again for your contribution! 🚀
For more detailed guidance on contributing, you can refer to our Contributing Guide.
That's already available.
result.boxes.xyxy
Hi @Y-T-G, the focus of this feature request is actually reducing the amount of manual code needed to return the detection results in a setting where the model is deployed somewhere and accessed through an API (e.g., using FastAPI). In this case, in fact, you still need to post-process what result[0].boxes.xyxy gives you.
What's currently needed:
import cv2
import numpy as np
from ultralytics import YOLO
from fastapi import FastAPI, File, UploadFile

app = FastAPI()
model = YOLO('yolov8n.pt')  # Load model at startup

@app.post("/detect")
async def detect(file: UploadFile):
    # Process the uploaded image for object detection
    image_bytes = await file.read()
    image = np.frombuffer(image_bytes, dtype=np.uint8)
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    # Perform object detection with YOLOv8
    results = model.predict(image)
    # Extract bounding box data
    boxes = results[0].boxes.xyxy.cpu().numpy()
    scores = results[0].boxes.conf.cpu().numpy()
    classes = results[0].boxes.cls.cpu().numpy()
    # Format the results as a list of dictionaries
    # (cast NumPy scalars to plain Python types so they can be serialized to JSON)
    json_output = []
    for box, score, cls in zip(boxes, scores, classes):
        json_output.append({
            'x1': float(box[0]),
            'y1': float(box[1]),
            'x2': float(box[2]),
            'y2': float(box[3]),
            'confidence': float(score),
            'class': int(cls)
        })
    return json_output
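The formatting step in the handler above can be factored into a small helper. The following is only a minimal sketch, not part of Ultralytics: `detections_to_records` is a hypothetical name, and it assumes the three inputs are equal-length sequences (NumPy arrays or plain lists). The explicit `float()`/`int()` casts are there because NumPy scalar types such as `numpy.float32` are not accepted by Python's standard `json` encoder.

```python
import json


def detections_to_records(boxes, scores, classes):
    """Build a JSON-safe list of dicts, one per detection (hypothetical helper)."""
    records = []
    for (x1, y1, x2, y2), score, cls in zip(boxes, scores, classes):
        records.append({
            "x1": float(x1),
            "y1": float(y1),
            "x2": float(x2),
            "y2": float(y2),
            "confidence": float(score),
            "class": int(cls),
        })
    return records


# Plain lists stand in for the arrays extracted from results[0].boxes
records = detections_to_records([[10, 20, 110, 220]], [0.9], [0.0])
payload = json.dumps(records)
```

In the endpoint, the loop would then shrink to a single call such as `detections_to_records(boxes, scores, classes)`.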
What this feature request aims for:
import cv2
import numpy as np
from ultralytics import YOLO
from fastapi import FastAPI, File, UploadFile

app = FastAPI()
model = YOLO('yolov8n.pt')  # Load model at startup

@app.post("/detect")
async def detect(file: UploadFile):
    # Process the uploaded image for object detection
    image_bytes = await file.read()
    image = np.frombuffer(image_bytes, dtype=np.uint8)
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    # Perform object detection with YOLOv8
    results = model.predict(image)
    # Extract bounding box data as a pandas Dataframe and use pandas' "to_json" function
    json_output = results.pandas('xyxy').to_json()
    return json_output
Maybe the title or description of the issue weren't too clear, but if you look at the discussion linked above (#8235 and this comment) you could get more context 😃
> From what I've seen, in YOLOv5, the output of model.predict(...) was an object of type Detections, while in YOLOv8 it's a simple PyTorch Tensor (torch.Tensor). Correct me if I'm wrong.
In the end I was wrong indeed. The output in YOLOv8 is actually a custom object (like YOLOv5's Detections) called Results, so I would opt for Alternative 2.
Since the output is actually a custom object (Results), I would add methods there.
* results is a list with 1 element;
* results[0] has an attribute boxes;
* boxes contains the attributes:
  * xyxy: to access the coordinates in the format [x1, y1, x2, y2];
  * xywh: to access the coordinates in the format [x, y, width, height];
  * xyxyn: to get normalized [x1, y1, x2, y2] boxes, relative to orig_shape;
  * xywhn: to get normalized [x, y, width, height] boxes, relative to orig_shape.
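To make the relationship between these formats concrete, here is a small sketch in plain Python. It is illustrative only, not the library's implementation, and it rests on two assumptions not stated above: that x, y in xywh denote the box center, and that orig_shape is given as (height, width). The function names are hypothetical.

```python
def xyxy_to_xywh(box):
    """Convert [x1, y1, x2, y2] to [x_center, y_center, width, height] (assumed convention)."""
    x1, y1, x2, y2 = box
    width, height = x2 - x1, y2 - y1
    return [x1 + width / 2, y1 + height / 2, width, height]


def normalize_xyxy(box, orig_shape):
    """Scale [x1, y1, x2, y2] into the 0..1 range; orig_shape is assumed to be (height, width)."""
    img_height, img_width = orig_shape
    x1, y1, x2, y2 = box
    return [x1 / img_width, y1 / img_height, x2 / img_width, y2 / img_height]


box = [100.0, 50.0, 300.0, 250.0]
xywh = xyxy_to_xywh(box)                   # [200.0, 150.0, 200.0, 200.0]
xyxyn = normalize_xyxy(box, (500, 1000))   # [0.1, 0.1, 0.3, 0.5]
```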
I would create a method Results.pandas, which processes the data in boxes and returns a pandas DataFrame.
Since boxes has various attributes to get the coordinates in different formats (xyxy, xywh, xyxyn, xywhn), I would let the pandas() method take an argument to decide which format to use, e.g., results.pandas('xyxy') or results.pandas('xywhn').
Once we have this dataframe, we could use pandas' to_json() method to get the results in json format, directly returnable by our API 😄
Check the code snippet earlier in this message (What this feature request aims for) for a contextualized example.
This way, we can have the output in a dataframe format, which might be useful for various use-cases, and if we need the output in json (e.g., in the super common use-case where the model is deployed and accessed through an API), we can go through the .pandas() method and then use pandas' to_json() method without the need to implement anything more.
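A rough sketch of the format-selection logic such a method might contain is shown below as a free function in plain Python. The name `boxes_to_columns`, the `ALLOWED_FORMATS` tuple, and the column names are all assumptions made for illustration, not the eventual API; the input dict maps each format name to its list of boxes and stands in for the attributes of boxes.

```python
ALLOWED_FORMATS = ("xyxy", "xywh", "xyxyn", "xywhn")


def boxes_to_columns(coords_by_format, conf, cls, fmt="xyxy"):
    """Return a dict of columns for the requested coordinate format (hypothetical helper)."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"fmt must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    # Corner formats use x1/y1/x2/y2 columns, the others use x/y/w/h
    names = ("x1", "y1", "x2", "y2") if fmt.startswith("xyxy") else ("x", "y", "w", "h")
    columns = {name: [] for name in names}
    for box in coords_by_format[fmt]:
        for name, value in zip(names, box):
            columns[name].append(float(value))
    columns["confidence"] = [float(c) for c in conf]
    columns["class"] = [int(c) for c in cls]
    return columns


coords = {"xyxy": [[10, 20, 110, 220]], "xywh": [[60, 120, 100, 200]]}
cols = boxes_to_columns(coords, [0.9], [0], fmt="xywh")
```

The returned dict of equal-length lists can be passed to `pandas.DataFrame(...)`, and the resulting frame's `to_json()` then gives the JSON string.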
Or something like that... @glenn-jocher let me know what you think 😄
Hi @AlexPasqua,
Thank you for your input! The feature request aims to streamline the process of converting detection results into a JSON format, which is particularly useful when deploying models via APIs, such as with FastAPI.
Currently, extracting and formatting the detection results involves several manual steps, as shown in the provided example. This process can be cumbersome, especially when frequently deploying models in production environments.
The goal is to simplify this workflow by introducing a method that directly converts the detection results into a pandas DataFrame, which can then be easily converted to JSON. This would reduce the amount of boilerplate code and make the deployment process more efficient.
Given that the output in YOLOv8 is a custom Results object, we can add a method to this class to facilitate the conversion. Here's a concise plan:
Add a pandas method to the Results class that accepts the coordinate format as a parameter (xyxy, xywh, etc.).
Usage example:
import cv2
import numpy as np
from ultralytics import YOLO
from fastapi import FastAPI, File, UploadFile

app = FastAPI()
model = YOLO('yolov8n.pt')  # Load model at startup

@app.post("/detect")
async def detect(file: UploadFile):
    # Process the uploaded image for object detection
    image_bytes = await file.read()
    image = np.frombuffer(image_bytes, dtype=np.uint8)
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    # Perform object detection with YOLOv8
    results = model.predict(image)
    # Extract bounding box data as a pandas DataFrame and convert to JSON
    json_output = results.pandas('xyxy').to_json()
    return json_output
This enhancement will make it more convenient for users to deploy YOLOv8 models in real-world applications, particularly those involving APIs. If you have any further suggestions or feedback, please let us know! 😊
Alright @glenn-jocher, then I'll proceed to open a PR about it 🚀
Hi @AlexPasqua,
That sounds fantastic! 🚀 We're excited to see your contribution. When you're ready, please go ahead and submit the PR. If you need any assistance or have further questions during the process, feel free to reach out here. Your efforts to enhance the usability of YOLOv8 are greatly appreciated by the community and the Ultralytics team. Thank you! 😊
Best of luck with the PR, and we're looking forward to reviewing it!
@glenn-jocher
Actually, the Results.summary() method does something similar: it returns a list of dictionaries, one per detected object, e.g., [{'name': 'person', 'class': 0, 'confidence': 0.92, 'box': {'x1': 207.03125, 'y1': 55.68618, 'x2': 243.46902, 'y2': 153.60498}}].
This makes it possible to do something like:
import cv2
import numpy as np
from ultralytics import YOLO
from fastapi import FastAPI, File, UploadFile

app = FastAPI()
model = YOLO('yolov8n.pt')  # Load model at startup

@app.post("/detect")
async def detect(file: UploadFile):
    # Process the uploaded image for object detection
    image_bytes = await file.read()
    image = np.frombuffer(image_bytes, dtype=np.uint8)
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    # Perform object detection with YOLOv8
    results = model.predict(image)
    # Extract bounding box data (results is a list, so take the first element)
    summary_output = results[0].summary()
Then, if you want a json output, you can still do return pd.DataFrame(summary_output).to_json().
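One caveat when building a DataFrame directly from this output: because each box is a nested dict, it would end up as a single column of dicts rather than as separate coordinate columns. Below is a minimal sketch of flattening the records first, assuming the dict layout shown in the example above; `flatten_summary` is a hypothetical helper, not part of Ultralytics.

```python
import json


def flatten_summary(summary):
    """Lift the nested 'box' dict of each detection into top-level keys (hypothetical helper)."""
    rows = []
    for det in summary:
        row = {key: value for key, value in det.items() if key != "box"}
        row.update(det.get("box", {}))
        rows.append(row)
    return rows


# Sample record copied from the example output above
summary = [{
    "name": "person",
    "class": 0,
    "confidence": 0.92,
    "box": {"x1": 207.03125, "y1": 55.68618, "x2": 243.46902, "y2": 153.60498},
}]
rows = flatten_summary(summary)
json_output = json.dumps(rows)
```

The flattened rows can be returned as they are, serialized with `json.dumps`, or passed to `pd.DataFrame(rows)` to get one column per coordinate.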
The only difference from YOLOv5's pandas method is that we have no flexibility in choosing whether we want the coordinates in (x1, y1, x2, y2) format, (x1, y1, w, h) format, and so on.
Maybe, instead of implementing this pandas method in YOLOv8, I should modify the Results.summary() method so that we can pass to it the desired coords format. I created a PR for this, could you give me an initial opinion or review? (#14946)
@AlexPasqua thank you for pointing out the Results.summary() method and its capabilities. You're correct that it provides a similar functionality by returning a list of dictionaries for each detected object. Modifying the Results.summary() method to accept a parameter for the desired coordinate format is a practical approach. This would indeed align more closely with the flexibility offered by YOLOv5's pandas method.
I appreciate your initiative in creating a PR for this enhancement. I'll review your PR (#14946) and provide feedback shortly. This improvement will certainly enhance the usability of the Results class for various deployment scenarios. Thank you for your contribution!
👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
* **Docs**: https://docs.ultralytics.com
* **HUB**: https://hub.ultralytics.com
* **Community**: https://community.ultralytics.com
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO 🚀 and Vision AI ⭐
There's a PR open (#14946) to close this issue, but it's waiting for a review
Search before asking
Description
In YOLOv5, you could get the boxes' coordinates in dataframe format with a simple results.pandas().xyxy[0], and then get them in json by simply adding .to_json() at the end. Returning the coordinates in json format is usually needed in the super common use-case where the model is deployed and accessed through an API.
As discussed in #8235 (this comment specifically), this feature could benefit many users! 😄
Use case
Get the coordinates in json format with something like results.pandas().xyxy[0].to_json() instead of what's currently needed, i.e., manually extracting and formatting the boxes, scores, and classes.
Are you willing to submit a PR?