ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

How to get the coordinates of the bounding box in YOLO object detection? #388

Closed milind-soni closed 4 years ago

milind-soni commented 4 years ago

❔Question

I need to get the coordinates of the bounding boxes that object detection generates for an image. How do I achieve that?
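
For readers landing on this question: with a YOLOv5 PyTorch Hub model, results.xyxy[0] holds one row per detection in the order x1, y1, x2, y2, confidence, class. The sketch below assumes those rows have already been converted to plain lists (for example with .tolist()); the function name parse_detections and the sample values are hypothetical and only illustrate the row layout.

```python
def parse_detections(rows):
    """Convert rows of [x1, y1, x2, y2, confidence, class_id] into dicts."""
    detections = []
    for x1, y1, x2, y2, conf, cls in rows:
        detections.append({
            # Corner form: top-left and bottom-right pixel coordinates
            "xyxy": (x1, y1, x2, y2),
            # Centre form: (x_center, y_center, width, height)
            "xywh": ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1),
            "confidence": conf,
            # The class ID arrives as a float, so cast it to int
            "class_id": int(cls),
        })
    return detections


# Hypothetical values standing in for results.xyxy[0].tolist()
rows = [[10.0, 20.0, 110.0, 220.0, 0.9, 0.0]]
for det in parse_detections(rows):
    print(det["class_id"], det["xyxy"], det["xywh"])
```

Keeping both the corner and the centre form avoids recomputing widths and heights later, for example when filtering out very small boxes.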

HankSteel77 commented 10 months ago

Hi @glenn-jocher, I wanted to ask whether it is possible to extract a detected object as a NumPy array for further processing, for example OCR. I saw that results.crop(save=False) returns 'Im', which seems to be an image array containing the detected object, limited by the bounding box.

glenn-jocher commented 10 months ago

@HankSteel77 yes, it is indeed possible to extract the detected object as a NumPy array for further processing such as OCR. You can achieve this with the results.crop() method: it returns a list with one dictionary per detection, and the 'im' entry of each dictionary is an image array containing the detected object limited by the bounding box. This array can then be used for downstream tasks such as OCR. If you have any more questions or need additional assistance, feel free to ask!

HankSteel77 commented 10 months ago

@glenn-jocher Thanks for the fast answer! I wanted to ask whether there is built-in functionality that returns the data as an array, like results.ims[0], or something that extends results.crop. I searched the common.py file but had trouble understanding how to obtain just the 'Im' array. In a nutshell, I want to know whether there is a built-in function that returns 'Im' in memory, without saving and parsing the data. I would be grateful if you could tell me how to do it.

glenn-jocher commented 10 months ago

@HankSteel77 🎉 Yes, YOLOv5 has straightforward built-in functionality for obtaining the "Im" array without saving and parsing the data: call results.crop(save=False). It returns a list with one dictionary per detection, and the 'im' entry of each dictionary holds the cropped image array, which is not written to disk. Note that results.render() is a different method: it returns the full images with the boxes drawn on them, not the crops. If you have any more questions or need further assistance, feel free to ask!
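
Whichever method supplies the box, the crop itself is just a slice of the image by the box's integer coordinates; with a NumPy image this is image[y1:y2, x1:x2]. The sketch below shows the same idea on a nested list so that it runs without extra packages. The helper name crop_box and the sample data are hypothetical, and the rounding and clamping policy is an assumption, not necessarily what YOLOv5 does internally.

```python
def crop_box(image, box):
    """Return the region of `image` covered by box = (x1, y1, x2, y2).

    `image` is indexed as image[row][column], i.e. image[y][x].
    """
    height = len(image)
    width = len(image[0]) if height else 0
    # Round to whole pixels and clamp to the image bounds
    x1 = max(0, min(width, round(box[0])))
    y1 = max(0, min(height, round(box[1])))
    x2 = max(0, min(width, round(box[2])))
    y2 = max(0, min(height, round(box[3])))
    # Rows are selected by y, columns by x
    return [list(row[x1:x2]) for row in image[y1:y2]]


# A 4x5 "image" whose pixel values encode their position (hypothetical data)
image = [[10 * y + x for x in range(5)] for y in range(4)]
crop = crop_box(image, (1.2, 0.8, 3.9, 3.0))
print(crop)
```

Clamping matters because predicted boxes can extend slightly past the image border, and negative indices would otherwise wrap around silently.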

Manaschintawar commented 7 months ago

I want to extract the class ID from the bounding box. This is my code:

results = model.predict(frame)
a = results[0].boxes.xyxy
px = pd.DataFrame(a).astype("float")

for index, row in px.iterrows():
    print(px)

When I print the values, I don't get the class ID in the last column of the extracted data. What should I do to get the class ID? Please help 😣!!

glenn-jocher commented 7 months ago

@Manaschintawar hello! To extract the class ID along with the bounding box details from a YOLOv5 model, call the model directly and use results.pandas().xyxy[0], which provides a Pandas DataFrame that includes the class IDs. Here's how you can modify your code:

results = model(frame)
df = results.pandas().xyxy[0]  # DataFrame with columns xmin, ymin, xmax, ymax, confidence, class, name

for index, row in df.iterrows():
    print(row['class'])  # This prints the class ID

This will give you the class ID for each detected object in your frame. If you have any more questions, feel free to ask!

Manaschintawar commented 6 months ago

What is the syntax for YOLOv8?

glenn-jocher commented 6 months ago

@Manaschintawar hey there! YOLOv8 handles results differently from YOLOv5: model.predict() returns a list of Results objects, and these do not provide the results.pandas() method from YOLOv5. Class IDs, confidences and box coordinates are available as tensors on results[0].boxes. Here's a quick example:

results = model.predict(frame)
boxes = results[0].boxes  # Detections for the first image

for xyxy, conf, cls in zip(boxes.xyxy.tolist(), boxes.conf.tolist(), boxes.cls.tolist()):
    print(int(cls), conf, xyxy)  # Class ID, confidence, [x1, y1, x2, y2]

For other changes or additional features in YOLOv8, I recommend checking out the official documentation or the updates section of the GitHub repo. Let me know if there's anything else you need! 🌟
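
As a follow-up sketch for YOLOv8: results[0].boxes exposes the tensors xyxy, conf and cls, and results[0].names maps class IDs to class names. Assuming those tensors have been converted to plain lists with .tolist(), the hypothetical helper below combines them into one record per detection and drops low-confidence boxes. The name collect_detections, the min_conf default and the sample values are assumptions made for illustration.

```python
def collect_detections(xyxy, conf, cls, names, min_conf=0.25):
    """Merge parallel lists of boxes, scores and class IDs into records."""
    records = []
    for box, score, class_id in zip(xyxy, conf, cls):
        if score < min_conf:
            continue  # Skip detections below the confidence threshold
        class_id = int(class_id)  # Class IDs arrive as floats
        x1, y1, x2, y2 = (round(v) for v in box)
        records.append({
            "box": [x1, y1, x2, y2],
            "score": score,
            "class_id": class_id,
            "name": names[class_id],
        })
    return records


# Hypothetical values standing in for boxes.xyxy.tolist(), boxes.conf.tolist(),
# boxes.cls.tolist() and results[0].names
names = {0: "person", 1: "bicycle"}
xyxy = [[12.4, 30.6, 200.2, 310.9], [5.0, 5.0, 50.0, 60.0]]
conf = [0.91, 0.12]
cls = [0.0, 1.0]

for record in collect_detections(xyxy, conf, cls, names):
    print(record)
```

Working on plain lists keeps the helper independent of the model object, so it can be unit-tested without loading any weights.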

Muhnajh commented 5 months ago

@milind-soni detection results are available here:

https://github.com/ultralytics/yolov5/blob/ea34f848a6afbe1fc0010745fdc5f356ed871909/detect.py#L92-L102

Hi sir, could you share code that shows the coordinates together with the bounding box after running evaluation on an image? Which part of the code that Roboflow generates for YOLOv5 (running on Google Colab) do I need to change?

tejasri19 commented 2 months ago

results = model("path to image")
boxes = []
scores = []
for box in results[0].boxes:
    cords = box.xyxy[0].tolist()
    x1, y1, x2, y2 = [round(x) for x in cords]
    score = box.conf[0].item()  # Confidence score of this detection
    cls = results[0].names[int(box.cls[0].item())]  # Class ID is stored as a float; cast it to look up the name
    boxes.append([x1, y1, x2, y2, score, cls])
    scores.append(score)

print("Boxes:", boxes)
print("Scores:", scores)

This is working.

glenn-jocher commented 2 months ago

Hello @tejasri19,

Great to hear that you have a working solution! If you want to display the coordinates of the bounding boxes on the evaluation image, you can modify your code to include drawing the bounding boxes on the image. Here's an example of how you can achieve this using OpenCV:

import cv2
import matplotlib.pyplot as plt

# Load the image
image_path = "path to image"
image = cv2.imread(image_path)

# Perform inference
results = model(image_path)

# Extract bounding box coordinates and class names
boxes = []
scores = []
for box in results[0].boxes:
    cords = box.xyxy[0].tolist()
    x1, y1, x2, y2 = [round(x) for x in cords]
    score = box.conf[0].item()  # Confidence score of this detection
    cls = results[0].names[int(box.cls[0].item())]  # Class ID is stored as a float; cast it to look up the name
    boxes.append([x1, y1, x2, y2, score, cls])
    scores.append(score)

    # Draw the bounding box on the image
    cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.putText(image, f'{cls} {score:.2f}', (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)

# Display the image with bounding boxes
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

print("Boxes:", boxes)
print("Scores:", scores)

This code will draw the bounding boxes and class labels on the image and display it using matplotlib. If you encounter any issues or have further questions, feel free to ask! 😊
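
One detail worth handling in the drawing code: cv2.putText takes the bottom-left corner of the text as its origin, so a label placed at (x1, y1 - 10) can fall outside the image when the box touches the top edge. A minimal sketch of one way to handle this is below; the helper name label_origin and its default text_height and margin values are hypothetical (in practice the text height could be measured with cv2.getTextSize).

```python
def label_origin(x1, y1, text_height=20, margin=10):
    """Return the (x, y) bottom-left corner for a label near a box's top edge.

    The label goes above the box when there is room, otherwise just inside it.
    """
    y_above = y1 - margin
    if y_above - text_height >= 0:
        return x1, y_above  # Enough space above the box
    return x1, y1 + text_height + margin  # Fall back to inside the box


print(label_origin(50, 100))  # Box well below the top edge
print(label_origin(50, 5))    # Box touching the top edge
```

The returned tuple can be passed directly as the org argument of cv2.putText in place of (x1, y1 - 10).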