ultralytics / ultralytics

NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Track mode always runs detection first: can I detect once and use only the tracker to get boxes for subsequent frames? #14133

Closed Rane2021 closed 2 months ago

Rane2021 commented 2 months ago

Question

The track mode always runs detection on each frame before tracking. Can I run detection once and then use only the tracker to get the boxes for subsequent frames?

glenn-jocher commented 2 months ago

@Rane2021 hello,

Thank you for your question! In the current implementation of Ultralytics YOLO, the track mode indeed performs detection on each frame before tracking. This ensures that the tracker has updated information about object locations and classes in every frame.

However, if you want to detect objects in the first frame and then only track these objects in subsequent frames without performing detection again, you would need to modify the workflow. This approach can be useful for scenarios where detection is computationally expensive, and the scene does not change significantly between frames.

Here's a basic example of how you might achieve this using the YOLO model and OpenCV:

import cv2
from ultralytics import YOLO

# Load the YOLOv8 model
model = YOLO("yolov8n.pt")

# Open the video file
video_path = "path/to/video.mp4"
cap = cv2.VideoCapture(video_path)

# Detect objects in the first frame
ret, first_frame = cap.read()
if ret:
    initial_results = model(first_frame)
    initial_boxes = initial_results[0].boxes.xywh.cpu()  # Get initial bounding boxes

# Loop through the video frames
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Use the initial detection results to track objects in subsequent frames
    results = model.track(frame, persist=True, initial_boxes=initial_boxes)

    # Visualize the results on the frame
    annotated_frame = results[0].plot()

    # Display the annotated frame
    cv2.imshow("YOLOv8 Tracking", annotated_frame)

    # Break the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()

In this example:

  1. We perform detection on the first frame to get the initial bounding boxes.
  2. For subsequent frames, we use the track method with the persist=True argument to track the objects based on the initial detection.

Please note that this is a simplified example, and you may need to adjust it based on your specific use case and requirements.

For more detailed information on tracking and configuration options, you can refer to our tracking documentation.

I hope this helps! If you have any further questions, feel free to ask. 😊

Rane2021 commented 2 months ago

Very nice, Thanks!

Rane2021 commented 2 months ago

SyntaxError: 'initial_boxes' is not a valid YOLO argument.

glenn-jocher commented 2 months ago

Hello @Rane2021,

Thank you for pointing that out! It looks like I made an error in my previous response. The initial_boxes argument is not valid for the track method in the current implementation of Ultralytics YOLO.

To achieve the desired functionality of detecting objects in the first frame and then tracking them in subsequent frames, you would need to handle the tracking logic yourself. Here's an updated example using OpenCV's KCF tracker. Note that in OpenCV 4.5.1 and later, `MultiTracker` and the KCF tracker live in the `cv2.legacy` namespace and require the `opencv-contrib-python` package:

import cv2
from ultralytics import YOLO

# Load the YOLOv8 model
model = YOLO("yolov8n.pt")

# Open the video file
video_path = "path/to/video.mp4"
cap = cv2.VideoCapture(video_path)

# Detect objects in the first frame
ret, first_frame = cap.read()
if not ret:
    raise SystemExit("Could not read the first frame")

initial_results = model(first_frame)
initial_boxes = initial_results[0].boxes.xyxy.cpu().numpy()  # Initial boxes as (x1, y1, x2, y2)

# Initialize OpenCV trackers (cv2.legacy requires opencv-contrib-python on OpenCV >= 4.5.1)
trackers = cv2.legacy.MultiTracker_create()
for box in initial_boxes:
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    tracker = cv2.legacy.TrackerKCF_create()
    # OpenCV expects an integer (x, y, w, h) rectangle
    trackers.add(tracker, first_frame, (int(x1), int(y1), int(w), int(h)))

# Loop through the video frames
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Update the trackers
    success, boxes = trackers.update(frame)

    # Draw the tracked boxes
    for new_box in boxes:
        x, y, w, h = [int(v) for v in new_box]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # Display the annotated frame
    cv2.imshow("YOLOv8 Tracking", frame)

    # Break the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()

In this example:

  1. We perform detection on the first frame to get the initial bounding boxes.
  2. We initialize OpenCV trackers for each detected object.
  3. For subsequent frames, we update the trackers and draw the tracked boxes.

This approach leverages OpenCV's tracking algorithms to maintain object tracking across frames after the initial detection.
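One detail worth isolating from the example above is the box-format conversion: YOLO returns `xyxy` corner coordinates, while OpenCV trackers expect `(x, y, w, h)` rectangles. As a standalone sketch:

```python
import numpy as np


def xyxy_to_xywh(boxes):
    """Convert [x1, y1, x2, y2] corner boxes to OpenCV-style (x, y, w, h) tuples."""
    boxes = np.asarray(boxes, dtype=float)
    return [(int(x1), int(y1), int(x2 - x1), int(y2 - y1)) for x1, y1, x2, y2 in boxes]


print(xyxy_to_xywh([[10.0, 20.0, 110.0, 70.0]]))  # [(10, 20, 100, 50)]
```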

If you encounter any issues or have further questions, please provide a minimum reproducible example so we can better understand and address your problem; guidance on creating one is available in our documentation.

Additionally, please ensure you are using the latest version of the Ultralytics YOLO package to benefit from the latest features and bug fixes.

I hope this helps! If you have any further questions, feel free to ask. 😊