levan92 / deep_sort_realtime

A really more real-time adaptation of deep sort
MIT License

Cuda version #52

Closed michenriq closed 2 months ago

michenriq commented 2 months ago

I have a pre-built version of opencv-contrib that runs on the GPU, and it is already working. However, I prefer using deep_sort_realtime instead of DeepSort itself.

But I noticed that deep_sort_realtime depends on opencv-python rather than my own OpenCV build, which makes my script run on the CPU instead of the GPU.

Below is a simple script that already works, but only on my CPU.

Is there a way to build opencv-python for the GPU, like the official package?

import cv2
import numpy as np
from deep_sort_realtime.deepsort_tracker import DeepSort

# Load YOLO
net = cv2.dnn.readNet("yolov4.weights", "yolov4.cfg")
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers().flatten()]
classes = []
with open("coco.names", "r") as f:
    classes = [line.strip() for line in f.readlines()]

# Initialize Deep SORT
deepsort = DeepSort(max_age=30)

cap = cv2.VideoCapture('./p3.mp4')  # Use 0 for webcam or replace with video file path

while True:
    ret, frame = cap.read()
    if not ret:
        break

    height, width, channels = frame.shape

    # Detecting objects with YOLO
    blob = cv2.dnn.blobFromImage(frame, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
    net.setInput(blob)
    outs = net.forward(output_layers)

    class_ids = []
    confidences = []
    boxes = []

    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.5 and class_id == 0:  # Check if the detected class is 'person'
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
    if len(indexes) > 0 and isinstance(indexes, np.ndarray):
        detections = []
        for i in indexes.flatten():
            x, y, w, h = boxes[i]
            conf = confidences[i]
            detection_class = class_ids[i]
            detections.append(([x, y, w, h], conf, detection_class))

        # Update tracker with filtered boxes
        tracks = deepsort.update_tracks(detections, frame=frame)

        for track in tracks:
            if not track.is_confirmed():
                continue
            bbox = track.to_tlbr()
            track_id = track.track_id
            cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])), (0, 255, 0), 2)
            cv2.putText(frame, f'ID: {track_id}', (int(bbox[0]), int(bbox[1]) - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    cv2.imshow("Camera", frame)

    if cv2.waitKey(1) == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()

levan92 commented 2 months ago

I think you can install this library without enforcing the dependencies in requirements.txt; there are several ways to do that. That way you can keep your own OpenCV build. At the end of the day, as long as deep_sort_realtime can import cv2, it shouldn't be an issue. Let me know if there's still an issue I missed.
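
One possible way to do this is sketched below. It is not taken from the project's documentation: it assumes the PyPI distribution is named `deep-sort-realtime` and relies on pip's `--no-deps` flag, which installs a package without pulling in its declared dependencies. The helper names (`no_deps_install_command`, `installed_opencv_wheels`) are hypothetical. The second helper lists any pip-installed OpenCV wheels, since a CPU-only wheel that is installed alongside a source build can end up being the `cv2` that gets imported.

```python
import sys
from importlib import metadata

# Assumption: the PyPI distribution is named "deep-sort-realtime".
PACKAGE = "deep-sort-realtime"

# pip wheels that all provide the `cv2` module and could shadow a source build.
OPENCV_WHEELS = (
    "opencv-python",
    "opencv-contrib-python",
    "opencv-python-headless",
    "opencv-contrib-python-headless",
)


def no_deps_install_command(package=PACKAGE):
    """Build a pip command that installs `package` but skips its dependencies."""
    return [sys.executable, "-m", "pip", "install", "--no-deps", package]


def installed_opencv_wheels():
    """Return {name: version} for every pip-installed OpenCV wheel that is found."""
    found = {}
    for name in OPENCV_WHEELS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this wheel is not installed
    return found


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is installed by accident.
    print("Run:", " ".join(no_deps_install_command()))
    print("pip-installed OpenCV wheels:", installed_opencv_wheels() or "none")
```

With `--no-deps`, the remaining requirements have to be installed by hand; check requirements.txt for the exact list.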

michenriq commented 2 months ago

This library uses opencv-python instead. Does that imply anything?

levan92 commented 2 months ago

No, it's just a convenient package to install via pip instead of building opencv from source.
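
To confirm which OpenCV build the tracker will actually import, a quick check along these lines may help. This is a minimal sketch, not part of deep_sort_realtime: the helper names are hypothetical, and it assumes the summary returned by `cv2.getBuildInformation()` contains an `NVIDIA CUDA:` line, as source builds with CUDA typically report.

```python
import re


def cuda_enabled_in_build(build_info):
    """Return True if the OpenCV build summary reports 'NVIDIA CUDA: YES'."""
    match = re.search(r"NVIDIA CUDA:\s*(\w+)", build_info)
    return bool(match) and match.group(1).upper() == "YES"


def describe_cv2():
    """Report which cv2 is importable and whether it was built with CUDA."""
    try:
        import cv2
    except ImportError:
        return None  # no OpenCV available in this environment

    # Guarded lookup: only query the device count if the cuda submodule is present.
    cuda_module = getattr(cv2, "cuda", None)
    device_count = (
        cuda_module.getCudaEnabledDeviceCount() if cuda_module is not None else 0
    )
    return {
        "file": cv2.__file__,  # shows whether a pip wheel or the source build is loaded
        "version": cv2.__version__,
        "cuda_in_build": cuda_enabled_in_build(cv2.getBuildInformation()),
        "cuda_devices": device_count,
    }


if __name__ == "__main__":
    print(describe_cv2())
```

If `cuda_in_build` is false or `cuda_devices` is 0, the `DNN_BACKEND_CUDA` and `DNN_TARGET_CUDA` settings in the script above would not be expected to use the GPU, and the `file` path indicates which installation is being picked up.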