
Ultralytics YOLO11 🚀
https://docs.ultralytics.com
GNU Affero General Public License v3.0

How can I optimize my streaming? #11784

Closed REZIZ-TER closed 2 months ago

REZIZ-TER commented 4 months ago


Question

Right now I'm having trouble with the stream from an RTSP IP camera when running object detection: the displayed frames lag 20-30 seconds behind reality. What are some ways to make the stream processing more efficient?

import cv2
from ultralytics import YOLO
import supervision as sv
import numpy as np
import argparse
from sendtofirebase import firedatabase as fdb

rtsp_url = "rtsp://admin:kasidate01@192.168.xx.xx:554/Streaming/Channels/101"
model_path = "D:\\Private\\Y3Project\\python_project\\Weights\\w2024-04-27\\best.pt"

ZONE_POLYGON = np.array([
    [0.1, 0.1],
    [0.9, 0.1],
    [0.9, 0.9],
    [0.1, 0.9],
    [0.1, 0.1]
])

def parse_arguments() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="YOLOv8 live")
    parser.add_argument(
        "--webcam-resolution",
        default=[1280, 720],
        nargs=2,
        type=int
    )
    args = parser.parse_args()
    return args

def main():
    args = parse_arguments()
    cap = cv2.VideoCapture(rtsp_url)
    model = YOLO(model_path)
    frame_width = 1920
    frame_height = 1080
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, frame_width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, frame_height)

    box_annotator = sv.RoundBoxAnnotator(thickness=2)
    label_annotator = sv.LabelAnnotator(
        text_position=sv.Position.TOP_CENTER,
        text_thickness=2,
        text_scale=1
    )

    zone_polygon = (ZONE_POLYGON * np.array(args.webcam_resolution)).astype(int)

    zone = sv.PolygonZone(polygon=zone_polygon, frame_resolution_wh=tuple(args.webcam_resolution))
    zone_annotator = sv.PolygonZoneAnnotator(
        zone=zone,
        color=sv.Color.RED,
        text_color=sv.Color.WHITE,
        thickness=1,
        text_thickness=1,
        text_scale=1
    )

    while True:
        ret, frame = cap.read()
        if not ret:
            print("Cannot read frame from camera")
            break
        results = list(model.track(source=frame, stream=True, persist=True))

        if results:
            for result in results:
                try:
                    frame = result.orig_img
                    detections = sv.Detections.from_ultralytics(result)

                    if result.boxes.id is not None:
                        detections.tracker_id = result.boxes.id.cpu().numpy().astype(int)

                    labels = [
                        f"#{tracker_id} {model.model.names[class_id]} {confidence:0.2f}"
                        for class_id, confidence, tracker_id
                        in zip(detections.class_id, detections.confidence, detections.tracker_id)
                    ]

                    frame = box_annotator.annotate(
                        scene=frame.copy(),
                        detections=detections
                        )
                    frame = label_annotator.annotate(
                        scene=frame.copy(),
                        detections=detections,
                        labels=labels
                        )

                except TypeError as e:
                    print(f"Error: {e}")
                    continue

            mask = zone.trigger(detections=detections)
            frame = zone_annotator.annotate(scene=frame)
            count_no_helmet = np.count_nonzero((detections.class_id == 0) & (detections.confidence > 0.5) & mask)
            print(f"count_no_helmet : {count_no_helmet}")
            fdb.child("/").set({"count_no_helmet": count_no_helmet})

        frame_resize = cv2.resize(frame, (1280, 720))
        cv2.imshow("yolov8", frame_resize)

        if cv2.waitKey(30) == 27:
            break

    cap.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    main()

By comparison, code that just receives and displays the images from RTSP, without detection, has almost no delay at all, perhaps only 1-3 seconds.

import cv2
rtsp_url = "rtsp://admin:kasidate01@192.168.xx.xx:554/Streaming/Channels/101"
cap = cv2.VideoCapture(rtsp_url)
while True:
    ret, frame = cap.read()
    if not ret:
        print("Cannot read frame from camera")
        break

    cv2.imshow("RTSP Feed", frame)

    if cv2.waitKey(15) == 27:
        break

cap.release()
cv2.destroyAllWindows()


REZIZ-TER commented 4 months ago

https://github.com/ultralytics/ultralytics/assets/67143657/e5141d08-87f7-415e-ad88-4b2c98564cea

glenn-jocher commented 4 months ago

@REZIZ-TER hello!

It looks like your message might be missing a proper link or context for your query. Could you please repost the correct link or provide additional information about your issue?

Thank you! Looking forward to helping you!

REZIZ-TER commented 4 months ago

Hello @glenn-jocher, thanks for your reply. The goal of my project is to receive images from an IP camera via an RTSP URL and run YOLOv8 object detection on them. The problem I encountered is that the frames displayed by cv2.imshow("yolov8", frame_resize) lag 20-30 seconds or more behind reality. For example, I held up my finger at 10:35:20, but the image of me holding up my finger did not appear until 10:35:50. When I watch the same stream (rtsp://admin:kasidate01@192.168.xx.xx:554/Streaming/Channels/101) in VLC media player, the lag is only 3-5 seconds, which is an acceptable time for me.

shaimaahamam commented 4 months ago

I have the same problem and I don't know why!

glenn-jocher commented 4 months ago

Hello,

It sounds like you're experiencing a streaming delay when using YOLOv8 with RTSP feeds. This delay is typically related to the buffering settings in your video capture pipeline. You can try reducing the buffer size in OpenCV to decrease the latency. Here's an example:

cap = cv2.VideoCapture(rtsp_url)
cap.set(cv2.CAP_PROP_BUFFERSIZE, 2)  # Set a small buffer size

Ensure your system and network are optimized for real-time streaming as well. If the issue persists, please share more details about your setup and any specific configurations you're using. That way, we can help diagnose the problem more effectively! 😊
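In addition to the smaller buffer size, it can help to drain frames that have already piled up before each inference call, so detection always runs on a recent frame rather than the oldest buffered one. The helper below is only an illustrative sketch; the name `drain_stale` and the `read_fn` callable are conventions for this example, not an OpenCV or Ultralytics API:

```python
def drain_stale(read_fn, n_skip=5):
    """Read and discard up to n_skip buffered frames, returning the next one.

    read_fn is any callable returning (ok, frame), e.g. cap.read for a
    cv2.VideoCapture. Skipping frames that accumulated while inference was
    running keeps the displayed frame closer to real time.
    """
    ok, frame = False, None
    for _ in range(n_skip + 1):
        ok, frame = read_fn()
        if not ok:
            break
    return ok, frame

# In the main loop (sketch, assuming cap and model exist):
# ok, frame = drain_stale(cap.read, n_skip=5)
# if not ok:
#     break
# results = model.track(source=frame, persist=True)
```

The trade-off is that skipped frames are never run through the model, which is usually acceptable when the goal is a live, low-latency display.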

WojciechowskiMarek commented 4 months ago

I was dealing with the same problem when using VideoCapture from cv2. My YOLOv8 detection sometimes takes a long time and almost freezes; in the meantime the buffer keeps growing, which makes the feed unusable. What matters for me is to catch the last frame (with the buffer cleared) and send it to detection (registration plate detection). I used the queue-based solution presented here: https://stackoverflow.com/questions/54460797/how-to-disable-buffer-in-opencv-camera. You can read just the latest frame from the stream 😁
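The queue-based approach from that Stack Overflow thread can be sketched roughly as follows. A background thread reads continuously and keeps only the newest frame in a single-slot queue, so the consumer never sees stale buffered frames. The class name `LatestFrameReader` and its interface are illustrative choices for this sketch, not an OpenCV or Ultralytics API:

```python
import queue
import threading


class LatestFrameReader:
    """Read frames on a background thread, keeping only the newest one.

    read_fn is any callable returning (ok, frame), e.g. cap.read for a
    cv2.VideoCapture. Old frames are dropped, so read() always returns
    the most recent frame instead of a stale, buffered one.
    """

    def __init__(self, read_fn):
        self.read_fn = read_fn
        self.q = queue.Queue(maxsize=1)  # single-slot buffer
        self.stopped = False
        threading.Thread(target=self._reader, daemon=True).start()

    def _reader(self):
        while not self.stopped:
            ok, frame = self.read_fn()
            if not ok:
                break
            if self.q.full():  # drop the stale frame before storing the new one
                try:
                    self.q.get_nowait()
                except queue.Empty:
                    pass
            self.q.put(frame)

    def read(self, timeout=5):
        """Block until a frame is available, then return the newest one."""
        return self.q.get(timeout=timeout)

    def stop(self):
        self.stopped = True


# Usage with the RTSP stream (untested sketch):
# cap = cv2.VideoCapture(rtsp_url)
# reader = LatestFrameReader(cap.read)
# while True:
#     frame = reader.read()  # always the newest frame
#     results = model.track(source=frame, persist=True)
```

Because inference can now take as long as it needs, the display stays close to real time at the cost of silently dropping the frames produced while the model was busy.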

glenn-jocher commented 4 months ago

Hello @WojciechowskiMarek,

Thank you for sharing your experience and the solution you found! Using a queue to handle the frames and ensure you're processing the latest one is a great approach to mitigate the buffering issue. This can indeed help in maintaining real-time performance for tasks like registration plate detection. 😊

If anyone else is facing similar issues, implementing a queue to manage the frames can be a very effective solution. Thanks again for contributing to the discussion!

github-actions[bot] commented 3 months ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐