ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Real-time video stream analysis from the monitor screen #11523

Closed: krolaper closed this issue 1 year ago

krolaper commented 1 year ago

Search before asking

Question

There is a script implementation example for capturing and displaying the monitor screen.

from vidgear.gears import ScreenGear
import cv2

stream = ScreenGear().start()
while True:
    frame = stream.read()
    if frame is None:
        break

    # process the frame here if needed

    cv2.imshow('test', frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord('q'):
        break
cv2.destroyAllWindows()
stream.stop()

Can you tell me how to pass the frames from this loop to detect.py as the source, analogous to a webcam?

Additional

No response

glenn-jocher commented 1 year ago

@krolaper hello,

Thank you for reaching out. To use your screen-capture script as a source for detect.py, you can modify it to write the frames to a temporary video file and then pass that file as the --source argument to detect.py.

Here is an example implementation:

from vidgear.gears import ScreenGear
import cv2
import tempfile
import os

stream = ScreenGear().start()

# create a temporary .mp4 path; the file itself is written with cv2.VideoWriter below
temp = tempfile.NamedTemporaryFile(suffix='.mp4', delete=False)
output_file = temp.name
temp.close()

writer = None
while True:
    frame = stream.read()
    if frame is None:
        break

    # process the frame here if needed

    cv2.imshow('test', frame)

    # create the video writer once the frame size is known (assumes ~30 FPS capture)
    if writer is None:
        h, w = frame.shape[:2]
        writer = cv2.VideoWriter(output_file, cv2.VideoWriter_fourcc(*'mp4v'), 30, (w, h))
    writer.write(frame)

    key = cv2.waitKey(1) & 0xFF
    if key == ord('q'):
        break
cv2.destroyAllWindows()
stream.stop()
if writer is not None:
    writer.release()

os.system(f"python detect.py --weights yolov5s.pt --source '{output_file}'")

detect.py will then read the recorded video file as its source.

Let me know if you have any further questions or issues.

Best regards.

krolaper commented 1 year ago

@glenn-jocher I understand that, but I asked a specific question about processing frames in real time, not writing them to a file and analyzing the file afterwards. I understand it should be handled like a webcam. But when analyzing a webcam we pass the camera id as the source, and that case is already handled in the YOLO scripts. How can the same be applied to the captured screen in real time?

glenn-jocher commented 1 year ago

@krolaper hello,

Thank you for explaining your question in more detail. You can use the ScreenGear class from the vidgear library along with OpenCV to capture frames from the screen in real time and use them as input for detection. Here is an implementation that you can use:

from vidgear.gears import ScreenGear
import cv2

# open the screen-capture stream (a monitor index or capture region can be passed as options)
stream = ScreenGear(logging=True).start()

while True:
    # read frames from stream
    frame = stream.read()

    # if stream.read() returns None, then the stream has stopped
    if frame is None:
        break

    # You can also modify the frame here if needed

    # display the resulting frame
    cv2.imshow('frame', frame)

    # Press 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cv2.destroyAllWindows()

# stop stream
stream.stop()

You can modify this implementation to include detect.py to process the frames. Let me know if you have any further questions or need any help.

Thanks!

krolaper commented 1 year ago

@glenn-jocher if you specify a screen source, the dataloader follows the LoadScreenshots path and processes individual frames without treating them as a video stream. Don't focus on the code inside the while True loop; it is just an example of how to implement screen capture, without any detection. What I'm interested in is how it can be wired into the analysis in detect.py, like a YouTube stream or a webcam. Is that even possible without such an assignment? And is it possible to update the frames without a while loop, so that almost the entire detect.py script does not have to be restarted every time?
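
For context, the LoadScreenshots loader is essentially a grab-in-a-loop wrapper around the mss package, so conceptually it does the same screen capture as the examples above. A rough sketch of the idea (not the actual YOLOv5 code):

import numpy as np
from mss import mss

def screen_frames(monitor_index=1):
    # yield BGR frames grabbed from the given monitor, roughly what LoadScreenshots does
    with mss() as sct:
        monitor = sct.monitors[monitor_index]
        while True:
            grab = sct.grab(monitor)            # raw BGRA screenshot
            yield np.array(grab)[:, :, :3]      # drop the alpha channel -> BGR frame

Each yielded frame can then be handed to the detector just like a webcam frame.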

glenn-jocher commented 1 year ago

Hello @krolaper,

Yes, you are correct. A screen source is handled by the LoadScreenshots dataloader, which treats each grab as an individual screenshot rather than a video stream. To capture frames from the screen in real time you can still use ScreenGear together with OpenCV, similar to using a webcam.

Regarding your question on how to implement this with detect.py, you can pass the frames as the source argument in detect.py similar to what is done for video or webcam streams.
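
Concretely, the --source string determines which dataloader detect.py uses, so the webcam and the screen cases are invoked the same way. As a sketch (the 'screen' source and the region syntax below assume a YOLOv5 version that includes the LoadScreenshots loader; check your checkout):

import os

# webcam: a bare integer is treated as a camera id (LoadStreams)
os.system("python detect.py --weights yolov5s.pt --source 0")

# monitor capture: 'screen' selects the LoadScreenshots loader instead
os.system("python detect.py --weights yolov5s.pt --source screen")

# optionally grab a sub-region: screen number, then left top width height
os.system('python detect.py --weights yolov5s.pt --source "screen 0 100 100 512 256"')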

Here is an example implementation that uses ScreenGear and OpenCV to capture frames from the screen in real time and process them using detect.py:

from vidgear.gears import ScreenGear
import cv2
import subprocess
import os

# open the screen-capture stream
stream = ScreenGear(logging=True).start()

while True:
    # read frames from stream
    frame = stream.read()

    # if stream.read() returns None, then the stream has stopped
    if frame is None:
        break

    # You can modify the frame here if needed

    # run detection on this frame: save it to disk and invoke detect.py on the saved image
    cv2.imwrite('screen_frame.jpg', frame)
    with open(os.devnull, "w") as f:
        subprocess.call(["python", "detect.py", "--weights", "yolov5s.pt",
                         "--source", "screen_frame.jpg", "--img-size", "640"],
                        stdout=f, stderr=f)

    # display the resulting frame
    cv2.imshow('frame', frame)

    # Press 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cv2.destroyAllWindows()

# stop stream
stream.stop()

This implementation continuously captures frames from the screen, executes detect.py on each frame, and displays the resulting frames. You can modify the code further to fit your use case.
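
Note that spawning a new detect.py process for every frame adds a lot of per-frame overhead. If latency matters, one alternative worth sketching (this is only a sketch, not part of detect.py; it assumes torch and vidgear are installed and downloads the model from Torch Hub on first use) is to load the model once in the same process and call it on each captured frame:

import cv2
import torch
from vidgear.gears import ScreenGear

# load a pretrained YOLOv5 model once; the Torch Hub wrapper handles resizing and NMS
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

stream = ScreenGear().start()
while True:
    frame = stream.read()                    # BGR frame from the screen grabber
    if frame is None:
        break

    results = model(frame[..., ::-1])        # the hub model expects RGB input
    annotated = results.render()[0]          # RGB image with boxes drawn on it

    cv2.imshow('yolov5 screen', cv2.cvtColor(annotated, cv2.COLOR_RGB2BGR))
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cv2.destroyAllWindows()
stream.stop()

Because the model stays loaded, each frame only costs a forward pass instead of a full script start-up.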

Let me know if you have any further questions or if I can assist with anything else.

Best, Glenn.

krolaper commented 1 year ago

@glenn-jocher I just figured out how the screenshot source works in YOLO. It actually works the same way I described at the start, just implemented differently. Why didn't you tell me right away? :)) Otherwise I've just ended up duplicating what I wanted. I set view_img = check_imshow(warn=True) to see the result. The window does appear, but it is impossible to end the analysis process by closing the cv2 window; it simply does not respond to pressing the exit key. How can I end the process not through the terminal, but by pressing a key or simply by closing the window with the mouse? My code from the view_img condition is below.

            if view_img:
                if platform.system() == 'Linux' and p not in windows:
                    windows.append(p)
                    cv2.namedWindow(str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO)  # allow window resize (Linux)
                    cv2.resizeWindow(str(p), 800, 600)
                cv2.imshow(str(p), im0)
                key = cv2.waitKey(1)  # 1 millisecond (60)
                if key == 32: # Space
                    cv2.waitKey(0) # delay
                elif key == 27: # ESC
                    break
                if cv2.getWindowProperty(str(p), cv2.WND_PROP_VISIBLE) < 1: # Closed
                    break

glenn-jocher commented 1 year ago

Hello @krolaper,

I'm glad that you were able to understand how screenshot works in YOLO.

Regarding your question on how to end the view_img process, you can press the ESC key to exit the window or close the window with the mouse. If the ESC key is not working, you can try running the code in the terminal and use Ctrl + C to exit the process.

I hope this helps. Let me know if you have any further questions or if I can assist with anything else.

Best, Glenn

krolaper commented 1 year ago

@glenn-jocher as I wrote above, it does not respond to pressing Escape, and when the window is closed with the mouse there is no break out of the outer loop, so it only exits the inner loop and reopens the window with a new frame. I solved this by adding, in the window-close condition alongside the break, cv2.destroyAllWindows() and windows = [], and after the view_img condition I added if len(windows) == 0: break. This is just a note for you, since the problem is present in YOLO.
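
Reconstructed against the snippet quoted above, the described fix would look roughly like this (a reconstruction from the description, not the exact patch; note that on platforms other than Linux the original code never appends to windows, so the len(windows) check may need adapting):

            if view_img:
                if platform.system() == 'Linux' and p not in windows:
                    windows.append(p)
                    cv2.namedWindow(str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO)  # allow window resize (Linux)
                    cv2.resizeWindow(str(p), 800, 600)
                cv2.imshow(str(p), im0)
                key = cv2.waitKey(1)  # 1 millisecond
                if key == 32:  # Space pauses until the next key press
                    cv2.waitKey(0)
                elif key == 27:  # ESC leaves the inner loop
                    break
                if cv2.getWindowProperty(str(p), cv2.WND_PROP_VISIBLE) < 1:  # window closed with the mouse
                    cv2.destroyAllWindows()
                    windows = []
                    break
            if len(windows) == 0:  # nothing left to display: leave the outer loop as well
                break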

glenn-jocher commented 1 year ago

@krolaper hello,

Thank you for sharing this information, and for providing a solution to this issue where the YOLOv5 script is not able to exit when the ESC key is pressed or the window is closed. I will take note of this and share it with the development team to see if it can be addressed in future updates.

Best, Glenn Jocher

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the YOLOv5 documentation at https://docs.ultralytics.com.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐