epfl-cs358 / 2024sp-robopong


[CV] Paddle tracking #7

Closed Amene-Gafsi closed 5 months ago

Amene-Gafsi commented 5 months ago

Develop and implement an efficient real-time paddle tracking algorithm.

AndrewYatzkan commented 5 months ago

FYI, I would look into using ArUco markers like I did in #8. There may be more efficient options, but it's worth considering.
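
For reference, a minimal detection sketch along those lines (this assumes OpenCV 4.7+ with the cv2.aruco module available, which may require the opencv-contrib-python build; the dictionary choice is arbitrary and the snippet only illustrates the idea, not what was done in #8):

import cv2

# grab one frame from the default webcam and convert it to grayscale
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()

if ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # detect any 4x4 ArUco markers in the frame
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())
    corners, ids, _ = detector.detectMarkers(gray)

    if ids is not None:
        # each corners entry has shape (1, 4, 2); the mean of the 4 corners is the marker center
        center = corners[0][0].mean(axis=0)
        print("marker", ids[0][0], "center at", center)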

Amene-Gafsi commented 5 months ago

In order to track the real-time coordinates of a paddle, I employed a color-based tracking system using the OpenCV library and Python. The algorithm identifies and tracks the center of the largest red-colored rectangle in each frame, storing the coordinates in a data structure.

Algorithm Overview:

  1. Video Capture: The algorithm starts by capturing video frames from a webcam or video file. If no video path is provided, it defaults to the webcam.
  2. Frame Processing: Each frame is resized to a consistent width to standardize the input and then smoothed with a Gaussian blur. The blur reduces high-frequency noise, so detection focuses on the large structural object of interest (the paddle rectangle) rather than on minor imperfections in the image.
  3. Color Segmentation: I apply a color filter to isolate the red hues representing the paddle. This is done by converting the frame to the HSV color space and creating a binary mask where only the pixels falling within the predefined red color range are white, and all others are black.
  4. Morphological Operations: To further clean the image, the mask undergoes erosion and dilation. Erosion removes small white noise and separates objects connected by thin lines, while dilation restores object size and improves the object's visibility.
  5. Contour Detection: The algorithm then finds contours in the mask. A contour is a curve joining all continuous points along the boundary of a white object in the mask. The largest contour is assumed to be the paddle.
  6. Tracking the Paddle: The bounding rectangle of the largest contour is computed, giving its x and y coordinates, width, and height, from which the center of the rectangle is derived.
  7. Data Storage: The coordinates of the paddle's center are stored in a deque, a list-like data structure with fast, constant-time appends and pops at both ends. This makes it well suited for real-time tracking, where efficient insertion and retrieval of recent positions is paramount (a short usage sketch follows this list).
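
As a minimal illustration of step 7 (the buffer size and coordinates below are made up for the example): new centers are pushed with appendleft, so index 0 is always the most recent position, and once the deque reaches its maxlen the oldest points are discarded automatically.

from collections import deque

# bounded buffer of recent paddle centers; oldest entries fall off the right end
pts = deque(maxlen=64)

# simulate three tracked frames: the newest center always lands at index 0
for frame_idx in range(3):
    pts.appendleft((100 + frame_idx, 200))

print(pts[0])     # (102, 200) -> center from the most recent frame
print(list(pts))  # [(102, 200), (101, 200), (100, 200)]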

Adaptability to Real-World Conditions: To adapt this algorithm to tracking the paddle in the real pong game, the main adjustments that should be required are the expected paddle size and the color range (a sketch of such a calibration follows below).
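
For example, that calibration could boil down to a few constants, as in the sketch below. Note that red wraps around the HSV hue axis, so a robust red mask may need two hue bands combined; the exact thresholds and the minimum-width value here are illustrative assumptions, not measured values.

import cv2

# illustrative calibration constants; real values depend on lighting and the actual paddle
RED_LOWER_1 = (0, 120, 70)       # lower red band
RED_UPPER_1 = (10, 255, 255)
RED_LOWER_2 = (170, 120, 70)     # upper red band (hue wraps around at 180)
RED_UPPER_2 = (180, 255, 255)
MIN_PADDLE_WIDTH = 30            # reject contours narrower than the expected paddle, in pixels

def paddle_mask(hsv_frame):
    # combine both red bands into a single binary mask
    mask1 = cv2.inRange(hsv_frame, RED_LOWER_1, RED_UPPER_1)
    mask2 = cv2.inRange(hsv_frame, RED_LOWER_2, RED_UPPER_2)
    return cv2.bitwise_or(mask1, mask2)

def looks_like_paddle(contour):
    # filter out contours that are too narrow to be the paddle
    _, _, w, _ = cv2.boundingRect(contour)
    return w >= MIN_PADDLE_WIDTH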

Code:

from collections import deque
from imutils.video import VideoStream
import numpy as np
import argparse
import cv2
import imutils
import time

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", help="path to the (optional) video file")
ap.add_argument("-b", "--buffer", type=int, default=64, help="max buffer size")
args = vars(ap.parse_args())

# define the lower and upper boundaries of the paddle's red color in the HSV color space
redLower = (0, 120, 70)
redUpper = (10, 255, 255)

pts = deque(maxlen=args["buffer"])

# if a video path was not supplied, grab the reference to the webcam
if not args.get("video", False):
    vs = VideoStream(src=0).start()
else:
    vs = cv2.VideoCapture(args["video"])

# allow the camera or video file to warm up
time.sleep(2.0)

# keep looping
while True:
    # grab the current frame
    frame = vs.read()
    frame = frame[1] if args.get("video", False) else frame
    if frame is None:
        break

    # resize the frame, blur it, and convert it to the HSV color space
    frame = imutils.resize(frame, width=600)
    blurred = cv2.GaussianBlur(frame, (11, 11), 0)
    hsv = cv2.cvtColor(blurred, cv2.COLOR_BGR2HSV)

    # construct a mask for the red paddle color, then perform a series of erosions and dilations to remove small blobs
    mask = cv2.inRange(hsv, redLower, redUpper)
    mask = cv2.erode(mask, None, iterations=2)
    mask = cv2.dilate(mask, None, iterations=2)

    # find contours in the mask and initialize the center of the rectangle
    cnts = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    center = None

    if len(cnts) > 0:
        c = max(cnts, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(c)
        center = (int(x + w / 2), int(y + h / 2))

        # draw the bounding rectangle and center point on the frame
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.circle(frame, center, 5, (0, 0, 255), -1)

    # update the points queue
    pts.appendleft(center)

    # show the frame to our screen
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    # if the 'q' key is pressed, stop the loop
    if key == ord("q"):
        break

# clean up the camera and close any open windows
if not args.get("video", False):
    vs.stop()
else:
    vs.release()
cv2.destroyAllWindows()
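
Assuming the script is saved as, say, paddle_tracking.py (the filename is just for illustration), it can be run against the webcam or a recorded clip using the flags defined above:

python paddle_tracking.py                        # track with the default webcam
python paddle_tracking.py --video pong_clip.mp4  # track in a recorded video
python paddle_tracking.py --buffer 128           # keep a longer history of centers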