Object Detection with OpenCV and SSD MobileNet v3

Description

Screenshot 2024-06-19 121308 Screenshot 2024-06-19 121106

This Python script leverages OpenCV and a pre-trained SSD MobileNet v3 deep learning model for real-time object detection in video streams. It utilizes the COCO (Common Objects in Context) dataset for object class recognition.

Requirements

Python 3.x (Download)
OpenCV library: Install via pip install opencv-python
NumPy library: Install via pip install numpy (usually included with OpenCV)
coco.names file: Contains class labels for the COCO dataset (download from a reliable source)
ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt file: Model configuration file (download from a reliable source)
frozen_inference_graph.pb file: Model weights file (download from a reliable source)

Instructions

Download necessary files:
- Obtain the coco.names, ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt, and frozen_inference_graph.pb files from a trusted source (ensure compatibility with your OpenCV version). Place them in the same directory as your Python script.
Run the script:
- Open a terminal or command prompt, navigate to the directory containing your script and files.
- Execute the script using:
```
python object_detection.py
```

Code Breakdown

Imports:
- cv2: Imports the OpenCV library for computer vision tasks.
Threshold and Video Capture:
- thres: Sets the confidence threshold for object detection (adjust as needed).
- cap: Initializes a video capture object (VideoCapture(1)) to access your webcam (or a different video source by providing its index or path).
- cap.set(): Sets video capture properties:
  - 3: Width (adjust for desired resolution)
  - 4: Height (adjust for desired resolution)
  - 10: Brightness (adjust for lighting conditions)
Load Class Names:
- classNames: Creates an empty list to store object class names.
- classFile: Path to the coco.names file.
- Loads class names from the file and splits them into a list.
Load Model Configuration and Weights:
- configPath: Path to the ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt file.
- weightsPath: Path to the frozen_inference_graph.pb file.
- net: Creates a detection model object using cv2.dnn_DetectionModel().
- Sets model input size, scale, mean, and color channel swapping parameters for compatibility with the model.
Main Loop:
- while True: Continuously captures frames from the video stream.
- success, img: Reads a frame and checks for success. Exits if unsuccessful.
- classIds, confs, bbox: Performs object detection using the model on the current frame.
  - classIds: List of detected object class IDs.
  - confs: List of corresponding confidence scores (0-1).
  - bbox: List of bounding boxes (coordinates) for detected objects.
- Prints detected object IDs and bounding boxes for debugging (optional).
Draw Bounding Boxes and Labels:
- Checks if any objects were detected (len(classIds) != 0).
- Iterates through detected objects using zip:
  - box: Current bounding box coordinates.
  - classId: Current object class ID (minus 1 for indexing).
  - confidence: Current object confidence score.
- Draws a green rectangle around the detected object using cv2.rectangle().
- Displays the corresponding class name (uppercase) and confidence score (rounded to two decimal places) using cv2.putText().
Display and Exit:
- cv2.imshow(): Displays the processed frame with bounding boxes and labels in a window titled "Output".
- cv2.waitKey(1): Waits for a key press for 1 millisecond.
- Exits the loop and releases resources if the 'q' key is pressed.
Cleanup:
- Releases the video capture object and closes all OpenCV windows.
```
cap.release()
cv2.destroyAllWindows()
```

License

This project is licensed under the MIT License. thannk