This Python script uses OpenCV and a pre-trained SSD MobileNet v3 deep learning model to detect objects in video streams in real time. It labels detections with the object classes of the COCO (Common Objects in Context) dataset.
pip install opencv-python
pip install numpy (usually installed automatically with opencv-python)
coco.names file: Contains class labels for the COCO dataset (download from a reliable source)
ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt file: Model configuration file (download from a reliable source)
frozen_inference_graph.pb file: Model weights file (download from a reliable source)

Download necessary files:
Download the coco.names, ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt, and frozen_inference_graph.pb files from a trusted source (ensure compatibility with your OpenCV version). Place them in the same directory as your Python script.

Run the script:
python object_detection.py
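
Before running the script, it can help to confirm that the three downloaded files actually sit next to it. The following is a minimal sketch of such a check, not part of the original script; the helper name missing_files and its directory argument are assumptions made for illustration.

```python
from pathlib import Path

# File names as listed in this README.
REQUIRED_FILES = (
    "coco.names",
    "ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt",
    "frozen_inference_graph.pb",
)


def missing_files(directory="."):
    """Return the required file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

Calling missing_files() from the script's directory returns an empty list when everything is in place, and otherwise the names that still need to be downloaded.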
Imports:
cv2: Imports the OpenCV library for computer vision tasks.

Threshold and Video Capture:
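The numeric property IDs passed to cap.set() in this step correspond to named OpenCV constants: cv2.CAP_PROP_FRAME_WIDTH is 3, cv2.CAP_PROP_FRAME_HEIGHT is 4, and cv2.CAP_PROP_BRIGHTNESS is 10. The sketch below shows one way the settings might be applied with readable names. The helper configure_capture and the default values 1280, 720, and 70 are illustrative assumptions, not values taken from the script.

```python
# Property IDs with the same values as the corresponding cv2 constants.
CAP_PROP_FRAME_WIDTH = 3    # cv2.CAP_PROP_FRAME_WIDTH
CAP_PROP_FRAME_HEIGHT = 4   # cv2.CAP_PROP_FRAME_HEIGHT
CAP_PROP_BRIGHTNESS = 10    # cv2.CAP_PROP_BRIGHTNESS


def configure_capture(cap, width=1280, height=720, brightness=70):
    """Apply size and brightness settings to a cv2.VideoCapture-style object.

    Returns a dict mapping each property ID to the value returned by
    cap.set(), which for cv2.VideoCapture indicates whether the backend
    accepted the request.
    """
    settings = {
        CAP_PROP_FRAME_WIDTH: width,
        CAP_PROP_FRAME_HEIGHT: height,
        CAP_PROP_BRIGHTNESS: brightness,
    }
    return {prop: cap.set(prop, value) for prop, value in settings.items()}


# Usage with a real camera (requires opencv-python and a device at index 1):
#   import cv2
#   cap = cv2.VideoCapture(1)
#   configure_capture(cap)
```

Not every camera driver honors every property, so checking the returned flags is a cheap way to notice a setting that was silently ignored.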
thres: Sets the confidence threshold for object detection (adjust as needed).
cap: Initializes a video capture object (VideoCapture(1)) to access your webcam (or a different video source by providing its index or path). Index 1 usually selects a second camera; use 0 for the default webcam.
cap.set(): Sets video capture properties:
    3: Width (adjust for desired resolution)
    4: Height (adjust for desired resolution)
    10: Brightness (adjust for lighting conditions)

Load Class Names:
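A minimal sketch of this step, assuming coco.names holds one class label per line. The function name load_class_names is a hypothetical helper introduced here; the script itself may read the file differently.

```python
def load_class_names(class_file="coco.names"):
    """Read class labels from a text file, one label per line.

    Blank lines and surrounding whitespace are dropped so that a trailing
    newline does not produce an empty label.
    """
    with open(class_file, "rt", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

The resulting list plays the role of classNames: position 0 holds the first label in the file, position 1 the second, and so on.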
classNames: Creates an empty list to store object class names.
classFile: Path to the coco.names file.

Load Model Configuration and Weights:
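The sketch below shows how a detection model might be built from the two files with cv2.dnn_DetectionModel(). The preprocessing values (320x320 input, scale 1/127.5, mean 127.5, red/blue swap) are settings commonly used with this model and are an assumption here, since this README does not state them. The function name build_detector is also hypothetical.

```python
def build_detector(
    weights_path="frozen_inference_graph.pb",
    config_path="ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt",
):
    """Create and configure a cv2.dnn_DetectionModel from the model files."""
    import cv2  # imported here so the module loads even without OpenCV installed

    net = cv2.dnn_DetectionModel(weights_path, config_path)
    # Assumed preprocessing; adjust if your model expects different input.
    net.setInputSize(320, 320)
    net.setInputScale(1.0 / 127.5)
    net.setInputMean((127.5, 127.5, 127.5))
    net.setInputSwapRB(True)  # OpenCV frames are BGR; the model expects RGB
    return net
```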
configPath: Path to the ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt file.
weightsPath: Path to the frozen_inference_graph.pb file.
net: Creates a detection model object using cv2.dnn_DetectionModel().

Main Loop:
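The control flow of this step can be sketched independently of the camera and model. In the sketch below, cap is expected to behave like cv2.VideoCapture (read() returns a success flag and a frame) and net like cv2.dnn_DetectionModel (detect() accepts a confThreshold keyword). The function name run_detection_loop, the handle callback, the max_frames limit, and the default threshold of 0.45 are assumptions added for illustration.

```python
def run_detection_loop(cap, net, handle, thres=0.45, max_frames=None):
    """Read frames, run detection on each, and pass the results to `handle`.

    Stops when a frame cannot be read or, if max_frames is given, after
    that many frames. Returns the number of frames processed.
    """
    frames = 0
    while max_frames is None or frames < max_frames:
        success, img = cap.read()
        if not success:
            break  # camera disconnected or end of the video source
        class_ids, confs, bbox = net.detect(img, confThreshold=thres)
        handle(img, class_ids, confs, bbox)
        frames += 1
    return frames
```

Passing the drawing and display code in as handle keeps the loop itself easy to test with stand-in objects.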
while True: Continuously captures frames from the video stream.
success, img: Reads a frame and checks for success. Exits if unsuccessful.
classIds, confs, bbox: Performs object detection using the model on the current frame.
    classIds: List of detected object class IDs.
    confs: List of corresponding confidence scores (0-1).
    bbox: List of bounding boxes (coordinates) for detected objects.

Draw Bounding Boxes and Labels:
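The pairing of boxes with labels in this step can be sketched without any drawing calls. The helper label_detections below is hypothetical, and the label format (upper-case name plus a percentage) is an assumption; the script may format its text differently.

```python
def label_detections(class_ids, confs, boxes, class_names):
    """Pair each bounding box with a text label such as 'PERSON 91%'.

    Class IDs are 1-based, so 1 is subtracted to index into class_names.
    If class_ids and confs are NumPy arrays from OpenCV, flatten them first
    (for example class_ids.flatten()) so each element is a scalar.
    """
    labelled = []
    if len(class_ids) != 0:
        for class_id, confidence, box in zip(class_ids, confs, boxes):
            name = class_names[class_id - 1]
            text = f"{name.upper()} {confidence * 100:.0f}%"
            labelled.append((tuple(box), text))
    return labelled
```

Each returned pair can then be passed to cv2.rectangle() for the box and cv2.putText() for the label.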
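Checks whether any objects were detected (len(classIds) != 0).
Iterates over the detections using zip:
    box: Current bounding box coordinates.
    classId: Current object class ID (minus 1 for indexing into classNames, since class IDs start at 1).
    confidence: Current object confidence score.
Draws the bounding box using cv2.rectangle().
Adds the label text using cv2.putText().

Display and Exit: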
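This README does not say which key, if any, ends the loop. A common pattern is to compare the value returned by cv2.waitKey(1) against a chosen key; the sketch below assumes "q" as the quit key, and the helper should_quit is hypothetical.

```python
def should_quit(key_code, quit_key="q"):
    """Return True if the code returned by cv2.waitKey() matches quit_key.

    cv2.waitKey() returns -1 when no key was pressed. Masking with 0xFF
    keeps only the low byte, which holds the character code on platforms
    that set additional high bits.
    """
    return key_code != -1 and (key_code & 0xFF) == ord(quit_key)


# Usage inside the loop (requires opencv-python):
#   if should_quit(cv2.waitKey(1)):
#       break
```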
cv2.imshow(): Displays the processed frame with bounding boxes and labels in a window titled "Output".
cv2.waitKey(1): Waits for a key press for 1 millisecond.

Cleanup:
cap.release(): Releases the video capture device.
cv2.destroyAllWindows(): Closes all OpenCV windows.

This project is licensed under the MIT License. Thanks!