doguilmak / Object-Detection-with-Faster-R-CNN

Apply object detection with Faster R-CNN to classify predetermined objects by name and/or to use the likelihood that the prediction is correct.

Object Detection with Faster R-CNN

https://www.forecr.io/blogs/ai-algorithms/how-to-run-tensorflow-object-detection-in-real-time-with-raspberry-v2-csi-camera-on-nvidia%C2%AE-jetson%E2%84%A2-nano%E2%84%A2

Picture Source: forecr


Description

Context

Faster R-CNN is a method for object detection that uses region proposals. In this lab, you will use Faster R-CNN pre-trained on the COCO dataset. You will learn how to detect several objects by name and how to use the likelihood that the prediction is correct.
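
A minimal sketch of that setup, assuming the torchvision detection API (newer torchvision releases prefer a weights argument over pretrained=True, but the idea is the same):

```python
import torchvision

# Faster R-CNN with a ResNet-50-FPN backbone, pre-trained on the COCO dataset
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()  # inference mode: the model will return boxes, labels and scores
```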


Object Detection Models

Types of Object Detection

Sliding window techniques are slow. Fortunately, there are two major types of object detection that speed up the process: region-based object detection breaks the image into regions and runs a prediction on each region, while single-stage object detection makes its predictions over the entire image in a single pass.


In the following lab, you will use Faster R-CNN for prediction. You will also train an SSD model; even though SSD is considerably faster than other methods, it still takes a long time to train. Therefore, most of the model is trained for you, and you train it only for the last few iterations, as sketched below.
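
As a rough illustration of what those final iterations look like, here is a hedged sketch assuming the SSD300/VGG16 model shipped with recent torchvision (>= 0.10) and a dummy image and target in place of the lab's real dataset:

```python
import torch
import torchvision

# SSD with a VGG16 backbone, pre-trained on COCO (availability depends on torchvision version)
model = torchvision.models.detection.ssd300_vgg16(pretrained=True)
model.train()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# One illustrative training step with a dummy image and target;
# the lab uses its real dataset and a mostly pre-trained checkpoint instead.
images = [torch.rand(3, 300, 300)]
targets = [{"boxes": torch.tensor([[50.0, 50.0, 150.0, 150.0]]),
            "labels": torch.tensor([1])}]

loss_dict = model(images, targets)  # detection models return a dict of losses in train mode
loss = sum(loss_dict.values())
optimizer.zero_grad()
loss.backward()
optimizer.step()
```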


Faster R-CNN

Faster R-CNN replaces the costly selective search used by earlier R-CNN models with a more efficient Region Proposal Network (RPN).

Faster R-CNN can be analyzed in two stages:

  • A Region Proposal Network (RPN) that scores anchor boxes over the shared feature map and proposes candidate object regions.
  • A detection head (the Fast R-CNN part) that classifies each proposed region and refines its bounding box coordinates.

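To make the two stages concrete, here is a minimal sketch using torchvision's implementation; model.rpn and model.roi_heads are torchvision attribute names, not code from this repository:

```python
import torchvision

# Faster R-CNN with a ResNet-50-FPN backbone, pre-trained on COCO
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# Stage 1: the Region Proposal Network that scores anchors and proposes candidate boxes
print(model.rpn)

# Stage 2: the RoI heads (Fast R-CNN detection head) that classify each proposal
# and regress the final box coordinates
print(model.roi_heads)
```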

Keywords

faster-rcnn, object-detection, ssd, torchvision

Statement

Apply object detection with Faster R-CNN to classify predetermined objects by name and/or to use the likelihood that the prediction is correct.


About Faster R-CNN and ResNet-50-FPN

Picture Source: Andra Petrovai

Faster R-CNN is a model that predicts both bounding boxes and class scores for potential objects in an image. The model used here has a ResNet-50-FPN backbone, is pre-trained on COCO, and comes from the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.


Training for Object Detection

Object detection is based on two principles: the learnable parameters that produce the predicted rectangle (box), and the size and position of that box (its coordinate information). While the model is being trained, the ground-truth and predicted values are compared using a sum of squared differences. The evaluation depends on the ground-truth rectangles provided for each image: the loss function measures the difference between the coordinates of the ground-truth box and those of the predicted rectangle.

Loss equation:

$$||box - \hat{box}||^2 = (y_{min} - \hat{y}_{min})^2 + (y_{max} - \hat{y}_{max})^2 + (x_{min} - \hat{x}_{min})^2 + (x_{max} - \hat{x}_{max})^2$$
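
A minimal numeric sketch of this loss, written as plain PyTorch rather than the repository's actual training code (the box values below are made up for illustration):

```python
import torch

def box_l2_loss(pred_box: torch.Tensor, gt_box: torch.Tensor) -> torch.Tensor:
    """Sum of squared differences between predicted and ground-truth
    (x_min, y_min, x_max, y_max) coordinates, as in the equation above."""
    return torch.sum((pred_box - gt_box) ** 2)

# Illustrative ground-truth box and a slightly shifted prediction
gt = torch.tensor([62.0, 213.0, 614.0, 351.0])
pred = torch.tensor([60.0, 215.0, 610.0, 350.0])
print(box_l2_loss(pred, gt))  # tensor(25.) = 2^2 + 2^2 + 4^2 + 1^2
```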


Image Without Object Detection (Original Image)

Image With Object Detection

Function output:

Label: car

Box coordinates: 433, 377, 723, 460

Probability: 0.9993621706962585


Label: airplane

Box coordinates: 62, 213, 614, 351

Probability: 0.9987056255340576
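
Output in this form can be produced from the prediction dictionary returned by torchvision's Faster R-CNN. The sketch below is an illustration of that idea, not code from this repository; COCO_NAMES is a hypothetical partial mapping from label index to class name (the full COCO list has 91 entries), and the image file name is a placeholder.

```python
import torch
import torchvision
from PIL import Image
from torchvision import transforms

# Hypothetical partial label-index-to-name mapping, just enough for this example
COCO_NAMES = {3: "car", 5: "airplane"}

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

img = transforms.ToTensor()(Image.open("image.jpg"))  # "image.jpg" is a placeholder path
with torch.no_grad():
    pred = model([img])[0]  # dict with 'boxes', 'labels' and 'scores'

for box, label, score in zip(pred["boxes"], pred["labels"], pred["scores"]):
    if score > 0.9:  # keep only confident detections
        x_min, y_min, x_max, y_max = (int(v) for v in box)
        print("Label:", COCO_NAMES.get(int(label), f"class {int(label)}"))
        print("Box coordinates:", x_min, y_min, x_max, y_max)
        print("Probability:", float(score))
```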


References

  • Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.


Contact Me

If you have something to say to me, please contact me:

  • Twitter: Doguilmak
  • Mail address: doguilmak@gmail.com