Object Detection with Faster R-CNN
Picture Source: forecr
Faster R-CNN is an object detection method that uses region proposals. In this lab, you will use Faster R-CNN pre-trained on the COCO dataset. You will learn how to detect several objects by name and how to use the likelihood that each object prediction is correct.
Types of Object Detection
Sliding window techniques are slow because a classifier must be run on every window position. Fortunately, there are two major types of object detection that speed up the process. Region-based object detection breaks the image into candidate regions and performs a prediction on each region, while single-stage object detection makes its predictions in one pass over the entire image.
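To get a feel for why sliding windows are slow, the sketch below counts how many window positions a classifier would have to evaluate at a single scale. The function name `count_windows` and the image, window, and stride sizes are illustrative assumptions, not values from this lab.

```python
def count_windows(image_h, image_w, win_h, win_w, stride):
    """Count the positions of a win_h x win_w window slid over an image.

    Assumes the window must fit entirely inside the image and moves by
    `stride` pixels in both directions (a single scale only).
    """
    if win_h > image_h or win_w > image_w:
        return 0
    rows = (image_h - win_h) // stride + 1
    cols = (image_w - win_w) // stride + 1
    return rows * cols


# Hypothetical 480x640 image, 64x64 window, stride of 8 pixels.
n_windows = count_windows(480, 640, 64, 64, 8)
print(n_windows)  # 53 rows * 73 columns = 3869 classifier evaluations
```

A real sliding-window detector would repeat this for several window sizes and aspect ratios, so the number of classifier evaluations grows further.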
In the following lab, you will use Faster R-CNN for prediction. You will also train an SSD model. Even though SSD is considerably faster than other methods, it still takes a long time to train. Therefore, we will train most of the model for you, and you will train it for the last few iterations.
Faster R-CNN replaces the costly selective search with a Region Proposal Network (RPN), which generates region proposals much more cheaply.
Faster R-CNN can be analyzed in two stages:
Apply object detection with Faster R-CNN to classify predetermined objects by name and/or by the likelihood that each prediction is correct.
Picture Source: Andra Petrovai
Faster R-CNN is a model that predicts both bounding boxes and class scores for potential objects in an image. The model used here is pre-trained on COCO and has a ResNet-50-FPN backbone, following the paper "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks".
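The description above matches the pre-trained detector shipped with torchvision, so a minimal sketch of loading and calling such a model might look like the following. This assumes torchvision (version 0.13 or later for the `weights` argument) is the library in use, which the text does not state; the helper names `load_detector` and `detect` are hypothetical.

```python
def load_detector():
    """Return a COCO pre-trained Faster R-CNN (ResNet-50-FPN) in eval mode.

    Assumption: torchvision >= 0.13. Older versions use pretrained=True
    instead of the weights argument.
    """
    # Imported here so that defining the helper does not require torchvision.
    from torchvision.models.detection import fasterrcnn_resnet50_fpn

    model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()  # inference mode: the model returns predictions, not losses
    return model


def detect(model, image):
    """Run the detector on one image.

    `image` is expected to be a float tensor of shape (3, H, W) with values
    in [0, 1]. Returns a dict with the keys 'boxes', 'labels' and 'scores'.
    """
    import torch

    with torch.no_grad():  # no gradients are needed for prediction
        predictions = model([image])  # the model takes a list of images
    return predictions[0]
```

Calling `detect(load_detector(), image)` downloads the weights on first use. Each row of `boxes` holds `x_min, y_min, x_max, y_max` in pixels, and `scores` holds the confidence of each detection.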
Object detection is based on two principles. The first is the set of learnable parameters that produce the predicted rectangle (box), and the second is the position and size of that box (its coordinate information). During training, the ground-truth and predicted values are compared using the squared difference. The loss depends on the coordinates of the ground-truth rectangles: it measures the difference between the ground-truth box and the predicted box.
Loss equation:
$$||box - \hat{box}||^2 = (y_{min} - \hat{y}_{min})^2 + (y_{max} - \hat{y}_{max})^2 + (x_{min} - \hat{x}_{min})^2 + (x_{max} - \hat{x}_{max})^2$$
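As a worked example of this squared box loss, the sketch below sums the squared coordinate differences between a ground-truth box and a predicted box. The function name `squared_box_loss` and both boxes are hypothetical values chosen for illustration.

```python
def squared_box_loss(box, box_hat):
    """Sum of squared differences between corresponding box coordinates.

    Both boxes are 4-tuples of coordinates given in the same order,
    for example (x_min, y_min, x_max, y_max).
    """
    if len(box) != 4 or len(box_hat) != 4:
        raise ValueError("each box must have exactly four coordinates")
    return sum((c - c_hat) ** 2 for c, c_hat in zip(box, box_hat))


# Hypothetical ground-truth and predicted boxes (pixels).
ground_truth = (430, 380, 725, 455)
predicted = (433, 377, 723, 460)

loss = squared_box_loss(ground_truth, predicted)
print(loss)  # (-3)^2 + 3^2 + 2^2 + (-5)^2 = 47
```

The loss is zero only when the predicted box matches the ground-truth box exactly, and it grows quadratically as any corner moves away.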
Output of the detection function:
Label: car
Box coordinates: 433, 377, 723, 460
Probability: 0.9993621706962585
Label: airplane
Box coordinates: 62, 213, 614, 351
Probability: 0.9987056255340576
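The probability attached to each detection can be used to keep only confident predictions. The sketch below applies a threshold to the two detections listed above; the dictionary layout, the function name `filter_detections`, and the threshold values are assumptions made for illustration.

```python
# The two detections from the output above, stored as plain dictionaries.
detections = [
    {"label": "car", "box": (433, 377, 723, 460), "score": 0.9993621706962585},
    {"label": "airplane", "box": (62, 213, 614, 351), "score": 0.9987056255340576},
]


def filter_detections(detections, threshold):
    """Keep only the detections whose score is at least `threshold`."""
    return [d for d in detections if d["score"] >= threshold]


# A permissive threshold keeps both objects.
print([d["label"] for d in filter_detections(detections, 0.8)])

# A very strict threshold keeps only the most confident detection.
print([d["label"] for d in filter_detections(detections, 0.999)])
```

Raising the threshold trades missed objects for fewer false detections, so a suitable value depends on the application.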