doguilmak / Object-Detection-with-Faster-R-CNN

Apply object detection with Faster R-CNN to classify predetermined objects by name and/or to use the likelihood that the prediction is correct.

Object Detection with Faster R-CNN

https://www.forecr.io/blogs/ai-algorithms/how-to-run-tensorflow-object-detection-in-real-time-with-raspberry-v2-csi-camera-on-nvidia%C2%AE-jetson%E2%84%A2-nano%E2%84%A2

Picture Source: forecr


Description

Context

Faster R-CNN is a method for object detection that uses region proposals. In this lab, you will use Faster R-CNN pre-trained on the COCO dataset. You will learn how to detect several objects by name and how to use the likelihood that the prediction is correct.
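
A minimal sketch of that setup, assuming the torchvision detection API (newer torchvision releases prefer a weights argument over pretrained=True, but the idea is the same):

```python
import torchvision

# Faster R-CNN with a ResNet-50-FPN backbone, pre-trained on the COCO dataset
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()  # inference mode: the model will return boxes, labels and scores
```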


Object Detection Models

Types of Object Detection

Sliding window techniques are slow. Fortunately, there are two major types of object detection that speed up the process: region-based object detection breaks the image into regions and runs a prediction on each region, while single-stage object detection makes its predictions over the entire image in a single pass.


In the following lab, you will use Faster R-CNN for prediction. You will also train an SSD model; even though SSD is considerably faster than other methods, it still takes a long time to train. Therefore, most of the model is trained for you, and you train it only for the last few iterations, as sketched below.
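
As a rough illustration of what those final iterations look like, here is a hedged sketch assuming the SSD300/VGG16 model shipped with recent torchvision (>= 0.10) and a dummy image and target in place of the lab's real dataset:

```python
import torch
import torchvision

# SSD with a VGG16 backbone, pre-trained on COCO (availability depends on torchvision version)
model = torchvision.models.detection.ssd300_vgg16(pretrained=True)
model.train()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# One illustrative training step with a dummy image and target;
# the lab uses its real dataset and a mostly pre-trained checkpoint instead.
images = [torch.rand(3, 300, 300)]
targets = [{"boxes": torch.tensor([[50.0, 50.0, 150.0, 150.0]]),
            "labels": torch.tensor([1])}]

loss_dict = model(images, targets)  # detection models return a dict of losses in train mode
loss = sum(loss_dict.values())
optimizer.zero_grad()
loss.backward()
optimizer.step()
```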


Faster R-CNN

Faster R-CNN replaces the costly selective search used by earlier R-CNN models with a more efficient Region Proposal Network (RPN).

Faster R-CNN can be analyzed in two stages:

  • A Region Proposal Network (RPN) that scores anchor boxes over the shared feature map and proposes candidate object regions.
  • A detection head (the Fast R-CNN part) that classifies each proposed region and refines its bounding box coordinates.

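To make the two stages concrete, here is a minimal sketch using torchvision's implementation; model.rpn and model.roi_heads are torchvision attribute names, not code from this repository:

```python
import torchvision

# Faster R-CNN with a ResNet-50-FPN backbone, pre-trained on COCO
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# Stage 1: the Region Proposal Network that scores anchors and proposes candidate boxes
print(model.rpn)

# Stage 2: the RoI heads (Fast R-CNN detection head) that classify each proposal
# and regress the final box coordinates
print(model.roi_heads)
```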

Keywords

faster-rcnn, object-detection, ssd, torchvision

Statement

Apply object detection with Faster R-CNN to classify predetermined objects by name and/or to use the likelihood that the prediction is correct.


About Faster R-CNN and ResNet-50-FPN

Picture Source: Andra Petrovai

Faster R-CNN is a model that predicts both bounding boxes and class scores for potential objects in an image. The model used here has a ResNet-50-FPN backbone, is pre-trained on COCO, and comes from the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.


Training for Object Detection

Object detection is based on two principles: the learnable parameters that produce the predicted rectangle (box), and the size and position of that box (its coordinate information). While the model is being trained, the ground-truth and predicted values are compared using a sum of squared differences. The evaluation depends on the ground-truth rectangles provided for each image: the loss function measures the difference between the coordinates of the ground-truth box and those of the predicted rectangle.

Loss equation:

$$||box - \hat{box}||^2 = (y_{min} - \hat{y}_{min})^2 + (y_{max} - \hat{y}_{max})^2 + (x_{min} - \hat{x}_{min})^2 + (x_{max} - \hat{x}_{max})^2$$
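
A minimal numeric sketch of this loss, written as plain PyTorch rather than the repository's actual training code (the box values below are made up for illustration):

```python
import torch

def box_l2_loss(pred_box: torch.Tensor, gt_box: torch.Tensor) -> torch.Tensor:
    """Sum of squared differences between predicted and ground-truth
    (x_min, y_min, x_max, y_max) coordinates, as in the equation above."""
    return torch.sum((pred_box - gt_box) ** 2)

# Illustrative ground-truth box and a slightly shifted prediction
gt = torch.tensor([62.0, 213.0, 614.0, 351.0])
pred = torch.tensor([60.0, 215.0, 610.0, 350.0])
print(box_l2_loss(pred, gt))  # tensor(25.) = 2^2 + 2^2 + 4^2 + 1^2
```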


Image Without Object Detection (Original Image)

Image With Object Detection

Function output:

Label: car

Box coordinates: 433, 377, 723, 460

Probability: 0.9993621706962585


Label: airplane

Box coordinates: 62, 213, 614, 351

Probability: 0.9987056255340576
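
Output in this form can be produced from the prediction dictionary returned by torchvision's Faster R-CNN. The sketch below is an illustration of that idea, not code from this repository; COCO_NAMES is a hypothetical partial mapping from label index to class name (the full COCO list has 91 entries), and the image file name is a placeholder.

```python
import torch
import torchvision
from PIL import Image
from torchvision import transforms

# Hypothetical partial label-index-to-name mapping, just enough for this example
COCO_NAMES = {3: "car", 5: "airplane"}

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

img = transforms.ToTensor()(Image.open("image.jpg"))  # "image.jpg" is a placeholder path
with torch.no_grad():
    pred = model([img])[0]  # dict with 'boxes', 'labels' and 'scores'

for box, label, score in zip(pred["boxes"], pred["labels"], pred["scores"]):
    if score > 0.9:  # keep only confident detections
        x_min, y_min, x_max, y_max = (int(v) for v in box)
        print("Label:", COCO_NAMES.get(int(label), f"class {int(label)}"))
        print("Box coordinates:", x_min, y_min, x_max, y_max)
        print("Probability:", float(score))
```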


References

  • Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.


Contact Me

If you have something to say to me, please contact me:

  • Twitter: Doguilmak
  • Mail address: doguilmak@gmail.com