This repository implements Faster R-CNN with training, inference, and mAP evaluation in PyTorch. The aim was to create a simple implementation based on the PyTorch Faster R-CNN codebase, stripping away the abstractions so that the implementation is easy to understand.
The implementation supports a batch size of 1 only and uses RoI pooling on a single-scale feature map. The repo is meant for training Faster R-CNN on the VOC dataset; specifically, I trained on VOC 2007.
<img alt="Faster R-CNN Explanation" src="https://github.com/explainingai-code/FasterRCNN-PyTorch/assets/144267687/4da49766-d216-4c5a-b619-44ab269e0a7b" width="300">
<img alt="Faster R-CNN Implementation" src="https://github.com/explainingai-code/FasterRCNN-PyTorch/assets/144267687/fc24c80f-4ddf-45e7-ad1d-9989bc978f10" width="300">
<img alt="Faster R-CNN Implementation" src="https://github.com/explainingai-code/FasterRCNN-PyTorch/assets/144267687/d6d9a889-abbb-42c3-92df-635ff4457bb4" width="300">
Ground Truth (left) | Prediction (right)
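For setting up the VOC 2007 dataset:
Put the VOC 2007 data used for training in a folder named VOC2007.
Put the VOC 2007 test data in a folder named VOC2007-test.
Place both folders inside the root folder of the repo, following the structure below:
FasterRCNN-Pytorch
    -> VOC2007
        -> JPEGImages
        -> Annotations
    -> VOC2007-test
        -> JPEGImages
        -> Annotations
    -> tools
        -> train.py
        -> infer.py
        -> train_torchvision_frcnn.py
        -> infer_torchvision_frcnn.py
    -> config
        -> voc.yaml
    -> model
        -> faster_rcnn.py
    -> dataset
        -> voc.py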
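Before training, it can help to confirm that the dataset folders sit where the layout above expects them. The following is a minimal sketch, not part of the repository: the function name `missing_dataset_dirs` is hypothetical, and it only checks the four dataset directories named in the tree.

```python
from pathlib import Path

# Dataset folders and sub-folders taken from the directory tree above.
SPLITS = ("VOC2007", "VOC2007-test")
EXPECTED_SUBDIRS = ("JPEGImages", "Annotations")


def missing_dataset_dirs(repo_root):
    """Return the expected dataset directories that are absent under repo_root.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    root = Path(repo_root)
    missing = []
    for split in SPLITS:
        for sub in EXPECTED_SUBDIRS:
            path = root / split / sub
            if not path.is_dir():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Run from the repository root; prints nothing when the layout is complete.
    for path in missing_dataset_dirs("."):
        print(f"missing: {path}")
```

Running it from the repository root lists any directory that still needs to be created or renamed.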
To train on your own dataset:
Start from the VOC config (config/voc.yaml), update the dataset_params, and change the task_name as well as ckpt_name based on your own dataset.
Start from the VOC dataset class (dataset/voc.py) and make the following changes:
im_info : {
    'filename' : <image path>
    'detections' :
        [
            {
                'label': <integer class label for this detection>, # assuming the same order as the classes list, with background as zero index.
                'bbox' : list of x1,y1,x2,y2 for the bbox.
            }
        ]
}
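As an illustration of producing this structure, here is a minimal sketch that converts one VOC-style XML annotation into an `im_info` dict. It is an assumption-laden example rather than the repository's loader: `parse_voc_annotation`, `CLASSES`, and `EXAMPLE_XML` are hypothetical names, and the class list is a placeholder to be replaced with your own.

```python
import xml.etree.ElementTree as ET

# Hypothetical class list for illustration; background sits at index 0.
CLASSES = ["background", "person", "car"]
LABEL_TO_IDX = {name: idx for idx, name in enumerate(CLASSES)}

# Tiny VOC-style annotation used only to demonstrate the parser.
EXAMPLE_XML = """
<annotation>
  <object>
    <name>car</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text, image_path):
    """Build one im_info dict from the text of a VOC-style XML annotation."""
    root = ET.fromstring(xml_text)
    detections = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        # Coordinates are read in x1, y1, x2, y2 order.
        bbox = [int(float(box.find(tag).text))
                for tag in ("xmin", "ymin", "xmax", "ymax")]
        detections.append({
            "label": LABEL_TO_IDX[obj.find("name").text],
            "bbox": bbox,
        })
    return {"filename": image_path, "detections": detections}


if __name__ == "__main__":
    print(parse_voc_annotation(EXAMPLE_XML, "VOC2007/JPEGImages/000001.jpg"))
```

A dataset in another annotation format only needs to end up with the same keys; the XML parsing is incidental.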
__getitem__ returns the following:
    im_tensor (C x H x W),
    target {
        'bboxes': Number of GTs x 4,
        'labels': Number of GTs,
    }
    file_path (just used for debugging)
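When writing a custom dataset class, a quick shape check on one sample can catch layout mistakes early. The sketch below is a hypothetical helper (`check_sample` is not part of the repository) and assumes only that the image and target values expose a `.shape` attribute, as `torch.Tensor` and `numpy.ndarray` do.

```python
def check_sample(im_tensor, target):
    """Validate one dataset sample against the layout described above.

    Returns the number of ground-truth boxes, or raises ValueError.
    Hypothetical helper; works with any array type exposing .shape.
    """
    if len(im_tensor.shape) != 3:
        raise ValueError(
            f"im_tensor should be C x H x W, got shape {tuple(im_tensor.shape)}")

    bboxes, labels = target["bboxes"], target["labels"]
    if len(bboxes.shape) != 2 or bboxes.shape[1] != 4:
        raise ValueError(
            f"'bboxes' should be N x 4, got shape {tuple(bboxes.shape)}")
    if len(labels.shape) != 1 or labels.shape[0] != bboxes.shape[0]:
        raise ValueError(
            f"'labels' should have shape (N,), got {tuple(labels.shape)} "
            f"for {bboxes.shape[0]} boxes")
    return bboxes.shape[0]
```

A typical use would be `im_tensor, target, file_path = dataset[0]` followed by `check_sample(im_tensor, target)`, where `dataset` is an instance of your own dataset class.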
This repo has some differences from the actual Faster R-CNN paper.
For modifications:
Change fc_inner_dim in config
Change backbone_out_channels in config
Set roi_low_bg_iou to, say, 0.1 (this will ignore proposals with < 0.1 IoU)
Set acc_steps in config to > 1
git clone https://github.com/explainingai-code/FasterRCNN-PyTorch.git
cd FasterRCNN-PyTorch
pip install -r requirements.txt
python -m tools.train for training Faster R-CNN on the VOC dataset
python -m tools.infer --evaluate False --infer_samples True for generating inference predictions
python -m tools.infer --evaluate True --infer_samples False for evaluating on the test dataset
python -m tools.train_torchvision_frcnn for training using the torchvision pretrained Faster R-CNN class on the VOC dataset
python -m tools.infer_torchvision_frcnn for inference and testing purposes. Pass the desired configuration file as the config argument.
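To make the evaluation step more concrete, here is a minimal sketch of per-class average precision (AP), the quantity that is averaged over classes to give mAP. It is not the repository's evaluation code: `box_iou` and `average_precision` are hypothetical names, the 0.5 IoU threshold is an assumed default, and AP is computed as the area under the interpolated precision-recall curve, whereas the original VOC 2007 protocol uses 11-point interpolation.

```python
def box_iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def average_precision(detections, gt_by_image, iou_threshold=0.5):
    """AP for a single class.

    detections: list of (image_id, score, box) tuples.
    gt_by_image: dict mapping image_id to a list of ground-truth boxes.
    """
    total_gt = sum(len(boxes) for boxes in gt_by_image.values())
    if total_gt == 0:
        return 0.0

    # Each ground-truth box may be matched by at most one detection.
    matched = {img: [False] * len(boxes) for img, boxes in gt_by_image.items()}
    tp_flags = []
    for image_id, _score, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(gt_by_image.get(image_id, [])):
            iou = box_iou(box, gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        is_tp = (best_idx >= 0 and best_iou >= iou_threshold
                 and not matched[image_id][best_idx])
        if is_tp:
            matched[image_id][best_idx] = True
        tp_flags.append(is_tp)

    # Precision and recall after each detection, in descending score order.
    precisions, recalls, tp = [], [], 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / rank)
        recalls.append(tp / total_gt)

    # Interpolate: each precision becomes the maximum precision at any later rank.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Integrate precision over recall.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

mAP would then be the mean of `average_precision` over all object classes, with detections and ground truths filtered to one class at a time.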
config/voc.yaml - Allows you to play with different components of Faster R-CNN on the VOC dataset.
Outputs will be saved according to the configuration present in the yaml files.
For every run, a folder named after the task_name key in the config will be created.
During training of Faster R-CNN, output is saved in the task_name directory.
During inference, output is saved as task_name/samples/*.png.
@article{DBLP:journals/corr/RenHG015,
author = {Shaoqing Ren and
Kaiming He and
Ross B. Girshick and
Jian Sun},
title = {Faster {R-CNN:} Towards Real-Time Object Detection with Region Proposal
Networks},
journal = {CoRR},
volume = {abs/1506.01497},
year = {2015},
url = {http://arxiv.org/abs/1506.01497},
eprinttype = {arXiv},
eprint = {1506.01497},
timestamp = {Mon, 13 Aug 2018 16:46:02 +0200},
biburl = {https://dblp.org/rec/journals/corr/RenHG015.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}