This is the PyTorch implementation of our paper:
Cascaded Zoom-in Detector for High Resolution Aerial Images
Akhil Meethal, Eric Granger, Marco Pedersoli
[arXiv] [CVPRw]
Accepted at: CVPRw 2023 (EarthVision Workshop organized by IEEE GRSS)
The method proposed in this paper can be easily integrated into the detector of your choice to improve its small-object detection performance. In this repo, we demonstrate it with the two-stage Faster RCNN detector and the one-stage, anchor-free FCOS detector.
# create conda env
conda create -n detectron2 python=3.6
# activate the environment
conda activate detectron2
# install PyTorch >=1.5 with GPU
conda install pytorch torchvision -c pytorch
Follow the INSTALL.md to install Detectron2.
Follow the instructions on the VisDrone page to download the dataset, then organize it as follows:
croptrain/
└── datasets/
    └── VisDrone/
        ├── train/
        ├── val/
        ├── annotations_VisDrone_train.json
        └── annotations_VisDrone_val.json
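Before training, it can help to confirm that the dataset is laid out as shown above. The following is a minimal sketch using only the standard library; the function name `missing_visdrone_entries` is hypothetical and not part of this repo, and the script assumes it is run from the directory that contains `croptrain/`.

```python
from pathlib import Path

# Entries taken from the directory tree above.
EXPECTED_ENTRIES = [
    "train",
    "val",
    "annotations_VisDrone_train.json",
    "annotations_VisDrone_val.json",
]


def missing_visdrone_entries(repo_root):
    """Return the expected dataset paths that do not exist under repo_root.

    Hypothetical helper for illustration; not part of this repository.
    """
    dataset_dir = Path(repo_root) / "datasets" / "VisDrone"
    return [
        str(dataset_dir / name)
        for name in EXPECTED_ENTRIES
        if not (dataset_dir / name).exists()
    ]


if __name__ == "__main__":
    # Assumes the current directory is the parent of croptrain/.
    for path in missing_visdrone_entries("croptrain"):
        print("missing:", path)
```

An empty result means all four entries from the tree are present; it does not validate the contents of the image folders or the JSON files.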
The original annotations provided with the VisDrone dataset are in PASCAL VOC format. I used this code to convert them to COCO-style annotations: VOC2COCO.
Update: I am sharing the JSON files I generated for the VisDrone dataset via Google Drive below.
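To illustrate what such a conversion involves, here is a minimal sketch that maps PASCAL VOC XML to a COCO-style dictionary. It is not the VOC2COCO script referenced above; the function name `voc_to_coco` is hypothetical, the class list is supplied by the caller, and the sketch copies box coordinates as-is without adjusting for VOC's 1-based pixel convention.

```python
import xml.etree.ElementTree as ET


def voc_to_coco(xml_strings, class_names):
    """Convert PASCAL VOC XML documents (as strings) into one COCO-style dict.

    Illustrative only: category ids are assigned 1..N in the order given.
    """
    cat_ids = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n} for n, i in cat_ids.items()],
    }
    ann_id = 1
    for image_id, xml_text in enumerate(xml_strings, start=1):
        root = ET.fromstring(xml_text)
        size = root.find("size")
        coco["images"].append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                float(box.findtext(tag))
                for tag in ("xmin", "ymin", "xmax", "ymax")
            )
            w, h = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": cat_ids[obj.findtext("name")],
                "bbox": [xmin, ymin, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco
```

The key difference between the two formats is the box encoding: VOC stores corner coordinates (`xmin`, `ymin`, `xmax`, `ymax`), while COCO stores the top-left corner plus width and height. The resulting dictionary can be written out with `json.dump`.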
a) annotations_VisDrone_train.json
b) annotations_VisDrone_val.json
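After downloading the JSON files, a quick summary can confirm they loaded correctly and show how many boxes are small. The sketch below assumes the files follow the standard COCO layout (`images`, `annotations`, `categories`, with `bbox` as `[x, y, width, height]`); the helper names are hypothetical, and the 32×32 pixel threshold is the COCO evaluation convention for "small" objects rather than a value taken from this repo.

```python
import json
from collections import Counter

SMALL_AREA = 32 * 32  # COCO's "small object" threshold, in square pixels


def summarize_coco(coco):
    """Count images, boxes per category, and the share of small boxes."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    per_class = Counter(names[a["category_id"]] for a in coco["annotations"])
    small = sum(
        1 for a in coco["annotations"]
        if a["bbox"][2] * a["bbox"][3] < SMALL_AREA
    )
    total = len(coco["annotations"])
    return {
        "num_images": len(coco["images"]),
        "num_boxes": total,
        "boxes_per_class": dict(per_class),
        "small_fraction": small / total if total else 0.0,
    }


def summarize_file(path):
    """Load a COCO-style JSON file from disk and summarize it."""
    with open(path) as f:
        return summarize_coco(json.load(f))
```

For example, `summarize_file("croptrain/datasets/VisDrone/annotations_VisDrone_train.json")` would summarize the training annotations, assuming the file sits at the location shown in the directory tree.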
Please follow the instructions on the DOTA page and organize the data the same way as above. You can also download the JSON files for the train and validation sets below:
a) annotations_DOTA_train.json
python train_net.py \
--num-gpus 1 \
--config-file configs/Base-RCNN-FPN.yaml \
OUTPUT_DIR outputs_FPN_VisDrone
python train_net.py \
--num-gpus 1 \
--config-file configs/Dota-Base-RCNN-FPN.yaml \
OUTPUT_DIR outputs_FPN_DOTA
python train_net.py \
--num-gpus 1 \
--config-file configs/RCNN-FPN-CROP.yaml \
OUTPUT_DIR outputs_FPN_CROP_VisDrone
python train_net.py \
--num-gpus 1 \
--config-file configs/Dota-RCNN-FPN-CROP.yaml \
OUTPUT_DIR outputs_FPN_CROP_DOTA
python train_net.py \
--resume \
--num-gpus 1 \
--config-file configs/Base-RCNN-FPN.yaml \
OUTPUT_DIR outputs_FPN_VisDrone
python train_net.py \
--eval-only \
--num-gpus 1 \
--config-file configs/Base-RCNN-FPN.yaml \
MODEL.WEIGHTS <your weight>.pth
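The commands above all end with bare `KEY VALUE` pairs such as `OUTPUT_DIR outputs_FPN_VisDrone` or `MODEL.WEIGHTS <your weight>.pth`. Detectron2's default argument parser collects such trailing tokens as config overrides, which are then merged into the config loaded from `--config-file`. The sketch below imitates that behaviour with the standard library only. It assumes `train_net.py` follows the usual Detectron2 pattern; `parse_command` is a hypothetical name, and this is a simplified imitation rather than Detectron2's actual code.

```python
import argparse


def parse_command(argv):
    """Split train_net.py-style arguments into flags and KEY VALUE overrides.

    Simplified imitation for illustration; not Detectron2's implementation.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--config-file", default="")
    parser.add_argument("--num-gpus", type=int, default=1)
    parser.add_argument("--resume", action="store_true")
    parser.add_argument("--eval-only", action="store_true")
    # Everything after the known flags is collected as raw override tokens.
    parser.add_argument("opts", nargs=argparse.REMAINDER, default=[])
    args = parser.parse_args(argv)
    if len(args.opts) % 2 != 0:
        raise ValueError("overrides must be given as KEY VALUE pairs")
    # Pair up tokens: even positions are keys, odd positions are values.
    overrides = dict(zip(args.opts[0::2], args.opts[1::2]))
    return args, overrides
```

Because overrides are applied after the YAML file is read, a value given on the command line takes precedence over the same key in the config file. This is why the same `Base-RCNN-FPN.yaml` can be reused for training, resuming, and evaluation by changing only the flags and the trailing pairs.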
If you use Cascaded Zoom-in Detector in your research or wish to refer to the results published in the paper, please use the following BibTeX entry.
@inproceedings{meethal2023czdetector,
title={Cascaded Zoom-in Detector for High Resolution Aerial Images},
author={Meethal, Akhil and Granger, Eric and Pedersoli, Marco},
booktitle={CVPRw},
year={2023},
}
Also, if you use Detectron2 in your research, please use the following BibTeX entry.
@misc{wu2019detectron2,
author = {Yuxin Wu and Alexander Kirillov and Francisco Massa and
Wan-Yen Lo and Ross Girshick},
title = {Detectron2},
howpublished = {\url{https://github.com/facebookresearch/detectron2}},
year = {2019}
}
This project is licensed under the MIT License, as found in the LICENSE file.