
A deep learning object detector framework written in Python for supporting Land Search and Rescue Missions.
Apache License 2.0

AIR-logo

AIR: Aerial Inspection RetinaNet for supporting Land Search and Rescue Missions

DOI Docker Pulls

AIR is a deep learning based object detection solution that automates the aerial drone footage inspection task frequently carried out during search and rescue (SAR) operations with drone units. It provides a fast, convenient and reliable way to augment the inspection of high-resolution aerial imagery for clues of human presence by highlighting relevant image regions with bounding boxes, as shown in the image below. With the assistance of AIR, SAR missions involving aerial drone searches can likely be carried out much faster than before, and with a considerably higher success rate.

This code repository is based on the master's thesis work of Pasi Pyrrö at Aalto University, School of Science, which was funded by Accenture.

AIR-example

Key features

Results on HERIDAL test set

The table below lists the results of the AIR detector alongside other notable state-of-the-art methods on the HERIDAL test set. Note that the FPS metric is a rough estimate based on the inference speed and hardware setup reported by the original authors, and should be taken as directional only; we did not benchmark any of the other methods ourselves.

| Method | Precision (%) | Recall (%) | AP (%) | FPS |
| --- | --- | --- | --- | --- |
| Mean shift segmentation method [1] | 18.7 | 74.7 | - | - |
| Saliency guided VGG16 [2] | 34.8 | 88.9 | - | - |
| Faster R-CNN [3] | 67.3 | 88.3 | 86.1 | 1 |
| Two-stage multimodel CNN [4] | 68.9 | 94.7 | - | 0.1 |
| SSD [4] | 4.3 | 94.4 | - | - |
| AIR with NMS (ours) | 90.1 | 86.1 | 84.6 | 1 |

It turns out AIR achieves state-of-the-art results in precision and inference speed, while having recall comparable to the strongest competitors!
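For readers less familiar with these detection metrics, the sketch below illustrates, under simplified assumptions (single class, greedy one-to-one matching of predictions to ground-truth boxes at a fixed IoU threshold), how precision and recall are typically computed for a detector; AP is then the area under the precision-recall curve obtained by sweeping the confidence threshold. This is a generic illustration with made-up helper names, not the exact HERIDAL, VOC2012, or SAR-APD protocol.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + 1e-9)

def precision_recall(pred_boxes, gt_boxes, iou_thres=0.5):
    """Greedily match predictions (sorted by descending confidence) to ground
    truth; unmatched predictions are false positives, unmatched ground-truth
    boxes are false negatives."""
    matched = set()
    tp = 0
    for p in pred_boxes:
        candidates = [(iou(p, g), j) for j, g in enumerate(gt_boxes)
                      if j not in matched and iou(p, g) >= iou_thres]
        if candidates:
            _, best_j = max(candidates)
            matched.add(best_j)
            tp += 1
    fp = len(pred_boxes) - tp
    fn = len(gt_boxes) - tp
    return tp / max(tp + fp, 1), tp / max(tp + fn, 1)  # precision, recall
```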

You can check out the full details of AIR evaluation in this Wandb report.

Bounding Box Aggregation options

NMS-vs-MOB

AIR implements both NMS and MOB algorithms for bounding box prediction postprocessing. The image above shows the main difference: MOB (c) merges bounding boxes (a) instead of eliminating them like NMS (b). Thus, choosing MOB can produce visually more pleasing predictions. Moreover, MOB is less sensitive to the choice of confidence score threshold, making it more robust on unseen data. AIR also comes with a custom SAR-APD evaluation scheme that truthfully ranks the performance of MOB-equipped object detectors (standard object detection metrics, such as VOC2012 AP, do not handle MOB's merged boxes well).
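To make the eliminate-vs-merge distinction concrete, here is a toy sketch (not the repository's implementation, and simplified compared to the actual MOB algorithm): NMS keeps the highest-scoring box and discards boxes that overlap it, whereas a MOB-style aggregator fuses clusters of overlapping boxes into a single enclosing box.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + 1e-9)

def nms(boxes, scores, iou_thres=0.5):
    """Classic NMS: eliminate boxes that overlap a higher-scoring box."""
    order = list(np.argsort(scores)[::-1])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < iou_thres]
    return [boxes[i] for i in keep], [scores[i] for i in keep]

def merge_overlapping_boxes(boxes, scores, iou_thres=0.5):
    """MOB-style idea (simplified): cluster transitively overlapping boxes and
    replace each cluster with its enclosing box and best score."""
    unassigned = list(range(len(boxes)))
    merged_boxes, merged_scores = [], []
    while unassigned:
        cluster = [unassigned.pop(0)]
        grew = True
        while grew:  # keep absorbing boxes that overlap anything in the cluster
            grew = False
            for j in unassigned[:]:
                if any(iou(boxes[j], boxes[k]) >= iou_thres for k in cluster):
                    cluster.append(j)
                    unassigned.remove(j)
                    grew = True
        xs1, ys1, xs2, ys2 = zip(*(boxes[k] for k in cluster))
        merged_boxes.append([min(xs1), min(ys1), max(xs2), max(ys2)])
        merged_scores.append(max(scores[k] for k in cluster))
    return merged_boxes, merged_scores
```

The MOB algorithm shipped in this repository uses its own merging criteria and score handling; the sketch above only illustrates the eliminate-vs-merge difference depicted in the image.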

Should you choose MOB postprocessing and SAR-APD evaluation (e.g., by running bash evaluate.sh), you should see evaluation results similar to the ones below.

MOB-SAR-APD-evaluation-results

All metrics increased by over 4 percentage points from the NMS & VOC2012 results, with no loss of visual prediction quality, neat!

Hardware Requirements

Software Requirements

If using containers:

If using native installation:

Quick install instructions using Docker

Quick install instructions using Singularity

Quick native install instructions

Download data and the trained model

Quick demos

Once everything is set up (installation and asset downloads), you might want to try out these simple demos to get the hang of using the AIR detector.
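The demos below are driven by the repository's own scripts and configs. Purely for orientation, here is a minimal sketch of the kind of keras-retinanet inference call a Keras RetinaNet detector like AIR builds on; the model path, image path, backbone name, and score threshold are illustrative assumptions, not values from this repo.

```python
# Illustrative sketch only; the supported way to run the demos is via the repo's scripts.
import numpy as np
from keras_retinanet import models
from keras_retinanet.utils.image import read_image_bgr, preprocess_image, resize_image

# Hypothetical paths; substitute the downloaded AIR model and a HERIDAL test image.
model = models.load_model("models/air_retinanet.h5", backbone_name="resnet50")

image = read_image_bgr("data/heridal/testImages/example.jpg")
image = preprocess_image(image)
image, scale = resize_image(image)

# RetinaNet inference: boxes are in resized-image coordinates until rescaled.
boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
boxes /= scale

for box, score in zip(boxes[0], scores[0]):
    if score >= 0.5:  # illustrative confidence threshold
        print("person candidate:", box.astype(int), "score:", round(float(score), 2))
```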

Running inference on HERIDAL test image folder

Evaluating AIR performance on HERIDAL test set (can be slow on CPU)

SAR-APD video detection with image output

SAR-APD video detection with video output

Wandb support

Project folder structure

General troubleshooting

Acknowledgements

How to cite

@MastersThesis{pyrro2021air, 
    title={{AIR:} {Aerial} Inspection RetinaNet for Land Search and Rescue Missions}, 
    author={Pyrr{\"o}, Pasi and Naseri, Hassan and Jung, Alexander},
    school={Aalto University, School of Science},
    year={2021}
}

Licenses

References

[1] Turić, H., Dujmić, H., and Papić, V. Two-stage segmentation of aerial images for search and rescue. Information Technology and Control 39, 2 (2010).

[2] Božić-Štulić, D., Marušić, Ž., and Gotovac, S. Deep learning approach in aerial imagery for supporting land search and rescue missions. International Journal of Computer Vision 127, 9 (2019), 1256–1278.

[3] Marušić, Ž., Božić-Štulić, D., Gotovac, S., and Marušić, T. Region proposal approach for human detection on aerial imagery. In 2018 3rd International Conference on Smart and Sustainable Technologies (SpliTech) (2018), IEEE, pp. 1–6.

[4] Vasić, M. K., and Papić, V. Multimodel deep learning for person detection in aerial images. Electronics 9, 9 (2020), 1459.