ckyrkou / EmergencyNet

A small deep learning model based on atrous convolutional feature fusion for the application of emergency response.

EmergencyNet

Overview

There is a need to design specialized networks that are inherently computationally efficient to enable their use in resource-constrained devices such as UAVs. The design space can be explored by focusing on the layer configurations, type and connectivity. An architecture is proposed that allows for flexible aggregation of multi-scale contextual information while keeping the same resolution and a reduced number of parameters. It is based on the Atrous Convolutional Feature Fusion (ACFF) block, which computes multiple atrous convolutional features for the same input map across different dilation rates. Each atrous convolution is factored into a depth-wise convolution that performs lightweight filtering by applying a single convolutional kernel per input channel, which reduces the computational complexity.

An essential part of optimized CNNs is reducing not only the spatial size of feature maps but also the channel dimensions. Hence, prior to the atrous convolutions, the input feature map channels are halved. This makes it possible to have multiple branches for atrous convolution without significantly impacting the performance. The depth reduction factor is a hyperparameter that can be further tuned depending on the requirements.

The atrous convolutional features at different dilation rates need to be combined to allow the unit to learn from representations with a large effective receptive field. The fusion mechanism is then followed by 1x1 convolutions and an activation that non-linearly combine channel features and project them into a higher dimensional space.
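To make the cost argument above concrete, the following is a rough arithmetic sketch, not the repository's code. It uses the standard formula for the effective kernel size of a dilated convolution and compares weight counts for depth-wise and standard convolutions. The kernel size, channel count and dilation rates are illustrative assumptions, and biases are ignored.

```python
def effective_kernel_size(kernel_size: int, dilation_rate: int) -> int:
    """Side length of the input region spanned by one dilated (atrous) kernel."""
    return kernel_size + (kernel_size - 1) * (dilation_rate - 1)


def standard_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard convolution: one k x k kernel per (input, output) channel pair."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_conv_params(kernel_size: int, channels: int) -> int:
    """Weights in a depth-wise convolution: a single k x k kernel per input channel."""
    return kernel_size * kernel_size * channels


# Illustrative values only (assumptions, not taken from the repository).
KERNEL_SIZE = 3
IN_CHANNELS = 64
DILATION_RATES = (1, 2, 3)

# The input channels are halved before the atrous branches.
reduced_channels = IN_CHANNELS // 2

for rate in DILATION_RATES:
    size = effective_kernel_size(KERNEL_SIZE, rate)
    print(f"dilation {rate}: effective kernel {size}x{size}")

# Weights of the atrous branches alone, with and without the two reductions.
depthwise_total = len(DILATION_RATES) * depthwise_conv_params(KERNEL_SIZE, reduced_channels)
standard_total = len(DILATION_RATES) * standard_conv_params(KERNEL_SIZE, IN_CHANNELS, IN_CHANNELS)
print(f"depth-wise branches on halved channels: {depthwise_total} weights")
print(f"standard branches on full channels:     {standard_total} weights")
```

The comparison covers only the atrous branches. The 1x1 convolutions used for channel reduction and after fusion add their own weights, which this sketch does not count.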

The ACFF macro block is used as a starting point to build a deep neural network that is characterized by low computational complexity and is suitable for embedded platforms.

Code

Minimal code for training an ACFF network is provided within the code folder, along with additional supporting code.

Download your dataset and place it in the data folder. Put all images for each class in one folder. Training and validation splits are handled through a Keras generator. If you have already split the data into training and validation sets, you need to change the code and write custom generators.

./
└───data
    └───AIDER
        ├───collapsed_building
        │       collapsed_building (1).jpg
        │       collapsed_building (2).jpg
        │       ...
        ├───fire
        │       fire (1).jpg
        │       fire (2).jpg
        │       ...
        ├───flooded_areas
        │       flooded_areas (1).jpg
        │       flooded_areas (2).jpg
        │       ...
        ├───normal
        │       normal (1).jpg
        │       normal (2).jpg
        │       ...
        └───traffic_incident
                traffic_incident (1).jpg
                traffic_incident (2).jpg
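Before training, it can help to confirm that the dataset follows the one-folder-per-class layout shown above. The helper below is a hypothetical sketch and is not part of the repository. The function name and the set of accepted image extensions are assumptions.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(dataset_dir):
    """Return {class_folder_name: image_count} for each sub-folder of dataset_dir."""
    root = Path(dataset_dir)
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


if __name__ == "__main__":
    dataset = Path("data") / "AIDER"
    if dataset.is_dir():
        for name, count in count_images_per_class(dataset).items():
            print(f"{name}: {count} images")
    else:
        print(f"Dataset folder not found: {dataset}")
```

A class folder that reports zero images usually means the files sit one level too deep or use an extension not in the assumed set.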

Outputs are the full model file, model weights, accuracy and loss curves. These are stored within the results folder.

Models

Three models are provided:

-- Load the models using the load_model function of keras and read the images with opencv's imread function (BGR format).
-- Normalize the images using the formula: x = (x / 127.5) - 1

Use the following information to decode the model outputs:

Class                 Network Output Value
collapsed_building    0
Fire                  1
Flood                 2
Normal/None           3
Traffic Incident      4
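The normalization formula and the class mapping above can be combined into a short sketch such as the following. It is an illustration under stated assumptions, not the repository's inference script: the helper names `normalize` and `decode` are hypothetical, and `decode` assumes the model returns one score per class for a single image.

```python
import numpy as np

# Index-to-class mapping taken from the table above.
CLASS_NAMES = [
    "collapsed_building",
    "Fire",
    "Flood",
    "Normal/None",
    "Traffic Incident",
]


def normalize(image):
    """Map pixel values from [0, 255] to [-1, 1] using x = (x / 127.5) - 1."""
    return np.asarray(image, dtype=np.float32) / 127.5 - 1.0


def decode(model_output):
    """Return (index, class name) of the highest score for a single image."""
    scores = np.asarray(model_output).reshape(-1)
    if scores.size != len(CLASS_NAMES):
        raise ValueError(f"expected {len(CLASS_NAMES)} scores, got {scores.size}")
    index = int(np.argmax(scores))
    return index, CLASS_NAMES[index]


# Assumed usage with Keras and OpenCV; the paths are placeholders and the
# image must first be resized to the input size the chosen model expects:
#   from tensorflow.keras.models import load_model
#   import cv2
#   model = load_model("path/to/model.h5")
#   image = cv2.imread("path/to/image.jpg")  # BGR, as the models expect
#   index, label = decode(model.predict(normalize(image)[None, ...]))
```

The image is kept in BGR order because the README states that the models expect images read with OpenCV's imread.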

Prerequisites

These models were trained using Keras v2.2 and TF v1.8.0. However, the training pipeline has since been updated to TF v2. See requirements.txt for more details.

Example Outputs

The network can correctly classify some indicative examples from real-world events.

The network can also be ported to Android and integrated with UAV applications to process the video feed locally.

Citation Information

Please cite the following paper if you find this useful for your work:

For more please visit: https://www.christoskyrkou.com