Detection of roadways within satellite data
Data provided: 45 training examples, 10 test images
Derive a mask depicting roadways from each 3200 x 4800 satellite image tile.
This is an image segmentation problem requiring pixel-wise binary classification of the input image, resulting in a two-class output image. An extension of a fully convolutional network (FCN) was selected to solve this problem because it is suitable for use with a small number of training examples. A total of 8 false positives were removed from the dataset during preprocessing.
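As a minimal sketch of what a two-class output image means in practice, the snippet below collapses a per-pixel score array into a binary road mask. The function name `scores_to_mask`, the (height, width, 2) layout, and the convention that channel 1 is the road class are assumptions made for illustration; they are not taken from the project code.

```python
import numpy as np

def scores_to_mask(class_scores):
    """Collapse an (H, W, 2) array of per-pixel class scores into an (H, W) mask.

    Assumption: channel 0 is background and channel 1 is road.
    """
    class_scores = np.asarray(class_scores)
    if class_scores.ndim != 3 or class_scores.shape[-1] != 2:
        raise ValueError("expected an array of shape (H, W, 2)")
    # Each pixel takes the index of its highest-scoring class: 0 or 1.
    return np.argmax(class_scores, axis=-1).astype(np.uint8)

# Tiny 1 x 2 example: the first pixel favours background, the second favours road.
example = np.array([[[0.9, 0.1], [0.2, 0.8]]])
print(scores_to_mask(example))  # [[0 1]]
```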
Solution Architecture:
U-net provides good segmentation capability using a symmetrical network of downsampling and upsampling layers, which is more efficient than an FCN. A pre-existing open-source implementation (tf_unet) was utilised for this purpose.
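The symmetry of the downsampling and upsampling paths can be illustrated by tracing feature-map sizes through the network. This is a rough sketch only: it assumes 'same'-padded convolutions, 2 x 2 pooling, and a doubling of feature channels at each level. The function name and default values are hypothetical and do not describe the tf_unet configuration actually used.

```python
def unet_shape_trace(size, depth=3, base_features=16):
    """Trace (stage, spatial size, feature channels) through a U-net-style network.

    Assumptions: 'same'-padded convolutions and 2x2 pooling, so each level
    halves the spatial size and doubles the number of feature channels.
    """
    if size % (2 ** (depth - 1)) != 0:
        raise ValueError("size must be divisible by 2 ** (depth - 1)")
    trace = []
    # Contracting path: resolution falls, feature count rises.
    for level in range(depth):
        trace.append(("down", size // 2 ** level, base_features * 2 ** level))
    # Expanding path: mirror the contracting path back to full resolution.
    for level in reversed(range(depth - 1)):
        trace.append(("up", size // 2 ** level, base_features * 2 ** level))
    return trace

for stage in unet_shape_trace(512):
    print(stage)
```

The first and last entries share the same spatial size and channel count, which is the symmetry the architecture relies on when it combines encoder features with decoder features at matching resolutions.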
Optimisation:
Training ran for 50 epochs of 64 iterations each. The network was trained on an RTX 2070 GPU in approximately 90 minutes, resulting in an average accuracy of 98%.
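For reference, a pixel-wise accuracy figure of this kind can be computed as sketched below. Exactly how the 98% was measured is not stated here, so this is an assumption about the metric rather than the project's evaluation code. The sketch also includes intersection-over-union (IoU) for the road class, since pixel accuracy alone can look high when roads occupy only a small fraction of the image. Both function names are hypothetical.

```python
import numpy as np

def pixel_accuracy(predicted, truth):
    """Fraction of pixels where the predicted mask matches the ground truth."""
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    return float(np.mean(predicted == truth))

def road_iou(predicted, truth):
    """Intersection-over-union for the road (positive) class."""
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(predicted, truth).sum()
    if union == 0:
        return 1.0  # neither mask contains any road pixels
    return float(np.logical_and(predicted, truth).sum() / union)

# Toy case: 2 road pixels in a 10 x 10 tile, and a prediction that finds none.
truth = np.zeros((10, 10), dtype=np.uint8)
truth[0, :2] = 1
empty_prediction = np.zeros_like(truth)
print(pixel_accuracy(empty_prediction, truth))  # 0.98
print(road_iou(empty_prediction, truth))        # 0.0
```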
Model training Accuracy:
Mask creation from the training set:
Creating a mask from the test set using the pre-trained model (no ground truth mask):
Improvements:
It would be desirable to avoid downsampling the training images in order to capture more details during learning, as well as to output a mask at the same resolution as the training images.
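One common way to avoid downsampling is to process each image as a grid of full-resolution tiles and stitch the predicted tiles back together. The sketch below only computes the tile windows. The tile size, the overlap, the function name, and the reading of 3200 x 4800 as rows x columns are all illustrative assumptions rather than part of the current solution.

```python
def tile_windows(height, width, tile, overlap=0):
    """Return (top, left, bottom, right) windows that cover a height x width image.

    Windows are tile x tile pixels; the last window in each direction is placed
    flush with the image edge so that no pixels are missed.
    """
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    stride = tile - overlap

    def starts(length):
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, stride))
        positions.append(length - tile)  # final window ends at the image edge
        return positions

    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in starts(height)
        for left in starts(width)
    ]

# A 3200 x 4800 image split into non-overlapping 800 x 800 tiles.
print(len(tile_windows(3200, 4800, tile=800)))  # 24
```

A small overlap between neighbouring tiles (for example `overlap=64`) is often used so that predictions near tile borders can be cropped or blended when the full-size mask is reassembled.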
In order to increase the power of the network, it would be beneficial to train on multiple GPUs. Horovod, an open-source distributed training framework developed by Uber that supports TensorFlow (https://github.com/uber/horovod), would be worth investigating to this end.
An implementation of U-net with Horovod integration built in would be a logical choice for testing alongside the proposed solution: https://github.com/ankurhanda/tf-unet