This work is still in progress. I worked on it as part of my course project in Spring 2018. I am currently tweaking the implemented architectures and experimenting with new ones, such as Dilated Convolutions and Dilated DenseNets, which might result in better performance.
Semantic segmentation involves understanding an image on a pixel-by-pixel level, i.e. assigning a class label to every pixel in the image. We experiment with different architectures to perform semantic segmentation of images on the PASCAL VOC 2012 [3] dataset. We implement the Fully Convolutional Networks (FCN) of Long et al. [7] as our baseline method for semantic segmentation. We run various experiments with the number and position of skip connections and with additional layers that aggregate more context information. We then implement the Improved Fully Convolutional Network (IFCN) architecture suggested by Shuai et al. [8], which introduces a context network that progressively expands the receptive fields of feature maps. In addition, dense skip connections are added so that the context network can be optimized effectively and can fuse rich-scale context to make reliable predictions; this design has been shown to improve segmentation significantly on the PASCAL VOC 2012 [3] dataset. We also modify the U-Net architecture for multi-class semantic segmentation, using pre-trained weights from the VGG-16 architecture trained on the ImageNet dataset.
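To make the idea of a skip connection concrete, here is a minimal pure-Python sketch of an FCN-16s-style fusion step followed by per-pixel labelling. It is not the code in this repository: the function names (`upsample2x`, `fuse`, `predict_labels`) and the toy score maps are hypothetical, and nearest-neighbour upsampling stands in for the learned or bilinear upsampling used in FCN.

```python
def upsample2x(score_map):
    """Nearest-neighbour 2x upsampling of an H x W nested list to 2H x 2W.

    Assumption: FCN itself uses bilinear / transposed-convolution upsampling;
    nearest-neighbour is used here only to keep the sketch dependency-free.
    """
    out = []
    for row in score_map:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fuse(coarse, fine):
    """Skip connection: upsample coarse per-class scores and add them
    element-wise to the finer per-class scores. Both inputs are [C][H][W]."""
    fused = []
    for c_map, f_map in zip(coarse, fine):
        up = upsample2x(c_map)
        fused.append([[u + f for u, f in zip(u_row, f_row)]
                      for u_row, f_row in zip(up, f_map)])
    return fused


def predict_labels(scores):
    """Assign each pixel the class with the highest score ([C][H][W] -> [H][W])."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [[max(range(num_classes), key=lambda c: scores[c][y][x])
             for x in range(width)]
            for y in range(height)]


# Toy example: 2 classes, a 1x1 coarse map and a 2x2 fine map (made-up values).
coarse = [[[1.0]], [[0.0]]]
fine = [[[0.0, 0.0], [0.0, 0.0]],
        [[0.5, 2.0], [0.0, 3.0]]]
labels = predict_labels(fuse(coarse, fine))
```

In this toy case the coarse map votes for class 0 everywhere, while the fine map overrides it in the right-hand column, which is the kind of local detail a skip connection is meant to recover.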
This repository is organized as follows:
This folder contains the following implemented models: FCN-32s, FCN-16s, FCN-8s, IFCN (Improved FCN), and U-Net (pre-trained on VGG-16). Work is in progress on ResNet and Dilated DenseNets.
models.py --> architecture of all models
train.py --> training script (softmax loss used)
evaluate.py --> evaluation using mean IoU
inference.py --> called by evaluate.py
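The two quantities named above, the softmax loss and mean IoU, can be sketched in plain Python as follows. This is an illustrative sketch, not the code in train.py or evaluate.py: the function names, the nested-list layout, and the `ignore_label=255` default (the value PASCAL VOC uses for void pixels) are assumptions made for the example.

```python
import math


def pixel_softmax_loss(scores, target, ignore_label=255):
    """Mean softmax cross-entropy over all non-ignored pixels.

    scores: [C][H][W] raw class scores; target: [H][W] integer labels.
    """
    num_classes = len(scores)
    total, count = 0.0, 0
    for y, t_row in enumerate(target):
        for x, t in enumerate(t_row):
            if t == ignore_label:
                continue
            logits = [scores[c][y][x] for c in range(num_classes)]
            m = max(logits)  # subtract the max for numerical stability
            log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
            total += log_sum - logits[t]  # -log softmax(logits)[t]
            count += 1
    return total / count if count else 0.0


def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection-over-union across classes present in pred or target."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if t == ignore_label:
                continue
            if p == t:
                inter[p] += 1
                union[p] += 1
            else:
                # a mismatch adds one pixel to the union of both classes
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x2 example with two classes (made-up values).
pred = [[0, 1], [1, 1]]
target = [[0, 1], [0, 1]]
score = mean_iou(pred, target, num_classes=2)  # IoUs are 1/2 and 2/3
```

Classes that appear in neither the prediction nor the ground truth are skipped in the average here; an evaluation script may instead accumulate counts over the whole dataset before dividing, which gives a different number.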
This is a comprehensive report detailing all the analysis and discussing my work. Have a look at it if you would like to know more about the project.
Contains all the utility folders
These contain all the training and testing files.
The models rank as follows on the PASCAL VOC dataset, from best to worst: U-Net > IFCN > FCN-8s > FCN-16s > FCN-32s
A sample of result images is shown in the project report. The differences between the results of the different models are clearly visible.