This repository contains our Torch7 implementation of the network developed at e-Lab. For further details, see our blog post or the paper LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation.
Currently the network can be trained on two datasets:
| Dataset | Input Resolution | # of classes |
|---|---|---|
| CamVid (cv) | 768x576 | 11 |
| Cityscapes (cs) | 1024x512 | 19 |
To download the datasets, follow the links provided above.
Both datasets are first resized by the training script; if you wish, you can cache the resized data using the --cachepath option.
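As a minimal sketch of how caching fits in (the paths below are illustrative, not part of the original instructions; only --datapath, --cachepath, and --dataset are documented here):

```shell
# First run: the training script resizes the images and writes the
# resized data under --cachepath. Later runs pointing at the same
# --cachepath reuse the cached data instead of resizing again.
th main.lua --datapath /Datasets/Cityscapes/ --cachepath /dataCache/cityscapes/ --dataset cs
```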
For the CamVid dataset, the available video data is first split into train/validation/test sets by prepCamVid.lua; dataDistributionCV.txt contains the details of this split. These steps are run automatically before training starts.
LinkNet's performance on both of the above datasets:
| Dataset | Best IoU | Best iIoU |
|---|---|---|
| Cityscapes | 76.44 | 60.78 |
| CamVid | 69.10 | 55.83 |
Pretrained models and confusion matrices for both datasets can be found in the latest release.
There are three model files present in the models folder. One of them uses bilinear interpolation in the residual connection, because of which we were not able to run that trained model on the TX1.

A sample command to train the network is given below:
```
th main.lua --datapath /Datasets/Cityscapes/ --cachepath /dataCache/cityscapes/ --dataset cs --model models/model.lua --save /Models/cityscapes/ --saveTrainConf --saveAll --plot
```
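By analogy, training on CamVid should only require switching the dataset code (cv, from the table above) and the paths; this exact command is not in the original README, so treat it as a sketch and adapt the paths to your setup:

```shell
# Hedged example: same flags as the Cityscapes command above, with the
# CamVid dataset code (cv). /Datasets/CamVid/, /dataCache/camvid/, and
# /Models/camvid/ are illustrative placeholder paths.
th main.lua --datapath /Datasets/CamVid/ --cachepath /dataCache/camvid/ --dataset cv --model models/model.lua --save /Models/camvid/ --saveTrainConf --saveAll --plot
```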
This software is released under a Creative Commons license that allows personal and research use only. For a commercial license, please contact the authors. A license summary is available here: http://creativecommons.org/licenses/by-nc/4.0/