Deeperlab

This project aims at providing a fast, modular reference implementation for semantic segmentation models using PyTorch.

demo image

Highlights

Distributed Training: >60% Thank you ycszen, from his struct faster than the multi-thread parallel method(nn.DataParallel), we use the multi-processing parallel method.
Multi-GPU training and inference: support different manners of inference.
Provides pre-trained models and implement different semantic segmentation models.

Prerequisites

PyTorch 1.0
- pip3 install torch torchvision
Easydict
- pip3 install easydict
Apex
Ninja
- sudo apt-get install ninja-build
tqdm
- pip3 install tqdm

Pretrain Model

xception-71(pretrain)

Model Zoo

Supported Model

deeperlab(CVPR2019)

Performance and Benchmarks

SS:Single Scale MSF:Multi-scale + Flip

PASCAL VOC 2012(SBD and Never SBD)

because I only realize the segmentation part,I tested its results on voc Method	Backbone	TrainSet	EvalSet	Mean IoU(ss)	Mean IoU(msf)
deeperlab(ours+SBD)	R101_v1c	train_aug	val	79.71	80.26
deeperlab(ours)	R101_v1c	train_aug	val	73.28	74.11

To Do

[ ] Detection part
Link

we must build the env for training
```
make link
make others
```
soft link to data,pretrain,log,logger

Training

create the config file of dataset:train.txt, val.txt, test.txt
file structure：(split with tab)
```
path-of-the-image   path-of-the-groundtruth
```
modify the config.py according to your requirements
train a network:

Distributed Training

We use the official torch.distributed.launch in order to launch multi-gpu training. This utility function from PyTorch spawns as many Python processes as the number of GPUs we want to use, and each Python process will only use a single GPU.

For each experiment, you can just run this script:

export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py

Non-distributed Training

The above performance are all conducted based on the non-distributed training. For each experiment, you can just run this script:

bash train.sh

In train.sh, the argument of d means the GPU you want to use.

Inference

In the evaluator, we have implemented the multi-gpu inference base on the multi-process. In the inference phase, the function will spawns as many Python processes as the number of GPUs we want to use, and each Python process will handle a subset of the whole evaluation dataset on a single GPU.

evaluate a trained network on the validation set:
```
bash eval.sh
```

input arguments in shell:

usage: -e epoch_idx -d device_idx -c save_csv [--verbose ] 
[--show_image] [--save_path Pred_Save_Path]

Segmentation-torch

if you are interested my algorithm, you can see my realized segmentation tool(dfn,deeperlab,deeplabv3 plus and so on):

segmentation-torch

Be Care for

because my device is 1080, we can't use 7*7 conv in two 4096 channel due to out of memory. so if you use it. you can change it in model/deeperlab.py

lingtengqiu / Deeperlab-pytorch

readme