swoook / ddrnet

Cloned from chenjun2hao/DDRNet (https://github.com/chenjun2hao/DDRNet.pytorch).

❗ This is cloned repository!

This repository is cloned from chenjun2hao/DDRNet.pytorch and modified for research purposes.

Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes

Introduction

This is an unofficial implementation of Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes, which achieves a state-of-the-art trade-off between accuracy and speed on Cityscapes and CamVid without inference acceleration or extra data. On a single RTX 2080Ti GPU, DDRNet-23-slim yields 77.4% mIoU at 109 FPS on the Cityscapes test set and 74.4% mIoU at 230 FPS on the CamVid test set.

The code mainly borrows from HRNet-Semantic-Segmentation OCR and the official repository; thanks to the authors for their work.

Figure: A comparison of the speed-accuracy trade-off on the Cityscapes test set.

Requirements

torch>=1.7.0
cudatoolkit>=10.2
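
A minimal environment sketch, assuming a conda-based setup; the environment name and exact builds below are illustrative, and any torch>=1.7.0 matched with cudatoolkit>=10.2 satisfies the requirements:

# Create and activate an isolated environment (the name is arbitrary).
conda create -n ddrnet python=3.8 -y
conda activate ddrnet

# Install PyTorch 1.7 built against CUDA 10.2.
conda install pytorch=1.7.0 torchvision cudatoolkit=10.2 -c pytorch -y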

Cityscapes Data Preparation

  1. Download the two files below from Cityscapes to ${CITYSCAPES_ROOT}

    • leftImg8bit_trainvaltest.zip
    • gtFine_trainvaltest.zip
  2. Unzip them (see the shell sketch after this list)

  3. Rename the folders as below

    └── cityscapes
        ├── leftImg8bit
        │   ├── test
        │   ├── train
        │   └── val
        └── gtFine
            ├── test
            ├── train
            └── val
  4. Update the properties in ${REPO_ROOT}/experiments/cityscapes/${MODEL_YAML} as below

    DATASET:
     DATASET: cityscapes
     ROOT: ${CITYSCAPES_ROOT}
     TEST_SET: 'cityscapes/list/test.lst'
     TRAIN_SET: 'cityscapes/list/train.lst'
     ...
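
A minimal shell sketch for steps 2-3, assuming the two archives were downloaded into ${CITYSCAPES_ROOT}; each archive already ships with a top-level leftImg8bit/ or gtFine/ folder, so unzipping into a cityscapes/ directory yields the layout from step 3:

cd ${CITYSCAPES_ROOT}
mkdir -p cityscapes

# Each archive contains its own top-level folder (leftImg8bit/ or gtFine/).
unzip -q leftImg8bit_trainvaltest.zip -d cityscapes
unzip -q gtFine_trainvaltest.zip -d cityscapes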

Pretrained Models

  1. Download the pretrained model to ${MODEL_DIR}

  2. Update MODEL.PRETRAINED and TEST.MODEL_FILE in ${REPO_ROOT}/experiments/cityscapes/${MODEL_YAML} as below

    ...
    MODEL:
     ...
     PRETRAINED: "${MODEL_DIR}/${MODEL_NAME}.pth"
     ALIGN_CORNERS: false
     ...
    TEST:
     ...
     MODEL_FILE: "${MODEL_DIR}/${MODEL_NAME}.pth"
     ...
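
As a quick sanity check that the download is intact, the checkpoint can be inspected with plain PyTorch; this is only a minimal sketch, and the path below is a placeholder for ${MODEL_DIR}/${MODEL_NAME}.pth:

import torch

# Placeholder path; substitute your actual ${MODEL_DIR}/${MODEL_NAME}.pth.
ckpt_path = "model_dir/ddrnet23_slim.pth"
ckpt = torch.load(ckpt_path, map_location="cpu")

# Some checkpoints wrap the weights in a "state_dict" key; fall back to the raw dict.
state_dict = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
print(f"{len(state_dict)} tensors")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))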

Validation

cd ${REPO_ROOT}
python tools/eval.py --cfg experiments/cityscapes/ddrnet23_slim.yaml
| model | OHEM | Multi-scale | Flip | mIoU | FPS | E2E Latency (s) | Link |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DDRNet23_slim | Yes | No | No | 77.83 | 91.31 | 0.062 | official |
| DDRNet23_slim | Yes | No | Yes | 78.42 | TBD | TBD | official |
| DDRNet23 | Yes | No | No | 79.51 | TBD | TBD | official |
| DDRNet23 | Yes | No | Yes | 79.98 | TBD | TBD | official |

mIoU denotes mIoU on the Cityscapes validation set.

FPS is measured by following the test code provided by SwiftNet. (Refer to speed_test from lib/utils/utils.py for more details.)

E2E Latency denotes an end-to-end latency including pre/post-processing.

FPS and latency are measured with batch size 1 on an RTX 2080Ti GPU and a Threadripper 2950X CPU.
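
For reference, the timing methodology can be reproduced with a generic GPU-synchronized loop like the sketch below; this is not the repository's speed_test, just a minimal equivalent under the same batch-size-1 setting, with the input resolution as an assumption:

import time
import torch

@torch.no_grad()
def measure_fps(model, input_size=(1, 3, 1024, 2048), warmup=10, iters=100):
    # Generic batch-size-1 FPS measurement with explicit GPU synchronization;
    # not the repository's speed_test, only an illustration of the same idea.
    device = torch.device("cuda")
    model = model.to(device).eval()
    x = torch.randn(*input_size, device=device)
    for _ in range(warmup):  # warm-up runs are excluded from timing
        model(x)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()
    return iters / (time.time() - start)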

Train

Download the ImageNet-pretrained model, and then train the model with 2 NVIDIA RTX 3080 GPUs:

cd ${PROJECT}
python -m torch.distributed.launch --nproc_per_node=2 tools/train.py --cfg experiments/cityscapes/ddrnet23_slim.yaml

Our own trained models are coming soon.

Own Models

| model | Train Set | Test Set | OHEM | Multi-scale | Flip | mIoU | Link |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DDRNet23_slim | train | eval | Yes | No | Yes | 77.77 | Baidu (password: it2s) |
| DDRNet23_slim | train | eval | Yes | Yes | Yes | 79.57 | Baidu (password: it2s) |
| DDRNet23 | train | eval | Yes | No | Yes | ~ | None |
| DDRNet39 | train | eval | Yes | No | Yes | ~ | None |


Reference

[1] HRNet-Semantic-Segmentation OCR branch

[2] The official DDRNet repository