midasklr / DDRNet.TensorRT

TensorRT of DDRNet for real-time segmentation

DDRNet

TensorRT implementation of the official DDRNet

DDRNet-23-slim outperforms other lightweight segmentation methods, see

Compile & Run

For INT8 support:

#define USE_INT8  // comment this out to disable INT8
//#define USE_FP16  // uncomment to use FP16; with both macros disabled, FP32 is used
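To make the effect of the two macros concrete, here is a minimal sketch of compile-time precision selection. The `Precision` enum, the function names, and the order of the checks (INT8 before FP16) are assumptions for illustration and are not taken from this repository; in a TensorRT build each branch would typically set the matching builder flag instead of returning an enum value.

```cpp
#include <cassert>
#include <string>

#define USE_INT8    // comment this out to disable INT8
//#define USE_FP16  // uncomment (with USE_INT8 disabled) to select FP16

// Hypothetical enum used only in this sketch.
enum class Precision { FP32, FP16, INT8 };

// Resolve the precision at compile time from the macros above.
// Assumption: INT8 takes priority when both macros are defined.
constexpr Precision selected_precision() {
#if defined(USE_INT8)
    return Precision::INT8;
#elif defined(USE_FP16)
    return Precision::FP16;
#else
    return Precision::FP32;  // neither macro defined
#endif
}

// Human-readable name, e.g. for a log line when building the engine.
std::string precision_name(Precision p) {
    switch (p) {
        case Precision::INT8: return "INT8";
        case Precision::FP16: return "FP16";
        case Precision::FP32: return "FP32";
    }
    assert(false && "unreachable");
    return "";
}
```

With the macros as written, `precision_name(selected_precision())` yields "INT8"; commenting out both `#define` lines falls back to FP32.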

Create a folder named "calib" and put around 1k images (Cityscapes val/test images) into it.

FPS

Tested on an RTX 2070

| model | input | FPS |
| --- | --- | --- |
| Pytorch-aug | (3,1024,1024) | 107 |
| Pytorch-no-aug | (3,1024,1024) | 108 |
| TensorRT-FP32 | (3,1024,1024) | 117 |
| TensorRT-FP16 | (3,1024,1024) | 215 |
| TensorRT-INT8 | (3,1024,1024) | 232 |
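For reading the numbers above, it can help to convert throughput into per-frame latency and into a speedup over a baseline. The helpers below are a small illustrative sketch; their names are hypothetical and they only restate the arithmetic, they are not how the FPS figures were measured.

```cpp
#include <cassert>

// Per-frame latency in milliseconds for a given throughput in frames/second.
double latency_ms(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Speedup of one configuration relative to a baseline, both given in FPS.
double speedup(double fps, double baseline_fps) {
    assert(baseline_fps > 0.0);
    return fps / baseline_fps;
}
```

For example, `latency_ms(232.0)` is about 4.3 ms per frame for TensorRT-INT8, and `speedup(232.0, 108.0)` is roughly 2.1x over Pytorch-no-aug.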

Pytorch-aug means augment=True.

Difference with official

We use Upsample with "nearest" rather than "bilinear" interpolation, which may lead to lower accuracy.

Fine-tuning with "nearest" upsampling may recover the accuracy.

Here we convert from the official model directly.
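To illustrate why swapping the interpolation mode can change the output, here is a small CPU sketch of both modes on a single-channel image. The `Image` type and function names are hypothetical, the scale is restricted to integers, and the bilinear version assumes the half-pixel-centre convention (as in `align_corners=False`); it is not the TensorRT layer used by this repository.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major single-channel image (hypothetical helper type for this sketch).
struct Image {
    int h, w;
    std::vector<float> data;
    float at(int y, int x) const { return data[static_cast<std::size_t>(y) * w + x]; }
};

// Nearest: every output pixel copies the input pixel at (y / scale, x / scale).
Image upsample_nearest(const Image& in, int scale) {
    assert(scale >= 1);
    Image out{in.h * scale, in.w * scale, {}};
    out.data.resize(static_cast<std::size_t>(out.h) * out.w);
    for (int y = 0; y < out.h; ++y)
        for (int x = 0; x < out.w; ++x)
            out.data[static_cast<std::size_t>(y) * out.w + x] = in.at(y / scale, x / scale);
    return out;
}

// Bilinear: every output pixel blends the four surrounding input pixels.
Image upsample_bilinear(const Image& in, int scale) {
    assert(scale >= 1);
    Image out{in.h * scale, in.w * scale, {}};
    out.data.resize(static_cast<std::size_t>(out.h) * out.w);
    for (int y = 0; y < out.h; ++y) {
        // Map the output pixel centre back to input coordinates, clamped at 0.
        float sy = std::max(0.0f, (y + 0.5f) / scale - 0.5f);
        int y0 = static_cast<int>(sy);  // floor, since sy >= 0
        int y1 = std::min(y0 + 1, in.h - 1);
        float fy = sy - y0;
        for (int x = 0; x < out.w; ++x) {
            float sx = std::max(0.0f, (x + 0.5f) / scale - 0.5f);
            int x0 = static_cast<int>(sx);
            int x1 = std::min(x0 + 1, in.w - 1);
            float fx = sx - x0;
            float top = in.at(y0, x0) * (1 - fx) + in.at(y0, x1) * fx;
            float bot = in.at(y1, x0) * (1 - fx) + in.at(y1, x1) * fx;
            out.data[static_cast<std::size_t>(y) * out.w + x] = top * (1 - fy) + bot * fy;
        }
    }
    return out;
}
```

On a 1x2 image with values 0 and 1 and a scale of 2, nearest produces a hard step (0, 0, 1, 1) per row while bilinear produces a ramp (0, 0.25, 0.75, 1). A network trained with bilinear upsampling sees different feature maps when nearest is substituted at inference time, which is consistent with the accuracy drop mentioned above.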

Train

  1. Refer to: https://github.com/chenjun2hao/DDRNet.pytorch
  2. Generate the wts model with getwts.py
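On the C++ side, the generated weights file has to be read back before the network can be built. The sketch below assumes the text layout commonly used by tensorrtx-style projects: a first line with the number of tensors, then one line per tensor containing its name, its element count, and each float32 value as 8 hex digits. Whether getwts.py writes exactly this layout is an assumption, and `parse_wts` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Parse a .wts-style text stream into a map from tensor name to values.
// Assumed layout: "<count>\n" then "<name> <size> <hex> <hex> ...\n" per tensor.
std::map<std::string, std::vector<float>> parse_wts(std::istream& in) {
    std::int32_t count = 0;
    in >> count;
    if (!in || count < 0) throw std::runtime_error("invalid .wts header");

    std::map<std::string, std::vector<float>> weights;
    for (std::int32_t i = 0; i < count; ++i) {
        std::string name;
        std::uint32_t size = 0;
        in >> name >> std::dec >> size;  // element count is decimal
        if (!in) throw std::runtime_error("invalid .wts tensor header");

        std::vector<float> values(size);
        for (auto& v : values) {
            std::uint32_t bits = 0;
            in >> std::hex >> bits;  // each value is the raw float bit pattern
            if (!in) throw std::runtime_error("truncated .wts tensor data");
            std::memcpy(&v, &bits, sizeof(float));
        }
        in >> std::dec;  // restore decimal parsing for the next tensor
        weights.emplace(std::move(name), std::move(values));
    }
    assert(weights.size() <= static_cast<std::size_t>(count));
    return weights;
}
```

For instance, feeding a `std::istringstream` containing `1\nconv.weight 2 3f800000 40000000\n` yields one tensor named `conv.weight` with values 1.0 and 2.0. To read a file on disk, pass a `std::ifstream` opened on the .wts path.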

Train custom data

See https://github.com/midasklr/DDRNet.Pytorch: write your own dataset and fine-tune the model with Cityscapes.