guanfuchen / semseg

A survey of common semantic segmentation architectures, with code reproductions.

BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation #47

Open guanfuchen opened 5 years ago

guanfuchen commented 5 years ago

related paper

Abstract
Semantic segmentation requires both rich spatial information and sizeable receptive field. However, modern approaches usually compromise spatial resolution to achieve real-time inference speed, which leads to poor performance. In this paper, we address this dilemma with a novel Bilateral Segmentation Network (BiSeNet). We first design a Spatial Path with a small stride to preserve the spatial information and generate high-resolution features. Meanwhile, a Context Path with a fast downsampling strategy is employed to obtain sufficient receptive field. On top of the two paths, we introduce a new Feature Fusion Module to combine features efficiently. The proposed architecture makes a right balance between the speed and segmentation performance on Cityscapes, CamVid, and COCO-Stuff datasets. Specifically, for a 2048×1024 input, we achieve 68.4% Mean IOU on the Cityscapes test dataset with speed of 105 FPS on one NVIDIA Titan XP card, which is significantly faster than the existing methods with comparable performance.

image


Overview

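The motivation for this paper is that semantic segmentation needs both spatial information and a sizeable receptive field, yet modern methods typically sacrifice spatial resolution to reach real-time inference speed, which degrades accuracy. The paper proposes the Bilateral Segmentation Network (BiSeNet): a Spatial Path with a small stride preserves spatial information and produces high-resolution features, while a Context Path with fast downsampling obtains a large receptive field efficiently. On top of these two paths, a new Feature Fusion Module combines their features efficiently.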
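
As a rough illustration of the resolution trade-off described above, the sketch below computes the feature-map size each path would produce for the 2048×1024 input mentioned in the abstract. The layer counts (three stride-2 convolutions for the Spatial Path, five stride-2 stages for the Context Path) and the 3×3 kernel with padding 1 are assumptions made for illustration; they are not stated in this thread.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of one convolution along a single spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def downsample(hw, num_stride2_layers):
    """Apply `num_stride2_layers` stride-2 convolutions to an (H, W) size."""
    h, w = hw
    for _ in range(num_stride2_layers):
        h, w = conv_out(h), conv_out(w)
    return h, w


input_hw = (1024, 2048)  # (height, width) of the input from the abstract

# Assumption: Spatial Path = 3 stride-2 layers, i.e. 1/8 resolution.
spatial_hw = downsample(input_hw, 3)

# Assumption: Context Path = 5 stride-2 stages, i.e. 1/32 resolution.
context_hw = downsample(input_hw, 5)

print("spatial path:", spatial_hw)  # (128, 256)
print("context path:", context_hw)  # (32, 64)
```

Under these assumptions the Spatial Path keeps 16 times as many spatial positions as the deepest Context Path stage, which is why the two outputs must be brought to a common size before they are fused.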

guanfuchen commented 5 years ago

architecture

image

image

image

image

image

guanfuchen commented 5 years ago

results

guanfuchen commented 5 years ago

conclusions

guanfuchen commented 5 years ago

implementation

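| model | dataset | batch size | data_augment | solver | lr | mIoU on test | mIoU on val |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BiSeNet_resnet18 | CamVid | 1 | True | Adam | 1e-4 Polynomial | 0.51 | 0.62 |

| model | dataset | dataset type | Sky | Building | Pole | Road | Pavement | Tree | SignSymbol | Fence | Car | Pedestrian | Bicyclist | Unlabelled |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BiSeNet_resnet18 | CamVid | test | 0.86 | 0.72 | 0.19 | 0.91 | 0.78 | 0.65 | 0.31 | 0.28 | 0.75 | 0.37 | 0.37 | 0.0 |
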
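
The `lr` column reads `1e-4 Polynomial`, which presumably means a base learning rate of 1e-4 with polynomial decay. A minimal sketch of such a schedule follows; the function name `poly_lr`, the exponent `power=0.9`, and the step count are assumptions, since the thread does not give them.

```python
def poly_lr(base_lr, step, max_steps, power=0.9):
    """Polynomial decay: base_lr * (1 - step / max_steps) ** power.

    `power=0.9` is an assumed value, not one taken from this thread.
    """
    if not 0 <= step <= max_steps:
        raise ValueError("step must lie in [0, max_steps]")
    return base_lr * (1 - step / max_steps) ** power


# Hypothetical schedule: 1000 training steps starting from lr = 1e-4.
schedule = [poly_lr(1e-4, s, 1000) for s in range(0, 1001, 250)]
for s, lr in zip(range(0, 1001, 250), schedule):
    print(s, lr)
```

The rate starts at the base value and falls to zero at the last step; a larger `power` makes the early decay steeper.
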
val
('Overall Acc : \t', 0.92)
('FreqW Acc : \t', 0.86)
('Mean Acc : \t', 0.77)
('Mean IoU : \t', 0.63)
('classes:', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
('class_iou_list:', [0.9, 0.85, 0.14, 0.96, 0.85, 0.9, 0.44, 0.64, 0.8, 0.46, 0.62, 0.0])

test
('Overall Acc : \t', 0.84)
('FreqW Acc : \t', 0.74)
('Mean Acc : \t', 0.68)
('Mean IoU : \t', 0.51)
('classes:', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
('class_iou_list:', [0.86, 0.72, 0.19, 0.91, 0.78, 0.65, 0.31, 0.28, 0.75, 0.37, 0.37, 0.0])
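
For reference, the four scores printed above are usually derived from a confusion matrix. The sketch below uses the common definitions, with rows as ground truth and columns as predictions. It is an assumption that the repository computes them the same way, in particular that `FreqW Acc` is the frequency-weighted IoU; the function name and the toy matrix are hypothetical.

```python
def segmentation_scores(conf):
    """Compute common segmentation metrics from a square confusion matrix.

    conf[i][j] counts pixels whose ground-truth class is i and whose
    predicted class is j.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    diag = [conf[i][i] for i in range(n)]
    gt = [sum(conf[i]) for i in range(n)]                           # pixels per true class
    pred = [sum(conf[i][j] for i in range(n)) for j in range(n)]    # pixels per predicted class

    overall_acc = sum(diag) / total

    # Per-class accuracy, skipping classes absent from the ground truth.
    class_acc = [diag[i] / gt[i] for i in range(n) if gt[i] > 0]
    mean_acc = sum(class_acc) / len(class_acc)

    # Per-class IoU; None marks classes that never occur at all.
    iou = []
    for i in range(n):
        union = gt[i] + pred[i] - diag[i]
        iou.append(diag[i] / union if union > 0 else None)
    valid = [v for v in iou if v is not None]
    mean_iou = sum(valid) / len(valid)

    # IoU weighted by how often each class occurs in the ground truth.
    freqw = sum(gt[i] / total * iou[i] for i in range(n) if gt[i] > 0)

    return {
        "Overall Acc": overall_acc,
        "Mean Acc": mean_acc,
        "FreqW Acc": freqw,
        "Mean IoU": mean_iou,
        "class_iou_list": iou,
    }


# Toy two-class confusion matrix (hypothetical data, not from the logs above).
scores = segmentation_scores([[3, 1], [1, 5]])
print(scores)
```

Note that this sketch skips classes that never occur when averaging, whereas the logs above appear to include the `Unlabelled` class (IoU 0.0) in the mean.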

image

image

image

image

related reference