
[NeurIPS 2023] FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models
https://arxiv.org/abs/2310.15160
MIT License

FreeMask

This codebase provides the official PyTorch implementation of our NeurIPS 2023 paper:

FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models
Lihe Yang, Xiaogang Xu, Bingyi Kang, Yinghuan Shi, Hengshuang Zhao
In Conference on Neural Information Processing Systems (NeurIPS), 2023
[Paper] [Datasets] [Models] [Logs] [BibTeX]

TL;DR

We generate diverse synthetic images from semantic masks and use these synthetic image-mask pairs to boost fully-supervised semantic segmentation performance.


Results

ADE20K

| Model | Backbone | Real Images (mIoU) | + Synthetic Images (mIoU) | Gain ($\Delta$) | Download |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Mask2Former | Swin-T | 48.7 | 52.0 | +3.3 | ckpt \| log |
| Mask2Former | Swin-S | 51.6 | 53.3 | +1.7 | ckpt \| log |
| Mask2Former | Swin-B | 52.4 | 53.7 | +1.3 | ckpt \| log |
| SegFormer | MiT-B2 | 45.6 | 47.9 | +2.3 | ckpt \| log |
| SegFormer | MiT-B4 | 48.5 | 50.6 | +2.1 | ckpt \| log |
| Segmenter | ViT-S | 46.2 | 47.9 | +1.7 | ckpt \| log |
| Segmenter | ViT-B | 49.6 | 51.1 | +1.5 | ckpt \| log |

COCO-Stuff-164K

| Model | Backbone | Real Images (mIoU) | + Synthetic Images (mIoU) | Gain ($\Delta$) | Download |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Mask2Former | Swin-T | 44.5 | 46.4 | +1.9 | ckpt \| log |
| Mask2Former | Swin-S | 46.8 | 47.6 | +0.8 | ckpt \| log |
| SegFormer | MiT-B2 | 43.5 | 44.2 | +0.7 | ckpt \| log |
| SegFormer | MiT-B4 | 45.8 | 46.6 | +0.8 | ckpt \| log |
| Segmenter | ViT-S | 43.5 | 44.8 | +1.3 | ckpt \| log |
| Segmenter | ViT-B | 46.0 | 47.5 | +1.5 | ckpt \| log |

High-Quality Synthetic Datasets

We share our already processed synthetic ADE20K and COCO-Stuff-164K datasets below. The ADE20K-Synthetic dataset is 20x larger than its real counterpart, while the COCO-Synthetic dataset is 6x larger than its real counterpart.

Getting Started

Installation

Install MMSegmentation:

```bash
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
pip install "mmsegmentation>=1.0.0"
pip install "mmdet>=3.0.0rc4"
```

Download Real Datasets

Follow the instructions to download the ADE20K and COCO-Stuff-164K real datasets. The COCO annotations need to be pre-processed following the instructions.
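
If you follow the standard MMSegmentation layout (an assumption here; the actual paths are whatever your configs point to), the real datasets typically end up organized as:

```
data/
├── ade/ADEChallengeData2016/
│   ├── images/{training,validation}/
│   └── annotations/{training,validation}/
└── coco_stuff164k/
    ├── images/{train2017,val2017}/
    └── annotations/{train2017,val2017}/
```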

Download Synthetic Datasets

Please see the High-Quality Synthetic Datasets section above.


Usage

Launch distributed training with your chosen config (the trailing argument is the number of GPUs):

```bash
bash dist_train.sh <config> 8
```
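
The released configs define the exact training setup. Purely as a rough illustration, mixing the real and synthetic pairs in an MMSegmentation 1.x config could look like the sketch below; the dataset roots and pipeline are hypothetical placeholders, not the paths or settings used by this repo.

```python
# Sketch of an MMSegmentation 1.x training dataset that concatenates real
# ADE20K with a synthetic copy sharing the same label space.
# All paths below are hypothetical placeholders.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', reduce_zero_label=True),
    dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackSegInputs'),
]

real_ade = dict(
    type='ADE20KDataset',
    data_root='data/ade/ADEChallengeData2016',
    data_prefix=dict(img_path='images/training',
                     seg_map_path='annotations/training'),
    pipeline=train_pipeline)

# Synthetic images re-use the ADE20K classes, so the same dataset class
# can load them from a different root.
synthetic_ade = dict(
    type='ADE20KDataset',
    data_root='data/ade20k_synthetic',          # hypothetical path
    data_prefix=dict(img_path='images/training',
                     seg_map_path='annotations/training'),
    pipeline=train_pipeline)

train_dataloader = dict(
    batch_size=2,
    num_workers=4,
    sampler=dict(type='InfiniteSampler', shuffle=True),
    dataset=dict(type='ConcatDataset', datasets=[real_ade, synthetic_ade]))
```

An invocation with a hypothetical config name would then look like `bash dist_train.sh configs/freemask/mask2former_swin-t_ade20k.py 8`.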

Generate and Pre-process Synthetic Images (Optional)

We have provided the processed synthetic images above, and you can directly use them to train a stronger segmentation model. If you want to generate additional images yourself, the generation and pre-processing steps are described below.

Generate Synthetic Images

We strictly follow FreestyleNet for initial image generation. Please refer to their instructions. You can change the random seed to produce multiple synthetic images from a semantic mask.
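
Purely as an illustration of the seed-variation idea (the identifiers and latent shape below are hypothetical placeholders, not FreestyleNet code): fixing a different seed before each sampling run changes the initial noise, so one semantic mask yields several distinct images.

```python
import torch

mask_name = 'ADE_train_00000001'        # hypothetical mask identifier
num_variants = 20                       # e.g. ~20 images per mask for ADE20K-Synthetic

for seed in range(num_variants):
    torch.manual_seed(seed)             # a different seed per run...
    latent = torch.randn(1, 4, 64, 64)  # ...gives different initial noise (placeholder shape)
    # Feed `latent` plus the semantic mask to the mask-to-image sampler here
    # and save the result as e.g. f'{mask_name}_seed{seed}.png'.
```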

Pre-process Synthetic Images

This pre-processing stage is the focus of our work.

Filter out Noisy Synthetic Regions

```bash
python preprocess/filter.py <config> <checkpoint> --real-img-path <> --real-mask-path <> --syn-img-path <> --syn-mask-path <> --filtered-mask-path <>
```

We use a pre-trained SegFormer-B4 model to calculate the class-wise mean loss on real images, and then use these statistics to filter out noisy synthetic regions.
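
`preprocess/filter.py` is the actual implementation. The sketch below only illustrates the underlying idea under simplified assumptions: a generic `model(image) -> logits` callable, un-batched loops, and a single `tolerance` factor are placeholders, and the exact thresholding in the script may differ. Synthetic pixels whose loss is far above the class-wise mean loss measured on real images are re-labeled as ignore so they do not harm training.

```python
import torch
import torch.nn.functional as F

IGNORE_INDEX = 255


@torch.no_grad()
def classwise_mean_loss(model, real_loader, num_classes, device='cuda'):
    """Per-class mean cross-entropy, measured on real image/mask pairs."""
    loss_sum = torch.zeros(num_classes, device=device)
    pixel_cnt = torch.zeros(num_classes, device=device)
    for image, mask in real_loader:                      # mask: (B, H, W) long class ids
        image, mask = image.to(device), mask.to(device)
        loss = F.cross_entropy(model(image), mask,       # model(image): (B, C, H, W) logits
                               ignore_index=IGNORE_INDEX, reduction='none')
        for c in range(num_classes):
            sel = mask == c
            loss_sum[c] += loss[sel].sum()
            pixel_cnt[c] += sel.sum()
    return loss_sum / pixel_cnt.clamp(min=1)


@torch.no_grad()
def filter_synthetic(model, syn_image, syn_mask, class_mean_loss, tolerance=1.25):
    """Mark synthetic pixels whose loss exceeds `tolerance` x the class-wise
    mean loss on real data as IGNORE_INDEX, i.e. treat them as noisy."""
    loss = F.cross_entropy(model(syn_image[None]), syn_mask[None],
                           ignore_index=IGNORE_INDEX, reduction='none')[0]
    safe_ids = syn_mask.clamp(max=class_mean_loss.numel() - 1)  # guard the 255 ignore label
    filtered = syn_mask.clone()
    filtered[loss > tolerance * class_mean_loss[safe_ids]] = IGNORE_INDEX
    return filtered
```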

Re-sample Synthetic Images based on Mask-level Hardness

```bash
python preprocess/resample.py --real-mask-path <> --syn-img-path <> --syn-mask-path <> --resampled-syn-img-path <> --resampled-syn-mask-path <>
```
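
`preprocess/resample.py` is the actual implementation; the sketch below only conveys the intuition under stated assumptions (the linear hardness-to-repetition mapping and `max_repeat` are illustrative choices, not the script's exact scheme). A real mask's hardness is taken from the class-wise mean losses of the classes it contains, and synthetic images generated from harder masks are kept more often than those from easy ones.

```python
import torch


def mask_hardness(real_mask, class_mean_loss, ignore_index=255):
    """Average the class-wise mean losses of the classes present in one mask."""
    classes = torch.unique(real_mask)
    classes = classes[classes != ignore_index]
    return class_mean_loss[classes].mean().item()


def repetitions_per_mask(real_masks, class_mean_loss, max_repeat=6):
    """Decide how many synthetic variants of each mask to keep:
    harder masks get more copies, easier masks fewer."""
    hardness = torch.tensor([mask_hardness(m, class_mean_loss) for m in real_masks])
    # Normalize hardness to [0, 1] and map linearly to {1, ..., max_repeat}.
    h = (hardness - hardness.min()) / (hardness.max() - hardness.min() + 1e-6)
    return (1 + (h * (max_repeat - 1)).round().long()).tolist()
```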

Acknowledgment

We thank FreestyleNet for providing their mask-to-image synthesis models.

Citation

If you find this project useful, please consider citing:


@inproceedings{freemask,
  title={FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models},
  author={Yang, Lihe and Xu, Xiaogang and Kang, Bingyi and Shi, Yinghuan and Zhao, Hengshuang},
  booktitle={NeurIPS},
  year={2023}
}