
[NeurIPS 2023] FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models
https://arxiv.org/abs/2310.15160
MIT License

FreeMask

This codebase provides the official PyTorch implementation of our NeurIPS 2023 paper:

FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models
Lihe Yang, Xiaogang Xu, Bingyi Kang, Yinghuan Shi, Hengshuang Zhao
In Conference on Neural Information Processing Systems (NeurIPS), 2023
[Paper] [Datasets] [Models] [Logs] [BibTeX]

TL;DR

We generate diverse synthetic images from semantic masks and use these synthetic image-mask pairs to boost fully-supervised semantic segmentation performance.


Results

ADE20K

| Model | Backbone | Real Images (mIoU) | + Synthetic Images (mIoU) | Gain ($\Delta$) | Download |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Mask2Former | Swin-T | 48.7 | 52.0 | +3.3 | ckpt \| log |
| Mask2Former | Swin-S | 51.6 | 53.3 | +1.7 | ckpt \| log |
| Mask2Former | Swin-B | 52.4 | 53.7 | +1.3 | ckpt \| log |
| SegFormer | MiT-B2 | 45.6 | 47.9 | +2.3 | ckpt \| log |
| SegFormer | MiT-B4 | 48.5 | 50.6 | +2.1 | ckpt \| log |
| Segmenter | ViT-S | 46.2 | 47.9 | +1.7 | ckpt \| log |
| Segmenter | ViT-B | 49.6 | 51.1 | +1.5 | ckpt \| log |

COCO-Stuff-164K

| Model | Backbone | Real Images (mIoU) | + Synthetic Images (mIoU) | Gain ($\Delta$) | Download |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Mask2Former | Swin-T | 44.5 | 46.4 | +1.9 | ckpt \| log |
| Mask2Former | Swin-S | 46.8 | 47.6 | +0.8 | ckpt \| log |
| SegFormer | MiT-B2 | 43.5 | 44.2 | +0.7 | ckpt \| log |
| SegFormer | MiT-B4 | 45.8 | 46.6 | +0.8 | ckpt \| log |
| Segmenter | ViT-S | 43.5 | 44.8 | +1.3 | ckpt \| log |
| Segmenter | ViT-B | 46.0 | 47.5 | +1.5 | ckpt \| log |

High-Quality Synthetic Datasets

We share our already processed synthetic ADE20K and COCO-Stuff-164K datasets below. The ADE20K-Synthetic dataset is 20x larger than its real counterpart, while the COCO-Synthetic dataset is 6x larger than its real counterpart.

Getting Started

Installation

Install MMSegmentation:

```bash
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
pip install "mmsegmentation>=1.0.0"
pip install "mmdet>=3.0.0rc4"
```

Download Real Datasets

Follow the instructions to download the ADE20K and COCO-Stuff-164K real datasets. The COCO annotations need to be pre-processed following the instructions.
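
If you follow the standard MMSegmentation layout (an assumption here; the actual paths are whatever your configs point to), the real datasets typically end up organized as:

```
data/
├── ade/ADEChallengeData2016/
│   ├── images/{training,validation}/
│   └── annotations/{training,validation}/
└── coco_stuff164k/
    ├── images/{train2017,val2017}/
    └── annotations/{train2017,val2017}/
```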

Download Synthetic Datasets

Please see the High-Quality Synthetic Datasets section above.


Usage

Launch distributed training with your chosen config (the trailing argument is the number of GPUs):

```bash
bash dist_train.sh <config> 8
```
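
The released configs define the exact training setup. Purely as a rough illustration, mixing the real and synthetic pairs in an MMSegmentation 1.x config could look like the sketch below; the dataset roots and pipeline are hypothetical placeholders, not the paths or settings used by this repo.

```python
# Sketch of an MMSegmentation 1.x training dataset that concatenates real
# ADE20K with a synthetic copy sharing the same label space.
# All paths below are hypothetical placeholders.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', reduce_zero_label=True),
    dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackSegInputs'),
]

real_ade = dict(
    type='ADE20KDataset',
    data_root='data/ade/ADEChallengeData2016',
    data_prefix=dict(img_path='images/training',
                     seg_map_path='annotations/training'),
    pipeline=train_pipeline)

# Synthetic images re-use the ADE20K classes, so the same dataset class
# can load them from a different root.
synthetic_ade = dict(
    type='ADE20KDataset',
    data_root='data/ade20k_synthetic',          # hypothetical path
    data_prefix=dict(img_path='images/training',
                     seg_map_path='annotations/training'),
    pipeline=train_pipeline)

train_dataloader = dict(
    batch_size=2,
    num_workers=4,
    sampler=dict(type='InfiniteSampler', shuffle=True),
    dataset=dict(type='ConcatDataset', datasets=[real_ade, synthetic_ade]))
```

An invocation with a hypothetical config name would then look like `bash dist_train.sh configs/freemask/mask2former_swin-t_ade20k.py 8`.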

Generate and Pre-process Synthetic Images (Optional)

We have provided the processed synthetic images above, and you can directly use them to train a stronger segmentation model. If you want to generate additional images yourself, the generation and pre-processing steps are described below.

Generate Synthetic Images

We strictly follow FreestyleNet for initial image generation. Please refer to their instructions. You can change the random seed to produce multiple synthetic images from a semantic mask.
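
Purely as an illustration of the seed-variation idea (the identifiers and latent shape below are hypothetical placeholders, not FreestyleNet code): fixing a different seed before each sampling run changes the initial noise, so one semantic mask yields several distinct images.

```python
import torch

mask_name = 'ADE_train_00000001'        # hypothetical mask identifier
num_variants = 20                       # e.g. ~20 images per mask for ADE20K-Synthetic

for seed in range(num_variants):
    torch.manual_seed(seed)             # a different seed per run...
    latent = torch.randn(1, 4, 64, 64)  # ...gives different initial noise (placeholder shape)
    # Feed `latent` plus the semantic mask to the mask-to-image sampler here
    # and save the result as e.g. f'{mask_name}_seed{seed}.png'.
```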

Pre-process Synthetic Images

This pre-processing stage is the focus of our work.

Filter out Noisy Synthetic Regions

```bash
python preprocess/filter.py <config> <checkpoint> --real-img-path <> --real-mask-path <> --syn-img-path <> --syn-mask-path <> --filtered-mask-path <>
```

We use a pre-trained SegFormer-B4 model to calculate the class-wise mean loss on real images, and then use these statistics to filter out noisy synthetic regions.
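
`preprocess/filter.py` is the actual implementation. The sketch below only illustrates the underlying idea under simplified assumptions: a generic `model(image) -> logits` callable, un-batched loops, and a single `tolerance` factor are placeholders, and the exact thresholding in the script may differ. Synthetic pixels whose loss is far above the class-wise mean loss measured on real images are re-labeled as ignore so they do not harm training.

```python
import torch
import torch.nn.functional as F

IGNORE_INDEX = 255


@torch.no_grad()
def classwise_mean_loss(model, real_loader, num_classes, device='cuda'):
    """Per-class mean cross-entropy, measured on real image/mask pairs."""
    loss_sum = torch.zeros(num_classes, device=device)
    pixel_cnt = torch.zeros(num_classes, device=device)
    for image, mask in real_loader:                      # mask: (B, H, W) long class ids
        image, mask = image.to(device), mask.to(device)
        loss = F.cross_entropy(model(image), mask,       # model(image): (B, C, H, W) logits
                               ignore_index=IGNORE_INDEX, reduction='none')
        for c in range(num_classes):
            sel = mask == c
            loss_sum[c] += loss[sel].sum()
            pixel_cnt[c] += sel.sum()
    return loss_sum / pixel_cnt.clamp(min=1)


@torch.no_grad()
def filter_synthetic(model, syn_image, syn_mask, class_mean_loss, tolerance=1.25):
    """Mark synthetic pixels whose loss exceeds `tolerance` x the class-wise
    mean loss on real data as IGNORE_INDEX, i.e. treat them as noisy."""
    loss = F.cross_entropy(model(syn_image[None]), syn_mask[None],
                           ignore_index=IGNORE_INDEX, reduction='none')[0]
    safe_ids = syn_mask.clamp(max=class_mean_loss.numel() - 1)  # guard the 255 ignore label
    filtered = syn_mask.clone()
    filtered[loss > tolerance * class_mean_loss[safe_ids]] = IGNORE_INDEX
    return filtered
```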

Re-sample Synthetic Images based on Mask-level Hardness

```bash
python preprocess/resample.py --real-mask-path <> --syn-img-path <> --syn-mask-path <> --resampled-syn-img-path <> --resampled-syn-mask-path <>
```
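
`preprocess/resample.py` is the actual implementation; the sketch below only conveys the intuition under stated assumptions (the linear hardness-to-repetition mapping and `max_repeat` are illustrative choices, not the script's exact scheme). A real mask's hardness is taken from the class-wise mean losses of the classes it contains, and synthetic images generated from harder masks are kept more often than those from easy ones.

```python
import torch


def mask_hardness(real_mask, class_mean_loss, ignore_index=255):
    """Average the class-wise mean losses of the classes present in one mask."""
    classes = torch.unique(real_mask)
    classes = classes[classes != ignore_index]
    return class_mean_loss[classes].mean().item()


def repetitions_per_mask(real_masks, class_mean_loss, max_repeat=6):
    """Decide how many synthetic variants of each mask to keep:
    harder masks get more copies, easier masks fewer."""
    hardness = torch.tensor([mask_hardness(m, class_mean_loss) for m in real_masks])
    # Normalize hardness to [0, 1] and map linearly to {1, ..., max_repeat}.
    h = (hardness - hardness.min()) / (hardness.max() - hardness.min() + 1e-6)
    return (1 + (h * (max_repeat - 1)).round().long()).tolist()
```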

Acknowledgment

We thank FreestyleNet for providing their mask-to-image synthesis models.

Citation

If you find this project useful, please consider citing:


@inproceedings{freemask,
  title={FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models},
  author={Yang, Lihe and Xu, Xiaogang and Kang, Bingyi and Shi, Yinghuan and Zhao, Hengshuang},
  booktitle={NeurIPS},
  year={2023}
}