
Official Implementation of Semantic Image Synthesis via Diffusion Models

Semantic Image Synthesis via Diffusion Models (SDM)


Paper

Weilun Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li.

Abstract

We provide our PyTorch implementation of Semantic Image Synthesis via Diffusion Models (SDM). In this paper, we propose a novel framework based on DDPM for semantic image synthesis. Unlike previous conditional diffusion models, which directly feed the semantic layout and the noisy image together into a U-Net and may not fully leverage the information in the input semantic mask, our framework processes the semantic layout and the noisy image differently. It feeds the noisy image into the encoder of the U-Net, while the semantic layout is injected into the decoder through multi-layer spatially-adaptive normalization operators. To further improve generation quality and semantic interpretability, we introduce a classifier-free guidance sampling strategy, which incorporates the score of an unconditional model into the sampling process. Extensive experiments on three benchmark datasets demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance in terms of fidelity (FID) and diversity (LPIPS).
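A minimal sketch of the classifier-free guidance step described above, assuming a noise-prediction U-Net; the names `model`, `null_label_map`, and `guidance_scale` are illustrative only and do not refer to this repo's actual API:

```python
import torch

@torch.no_grad()
def guided_eps(model, x_t, t, label_map, null_label_map, guidance_scale):
    """Combine conditional and unconditional noise predictions (illustrative sketch)."""
    eps_cond = model(x_t, t, label_map)          # prediction conditioned on the semantic layout
    eps_uncond = model(x_t, t, null_label_map)   # prediction with the layout dropped / nulled out
    # Push the estimate away from the unconditional score and toward the conditional one.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```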

Example Results

Prerequisites

Dataset Preparation

The Cityscapes and ADE20K datasets can be downloaded and prepared following SPADE. CelebAMask-HQ can be downloaded from CelebAMask-HQ; you then need to merge the separate per-part annotations into a single label image per face (in the same format as the other datasets, e.g. Cityscapes and ADE20K).
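A minimal sketch of that merging step, assuming the released CelebAMask-HQ layout with one binary mask per facial part named like 00001_skin.png; the part list, class indices, mask size, and paths below are assumptions, so adapt them to your local copy and label convention:

```python
import os
import numpy as np
from PIL import Image

# Assumed part names and ordering; the class index of each part is its position here.
PARTS = ['skin', 'nose', 'eye_g', 'l_eye', 'r_eye', 'l_brow', 'r_brow',
         'l_ear', 'r_ear', 'mouth', 'u_lip', 'l_lip', 'hair', 'hat',
         'ear_r', 'neck_l', 'neck', 'cloth']

def merge_masks(anno_dir, index, size=512):
    """Merge per-part binary masks for one face into a single label image."""
    label = np.zeros((size, size), dtype=np.uint8)        # 0 = background
    for class_id, part in enumerate(PARTS, start=1):
        path = os.path.join(anno_dir, f'{index:05d}_{part}.png')
        if os.path.exists(path):                           # not every part exists for every face
            mask = np.array(Image.open(path).convert('L'))
            label[mask > 0] = class_id                     # later parts overwrite earlier ones
    return Image.fromarray(label)
```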

SDM Training and Test

Please refer to 'scripts/ade20.sh' for more details.

Apply a pre-trained SDM model and evaluate

Pretrained Models (to be updated)

| Dataset | Download link |
| --- | --- |
| Cityscapes | Visual results |
| ADE20K | Checkpoint \| Visual results |
| CelebAMask-HQ | Checkpoint \| Visual results |
| COCO-Stuff | Checkpoint \| Visual results |

Acknowledgements

Our code is developed based on guided-diffusion. We also thank the authors of "test_with_FID.py" in OASIS, which we use for FID computation, and "lpips.py" in stargan-v2, which we use for LPIPS computation.
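For reference, a small sketch of the LPIPS diversity measurement using the pip package `lpips` (AlexNet backbone) instead of the bundled "lpips.py"; the tensor layout (images in [-1, 1], shape (N, 3, H, W)) and the pairwise averaging scheme are assumptions:

```python
import itertools
import torch
import lpips

loss_fn = lpips.LPIPS(net='alex')  # AlexNet-based LPIPS metric

@torch.no_grad()
def mean_pairwise_lpips(samples: torch.Tensor) -> float:
    """Average LPIPS distance over all pairs of samples generated for one label map."""
    dists = [loss_fn(samples[i:i + 1], samples[j:j + 1]).item()
             for i, j in itertools.combinations(range(samples.size(0)), 2)]
    return sum(dists) / len(dists)
```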