YanzuoLu / CFLD

[CVPR 2024 Highlight] Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis
MIT License


Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis
Yanzuo Lu, Manlin Zhang, Andy J Ma, Xiaohua Xie, Jian-Huang Lai
IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), June 17-21, 2024, Seattle, USA

[Figure: qualitative results]

TL;DR

If you want to cite and compare with our method, please download the generated images from Google Drive here (256x176 and 512x352 on DeepFashion, and 128x64 on Market-1501).

[Figure: pipeline]

News 🔥🔥🔥

Preparation

Install Environment

conda env create -f environment.yaml

Download DeepFashion Dataset

Download Pre-trained Models

Training

For multi-GPU training with the default configuration, run the following command. The argument is the comma-separated list of GPU ids to use.

bash scripts/multi_gpu/pose_transfer_train.sh 0,1,2,3,4,5,6,7

For single-GPU training with the default configuration, run the following command. The argument is the id of the GPU to use.

bash scripts/single_gpu/pose_transfer_train.sh 0

For ablation studies, specify the config file with --config_file, for example:

bash scripts/multi_gpu/pose_transfer_train.sh 0,1,2,3,4,5,6,7 --config_file configs/ablation_study/no_app.yaml
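To run several ablations back to back, the command above can be wrapped in a loop. The following is a minimal sketch, not part of the repository: it assumes configs/ablation_study/ holds one YAML file per ablation (only no_app.yaml is named in this README), and the run_ablations function and the GPUS and DRY_RUN variables are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run the multi-GPU training script once per ablation config.
# Assumption: the given directory contains one YAML file per ablation.
# GPUS and DRY_RUN are hypothetical variables, not options of the repo scripts.
GPUS="${GPUS:-0,1,2,3,4,5,6,7}"

run_ablations() {
  local config_dir="$1"
  local cfg
  for cfg in "$config_dir"/*.yaml; do
    # Skip the unexpanded glob pattern when no YAML files are present.
    [ -e "$cfg" ] || continue
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Print the command instead of launching training.
      echo "bash scripts/multi_gpu/pose_transfer_train.sh $GPUS --config_file $cfg"
    else
      bash scripts/multi_gpu/pose_transfer_train.sh "$GPUS" --config_file "$cfg"
    fi
  done
}
```

With the function defined, `DRY_RUN=1 run_ablations configs/ablation_study` prints the commands without launching them, which is a cheap way to check the config list before committing GPUs to a long run.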

Inference

For multi-GPU inference, specify the checkpoint path with MODEL.PRETRAINED_PATH, for example:

bash scripts/multi_gpu/pose_transfer_test.sh 0,1,2,3,4,5,6,7 MODEL.PRETRAINED_PATH checkpoints

For single-GPU inference, specify the checkpoint path with MODEL.PRETRAINED_PATH, for example:

bash scripts/single_gpu/pose_transfer_test.sh 0 MODEL.PRETRAINED_PATH checkpoints
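The two inference commands differ only in the script directory and in whether the GPU argument is a single id or a comma-separated list. As a small convenience sketch (the pick_test_script helper is hypothetical and not provided by the repository), the script can be chosen from the GPU list itself:

```shell
#!/usr/bin/env bash
# Sketch: choose the single- or multi-GPU test script from a GPU id list.
# pick_test_script is a hypothetical helper, not part of the repository.
pick_test_script() {
  local gpus="$1"
  case "$gpus" in
    # A comma means more than one GPU id was given.
    *,*) echo "scripts/multi_gpu/pose_transfer_test.sh" ;;
    *)   echo "scripts/single_gpu/pose_transfer_test.sh" ;;
  esac
}
```

Assuming the helper is defined and GPUS holds the id list, inference could then be started with `bash "$(pick_test_script "$GPUS")" "$GPUS" MODEL.PRETRAINED_PATH checkpoints`.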

Citation

@inproceedings{lu2024coarse,
  title={Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis},
  author={Lu, Yanzuo and Zhang, Manlin and Ma, Andy J and Xie, Xiaohua and Lai, Jian-Huang},
  booktitle={CVPR},
  year={2024}
}