Official PyTorch implementation of our paper "DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis" by Ming Tao, Hao Tang, Fei Wu, Xiao-Yuan Jing, Bing-Kun Bao, Changsheng Xu.
[CVPR 2023] Our new model, GALIP (paper link, code link), is simple and effective and achieves results comparable to large pretrained diffusion models. GALIP is also training-efficient: it requires only 3% of the training data and 6% of the learnable parameters. It synthesizes images about 120× faster and can run inference on a CPU.
GALIP significantly lowers the hardware threshold for training and inference. We hope it lets more users discover how interesting AIGC can be.
Clone this repo.
git clone https://github.com/tobran/DF-GAN
cd DF-GAN
pip install -r requirements.txt
cd code/
data/
data/birds/
data/coco/images/
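Before training, it can help to confirm that the dataset directories named above exist. The following is a minimal sketch, not part of the repository: the helper `missing_dataset_dirs` is hypothetical, and the base directory the paths are resolved against is an assumption you should adjust to wherever `data/` lives in your checkout.

```python
from pathlib import Path

# Directories named in this README. The base directory they are relative to
# is an assumption; pass whichever directory contains data/ in your checkout.
EXPECTED_DIRS = ["data", "data/birds", "data/coco/images"]


def missing_dataset_dirs(base_dir):
    """Return the expected dataset directories that do not exist under base_dir."""
    base = Path(base_dir)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]
```

Calling `missing_dataset_dirs(".")` from the directory that contains `data/` returns an empty list once everything is in place; otherwise it lists the directories that still need to be created or extracted.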
cd DF-GAN/code/
bash scripts/train.sh ./cfg/bird.yml
bash scripts/train.sh ./cfg/coco.yml
If your training process is interrupted unexpectedly, set resume_epoch and resume_model_path in train.sh to resume training.
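To choose a value for `resume_epoch`, you first need to know the last epoch that was saved. The sketch below scans a directory for the highest-numbered checkpoint. It is only an illustration: the file name pattern `state_epoch_<N>.pth` and the helper `latest_checkpoint` are assumptions, so adapt the pattern to the file names you actually see in your saved-model directory.

```python
import re
from pathlib import Path

# Assumed (hypothetical) checkpoint naming scheme: "state_epoch_<N>.pth".
CKPT_PATTERN = re.compile(r"state_epoch_(\d+)\.pth$")


def latest_checkpoint(model_dir):
    """Return (epoch, path) of the highest-numbered checkpoint, or None if there is none."""
    best = None
    for path in Path(model_dir).iterdir():
        match = CKPT_PATTERN.search(path.name)
        if match is None:
            continue  # skip files that are not checkpoints
        epoch = int(match.group(1))
        if best is None or epoch > best[0]:
            best = (epoch, path)
    return best
```

The returned epoch is the number to put in `resume_epoch`; `resume_model_path` should then point at the matching saved-model location.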
Our code supports automatic FID evaluation during training; the results are stored in TensorBoard files under ./logs. You can change how often evaluation runs by editing test_interval in the YAML file.
tensorboard --logdir=./code/logs/bird/train --port 8166
tensorboard --logdir=./code/logs/coco/train --port 8177
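Editing `test_interval` by hand is usually enough, but if you launch many runs you may want to script the change. The sketch below rewrites the value in the text of a config file using only the standard library. It assumes `test_interval` is a top-level key written as `test_interval: <value>` on its own line; the helper `set_test_interval` is hypothetical and not part of this repository.

```python
import re

# Matches a top-level "test_interval: <value>" line (assumed config layout).
_TEST_INTERVAL = re.compile(r"^(test_interval\s*:\s*)\S+", flags=re.MULTILINE)


def set_test_interval(yaml_text, interval):
    """Return yaml_text with the top-level test_interval value replaced."""
    new_text, count = _TEST_INTERVAL.subn(
        lambda m: m.group(1) + str(interval), yaml_text
    )
    if count == 0:
        raise KeyError("test_interval not found in config text")
    return new_text
```

Read the YAML file as text, pass it through `set_test_interval`, and write the result to a new config file so the original stays untouched.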
./code/saved_models/bird/
./code/saved_models/coco/
We synthesize about 30,000 images from the test descriptions and evaluate the FID between the synthesized images and the test images of each dataset.
cd DF-GAN/code/
bash scripts/calc_FID.sh ./cfg/bird.yml
bash scripts/calc_FID.sh ./cfg/coco.yml
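For readers unfamiliar with the metric, the standard definition of FID is given below. It is stated here as general background and is not taken from this repository's scripts. The symbols are notation introduced for this note: $\mu_r, \Sigma_r$ are the mean and covariance of Inception features of the real test images, and $\mu_g, \Sigma_g$ are the same statistics for the synthesized images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values mean the two feature distributions are closer, which is why the results table marks FID with a down arrow.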
The released model achieves better performance than the CVPR paper version.
| Model | CUB FID ↓ | COCO FID ↓ | NOP (number of parameters) ↓ |
|---|---|---|---|
| DF-GAN (paper) | 14.81 | 19.32 | 19M |
| DF-GAN (pretrained model) | 12.10 | 15.41 | 18M |
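To put the table in perspective, the relative FID reduction of the released model over the paper version can be computed directly from the reported numbers. The snippet below is a small worked example; the function name `relative_reduction` is introduced here for illustration only.

```python
# FID values copied from the table above: (paper version, released model).
FID_SCORES = {"CUB": (14.81, 12.10), "COCO": (19.32, 15.41)}


def relative_reduction(paper, released):
    """Percentage by which the released model lowers FID relative to the paper."""
    return (paper - released) / paper * 100.0


for dataset, (paper, released) in FID_SCORES.items():
    print(f"{dataset}: {relative_reduction(paper, released):.1f}% lower FID")
```

This works out to roughly 18% lower FID on CUB and 20% lower FID on COCO.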
cd DF-GAN/code/
bash scripts/sample.sh ./cfg/bird.yml
bash scripts/sample.sh ./cfg/coco.yml
The synthesized images are saved to ./code/samples.
If you find DF-GAN useful in your research, please consider citing our paper:
@inproceedings{tao2022df,
title={DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis},
author={Tao, Ming and Tang, Hao and Wu, Fei and Jing, Xiao-Yuan and Bao, Bing-Kun and Xu, Changsheng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={16515--16525},
year={2022}
}
The code is released for academic research use only. For commercial use, please contact Ming Tao.