SceneGenie: Scene Graph Guided Diffusion Models for Image Synthesis
A suitable conda environment named `ldm` can be created and activated with:
conda env create -f environment.yaml
conda activate ldm
A general list of all available checkpoints is provided in the model zoo. If you use any of these models in your work, we are always happy to receive a citation.
Download the pre-trained weights (5.7GB):
mkdir -p models/ldm/text2img-large/
wget -O models/ldm/text2img-large/model.ckpt https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt
and sample with:
python scripts/txt2img.py --scene_genie --data_source coco --ddim_eta 0.0 --n_samples 4 --n_iter 4 --scale 5.0 --ddim_steps 50
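Here, `--scale` sets the classifier-free guidance weight and `--ddim_eta 0.0` makes DDIM sampling deterministic. As a minimal sketch of what the guidance scale does (the helper name `cfg_combine` and the plain-list "tensors" are illustrative only, not part of this codebase):

```python
def cfg_combine(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: push the conditional noise prediction
    away from the unconditional one by `scale` (the --scale 5.0 flag
    above). scale=1.0 recovers the purely conditional prediction."""
    return [eu + scale * (ec - eu) for eu, ec in zip(eps_uncond, eps_cond)]

# Toy noise predictions for a 3-element "latent".
uncond = [0.0, 0.2, -0.1]
cond = [1.0, 0.4, 0.1]
print(cfg_combine(uncond, cond, 5.0))  # amplifies the conditional direction
```

Larger `--scale` values follow the conditioning (here, the scene graph / text prompt) more closely at the cost of sample diversity.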
Our codebase builds heavily on the Stable Diffusion / Latent Diffusion (LDM) codebase. Thanks for open-sourcing!
The implementation of the transformer encoder is from x-transformers by lucidrains.
@inproceedings{farshad2023scenegenie,
title={{SceneGenie}: Scene Graph Guided Diffusion Models for Image Synthesis},
author={Farshad, Azade and Yeganeh, Yousef and Chi, Yu and Shen, Chengzhi and Ommer, Bj{\"o}rn and Navab, Nassir},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={88--98},
year={2023}
}