This repository contains an unofficial implementation of Google's paper Sketch-Guided Text-to-Image Diffusion Models. The goal of this project is to generate high-quality images from textual descriptions and corresponding sketches.
This implementation was inspired by and references the following repositories:
The Sketch-Guided Text-to-Image Diffusion Models project generates realistic images that follow both a textual description and a corresponding sketch: a Latent Edge Predictor (LEP), trained on the internal activations of a pretrained diffusion model, steers the denoising process toward the target edge map.
git clone --recurse-submodules https://github.com/sangminkim-99/Sketch-Guided-Text-To-Image.git
cd Sketch-Guided-Text-To-Image
conda create -n sketch-guided-env python=3.9
conda activate sketch-guided-env
Use conda and pip to install the required packages:
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia # change to your own version of torch/CUDA
pip install -r requirements.txt
Download the necessary datasets.
Download indoor images from the ImageNet dataset with ImageNet-Dataset-Downloader:
chmod u+x scripts/download_imagenet_room_dataset.sh
./scripts/download_imagenet_room_dataset.sh
Generate edge maps with pidinet:
chmod u+x scripts/generate_edge_map.sh
./scripts/generate_edge_map.sh
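pidinet is a learned edge detector, so the script above produces much cleaner maps than any hand-written filter. As a rough, dependency-free illustration of what an edge map is (a gradient-magnitude stand-in, not pidinet's output), consider:

```python
# Toy edge map: per-pixel gradient magnitude of a 2D grayscale image.
# This is only an illustration of the concept; the repo uses pidinet.
def edge_map(img):
    """Return the gradient-magnitude edge map of a list-of-lists image."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences, clamped at the image border.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# A vertical step edge responds strongly along the boundary columns.
img = [[0, 0, 1, 1]] * 4
edges = edge_map(img)
```

The guidance signal used during sampling is conceptually a map like this, just predicted from the diffusion model's latent features rather than from pixels.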
python app.py --help
Currently only --batch-size 1 is supported.
python app.py train-lep
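`train-lep` trains the paper's Latent Edge Predictor: a small per-pixel MLP that maps concatenated intermediate U-Net activations to an edge value. The shape of that predictor can be sketched as below; the dimensions, weights, and feature vector are hypothetical placeholders, not the trained model:

```python
import math
import random

random.seed(0)

# Hypothetical dimensions: the real LEP consumes concatenated per-pixel
# U-Net activations (thousands of channels); 8 -> 4 -> 1 is illustrative only.
IN_DIM, HIDDEN = 8, 4

# Untrained random weights, stand-ins for the learned parameters.
w1 = [[random.uniform(-0.5, 0.5) for _ in range(IN_DIM)] for _ in range(HIDDEN)]
w2 = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)]

def lep_forward(feat):
    """Predict an edge probability for one pixel's stacked feature vector."""
    hidden = [max(0.0, sum(w * f for w, f in zip(row, feat)))  # ReLU layer
              for row in w1]
    logit = sum(w * h for w, h in zip(w2, hidden))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid -> value in (0, 1)

p = lep_forward([0.1] * IN_DIM)
```

Training fits this mapping so that, during sampling, the predicted edge map can be compared against the input sketch to guide generation.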
python app.py sample --sketch-file-path {PATH} --prompt {PROMPT}
python app.py demo
[ ] Reproduce the bicycle example
[ ] Upload pretrained LEP
We would like to express our gratitude to the authors of the original paper and the developers of the referenced repositories for their valuable contributions, which served as the foundation for this implementation.
This is an unofficial implementation and is not affiliated with Google or the authors of the original paper.