haoosz / ViCo

Official PyTorch code for the paper: "ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation"
MIT License

ViCo

ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation

⚙️ Set-up

Create a conda environment named vico and activate it:

conda env create -f environment.yaml
conda activate vico
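
After activating the environment, a quick sanity check can confirm that the key packages are importable. The snippet below is a minimal sketch and is not part of the repository; the package list is an assumption (only `torch` is implied by this being a PyTorch codebase) and should be adjusted to match environment.yaml.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found
    in the currently active Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list; edit it to match environment.yaml.
    required = ["torch"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Run it inside the activated vico environment; any package it reports as missing points to an incomplete environment creation step.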

⏬ Download

Download the pretrained Stable Diffusion v1-4 checkpoint and place it under models/ldm/stable-diffusion-v1.
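
To confirm the checkpoint ended up in the expected place, a small check like the following may help. It is a sketch under stated assumptions: the README does not name the checkpoint file, so the code simply looks for any file with a `.ckpt` extension in that directory, and `find_checkpoints` is a hypothetical helper, not a function from this repository.

```python
from pathlib import Path


def find_checkpoints(model_dir="models/ldm/stable-diffusion-v1"):
    """Return the sorted .ckpt files directly inside `model_dir`.

    Returns an empty list if the directory does not exist.
    The .ckpt extension is an assumption about the downloaded file.
    """
    directory = Path(model_dir)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.glob("*.ckpt") if p.is_file())


if __name__ == "__main__":
    found = find_checkpoints()
    if found:
        for path in found:
            size_gb = path.stat().st_size / 1e9
            print(f"Found checkpoint: {path} ({size_gb:.2f} GB)")
    else:
        print("No .ckpt file found under models/ldm/stable-diffusion-v1")
```

Run it from the repository root so the relative path resolves correctly.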

We provide pretrained checkpoints saved at 300, 350, and 400 steps for each of 8 objects. You can download the sample images together with their corresponding pretrained checkpoints, or download the data for an individual object:

| Object | Sample images | Checkpoints |
|---|---|---|
| barn | image | ckpt |
| batman | image | ckpt |
| clock | image | ckpt |
| dog7 | image | ckpt |
| monster toy | image | ckpt |
| pink sunglasses | image | ckpt |
| teddybear | image | ckpt |
| wooden pot | image | ckpt |

The datasets were originally collected and provided by Textual Inversion, DreamBooth, and Custom Diffusion. You can find all datasets used for quantitative comparison in our paper.

🚀 Inference

Before running the inference command, please set:

💻 Training

Before running the training command, please set:

📖 Citation

If you use this code in your research, please consider citing our paper:

@inproceedings{Hao2023ViCo,
  title={ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation},
  author={Shaozhe Hao and Kai Han and Shihao Zhao and Kwan-Yee K. Wong},
  year={2023}
}

💐 Acknowledgements

This code repository is based on the great work of Textual Inversion. Thanks!