Image demos can be found on the HiCo project page; some of them were contributed by the community. You can produce your own personalized generations with the inference code below.
We tested our inference code on a machine with a 24 GB NVIDIA RTX 3090 GPU and CUDA 12.1.
git clone https://github.com/360CVGroup/HiCo_T2I.git
cd HiCo_T2I
conda create -n HiCo python=3.10
conda activate HiCo
pip install -r requirements.txt
cd diffusers
pip install .
git lfs install
# HiCo checkpoint
git clone https://huggingface.co/qihoo360/HiCo_T2I models/controlnet
# stable-diffusion-v1-5
git clone https://huggingface.co/krnl/realisticVisionV51_v51VAE models/realisticVisionV51_v51VAE
CUDA_VISIBLE_DEVICES=0 python infer-avg.py
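The inference script consumes a layout: a set of bounding boxes, each paired with a subcaption. The exact input format is defined by infer-avg.py in the repository; as a rough sketch, pixel-space boxes are typically normalized to the [0, 1] range before being fed to a layout-to-image pipeline (the helper below is illustrative and not part of the HiCo API):

```python
# Illustrative helper, not part of the HiCo codebase: scale pixel-space
# (x1, y1, x2, y2) boxes by the image size to get relative coordinates.

def normalize_bbox(bbox, width, height):
    """Return the box with all coordinates divided by the image size."""
    x1, y1, x2, y2 = bbox
    return (x1 / width, y1 / height, x2 / width, y2 / height)

# A layout in the spirit of list_bbox_info: (subcaption, coordinates) pairs.
layout = [
    ("a red car", (64, 128, 256, 384)),
    ("a tall tree", (300, 50, 480, 500)),
]

width, height = 512, 512
normalized = [(cap, normalize_bbox(box, width, height)) for cap, box in layout]
print(normalized[0][1])  # (0.125, 0.25, 0.5, 0.75)
```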
The JSON structure of the dataset is:
dataset
├──base_info
│ ├──id
│ ├──width
│ ├──height
│ ├──f_path
├──caption
├──obj_nums
├──img_size
│ ├──H
│ ├──W
├──path_img (f_path)
├──list_bbox_info
│ ├──subcaption
│ ├──coordinates(x1,y1,x2,y2)
│ │......
├──crop_location
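To make the schema above concrete, here is a hedged sketch of building and sanity-checking one record with Python's standard json module. All field values are illustrative, and the exact shape of crop_location follows the repository's data-preparation scripts:

```python
import json

# One dataset record following the schema above (values are illustrative).
record = {
    "base_info": {"id": "0001", "width": 512, "height": 512, "f_path": "imgs/0001.jpg"},
    "caption": "a red car parked next to a tall tree",
    "obj_nums": 2,
    "img_size": {"H": 512, "W": 512},
    "path_img": "imgs/0001.jpg",  # same value as base_info.f_path
    "list_bbox_info": [
        ["a red car", [64, 128, 256, 384]],    # subcaption, (x1, y1, x2, y2)
        ["a tall tree", [300, 50, 480, 500]],
    ],
    "crop_location": [0, 0, 512, 512],  # illustrative; format set by the data scripts
}

# Minimal sanity checks before training.
assert record["obj_nums"] == len(record["list_bbox_info"])
for _, (x1, y1, x2, y2) in record["list_bbox_info"]:
    assert 0 <= x1 < x2 <= record["img_size"]["W"]
    assert 0 <= y1 < y2 <= record["img_size"]["H"]

print(json.dumps(record)[:40])
```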
Then you can launch training:
accelerate launch train_hico.py
@misc{cheng2024hicohierarchicalcontrollablediffusion,
title={HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation},
author={Bo Cheng and Yuhang Ma and Liebucha Wu and Shanyuan Liu and Ao Ma and Xiaoyu Wu and Dawei Leng and Yuhui Yin},
year={2024},
eprint={2410.14324},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2410.14324},
}
This project is licensed under the Apache License (Version 2.0).