wty-ustc / HairCLIP

[CVPR 2022] HairCLIP: Design Your Hair by Text and Reference Image
GNU Lesser General Public License v2.1
508 stars 66 forks source link

HairCLIP: Design Your Hair by Text and Reference Image (CVPR2022)

This repository hosts the official PyTorch implementation of the paper: "HairCLIP: Design Your Hair by Text and Reference Image".

Our single framework supports hairstyle and hair color editing individually or jointly, and conditional inputs can come from either image or text domain.

Tianyi Wei1, Dongdong Chen2, Wenbo Zhou1, Jing Liao3, Zhentao Tan1, Lu Yuan2, Weiming Zhang1, Nenghai Yu1
1University of Science and Technology of China, 2Microsoft Cloud AI, 3City University of Hong Kong

News

2023.10.12: We propose the more performant HairCLIPv2 that supports various interaction modalities, which is accepted by ICCV2023! 🎉
2022.03.19: Our testing code and pretrained model are released.
2022.03.09: Our training code is released.
2022.03.02: Our paper is accepted by CVPR2022 and the code will be released soon.

Web Demo

Upload your own image and try HairCLIP here with Replicate Replicate

Getting Started

Prerequisites

$ conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
$ pip install ftfy regex tqdm
$ pip install git+https://github.com/openai/CLIP.git
$ pip install tensorflow-io

Pretrained Model

Please download the pre-trained model from the following link. The HairCLIP model contains the entire architecture, including the mapper and decoder weights. Path Description
HairCLIP Our pre-trained HairCLIP model.

If you wish to use the pretrained model for training or inference, you may do so using the flag --checkpoint_path.

Auxiliary Models and Latent Codes

In addition, we provide various auxiliary models and latent codes inverted by e4e needed for training your own HairCLIP model from scratch. Path Description
FFHQ StyleGAN StyleGAN model pretrained on FFHQ taken from rosinality with 1024x1024 output resolution.
IR-SE50 Model Pretrained IR-SE50 model taken from TreB1eN for use in our ID loss during HairCLIP training.
Train Set CelebA-HQ train set latent codes inverted by e4e.
Test Set CelebA-HQ test set latent codes inverted by e4e.

By default, we assume that all auxiliary models are downloaded and saved to the directory pretrained_models.

Training

Training HairCLIP

The main training script can be found in scripts/train.py.
Intermediate training results are saved to opts.exp_dir. This includes checkpoints, train outputs, and test outputs.
Additionally, if you have tensorboard installed, you can visualize tensorboard logs in opts.exp_dir/logs.

Training the HairCLIP Mapper

cd mapper
python scripts/train.py \
--exp_dir=/path/to/experiment \
--hairstyle_description="hairstyle_list.txt" \
--color_description="purple, red, orange, yellow, green, blue, gray, brown, black, white, blond, pink" \
--latents_train_path=/path/to/train_faces.pt \
--latents_test_path=/path/to/test_faces.pt \
--hairstyle_ref_img_train_path=/path/to/celeba_hq_train \
--hairstyle_ref_img_test_path=/path/to/celeba_hq_val \
--color_ref_img_train_path=/path/to/celeba_hq_train \
--color_ref_img_test_path=/path/to/celeba_hq_val \
--color_ref_img_in_domain_path=/path/to/generated_hair_of_various colors \
--hairstyle_manipulation_prob=0.5 \
--color_manipulation_prob=0.2 \
--both_manipulation_prob=0.27 \
--hairstyle_text_manipulation_prob=0.5 \
--color_text_manipulation_prob=0.5 \
--color_in_domain_ref_manipulation_prob=0.25 \

Additional Notes

Acknowledgements

This code is based on StyleCLIP.

Citation

If you find our work useful for your research, please consider citing the following papers :)

@article{wei2022hairclip,
  title={Hairclip: Design your hair by text and reference image},
  author={Wei, Tianyi and Chen, Dongdong and Zhou, Wenbo and Liao, Jing and Tan, Zhentao and Yuan, Lu and Zhang, Weiming and Yu, Nenghai},
  journal={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2022}
}
@article{wei2023hairclipv2,
  title={HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending},
  author={Wei, Tianyi and Chen, Dongdong and Zhou, Wenbo and Liao, Jing and Zhang, Weiming and Hua, Gang and Yu, Nenghai},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2023}
}