
## PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

[![Paper page](https://huggingface.co/datasets/huggingface/badges/resolve/main/paper-page-md-dark.svg)](https://huggingface.co/papers/2312.04461)

[[Paper](https://huggingface.co/papers/2312.04461)]   [[Project Page](https://photo-maker.github.io)]   [[Model Card](https://huggingface.co/TencentARC/PhotoMaker)]
[[🤗 Demo (Realistic)](https://huggingface.co/spaces/TencentARC/PhotoMaker)]   [[🤗 Demo (Stylization)](https://huggingface.co/spaces/TencentARC/PhotoMaker-Style)]
[[Replicate Demo (Realistic)](https://replicate.com/jd7h/photomaker)]   [[Replicate Demo (Stylization)](https://replicate.com/yorickvp/photomaker-style)]

If the ID fidelity is not enough for you, please try our [stylization application](https://huggingface.co/spaces/TencentARC/PhotoMaker-Style); you may be pleasantly surprised.

Official implementation of PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding.

🌠 Key Features:

  1. Rapid customization within seconds, with no additional LoRA training.
  2. Ensures impressive ID fidelity, offering diversity, promising text controllability, and high-quality generation.
  3. Can serve as an adapter that works with other base models and with LoRA modules from the community.

❗❗ Note: If you know of any PhotoMaker-based resources or applications, please post them in the discussion and we will list them in the Related Resources section of the README. We currently know of implementations for Replicate, Windows, ComfyUI, and WebUI. Thank you all!

![photomaker_demo_fast](https://github.com/TencentARC/PhotoMaker/assets/21050959/e72cbf4d-938f-417d-b308-55e76a4bc5c8)

🚩 New Features/Updates


🔥 Examples

Realistic generation

Stylization generation

Note: the only changes are swapping the base model and adding LoRA modules for better stylization.

🔧 Dependencies and Installation

```bash
# Install requirements
pip install -r requirements.txt

# Install photomaker
pip install git+https://github.com/TencentARC/PhotoMaker.git
```
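
After installing, it can be handy to confirm that the package is importable in the current environment. The following is a minimal sketch using only the standard library. The helper name `is_installed` is hypothetical and not part of PhotoMaker; `photomaker` is the module name used by the import statement in this README.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module `module_name` can be located."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("photomaker installed:", is_installed("photomaker"))
```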


Then you can import it in Python:
```python
from photomaker import PhotoMakerStableDiffusionXLPipeline
```

⏬ Download Models

The model will be automatically downloaded through the following two lines:

```python
from huggingface_hub import hf_hub_download
photomaker_path = hf_hub_download(repo_id="TencentARC/PhotoMaker", filename="photomaker-v1.bin", repo_type="model")
```

You can also download the checkpoint manually from the `TencentARC/PhotoMaker` model repository on Hugging Face.
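
If you sometimes download the checkpoint manually, a small helper can prefer a local copy and fall back to the Hub only when needed. This is a sketch under assumptions: the function name `resolve_photomaker_path` and its `local_dir` argument are hypothetical; only `hf_hub_download` and its arguments come from the snippet above.

```python
import os


def resolve_photomaker_path(local_dir=None, filename="photomaker-v1.bin"):
    """Return a path to the PhotoMaker checkpoint.

    Prefers a manually downloaded file in `local_dir` (hypothetical argument);
    otherwise falls back to the automatic Hub download.
    """
    if local_dir is not None:
        candidate = os.path.join(local_dir, filename)
        if os.path.isfile(candidate):
            return candidate

    # Imported lazily so the local-file branch works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id="TencentARC/PhotoMaker",
        filename=filename,
        repo_type="model",
    )
```

The returned value can be used wherever `photomaker_path` is expected.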

💻 How to Test

Use like diffusers

```python
import os

import torch
from diffusers import EulerDiscreteScheduler
from photomaker import PhotoMakerStableDiffusionXLPipeline

# `base_model_path`, `device`, and `photomaker_path` must be defined beforehand.

# Load base model
pipe = PhotoMakerStableDiffusionXLPipeline.from_pretrained(
    base_model_path,  # can change to any base model based on SDXL
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
    variant="fp16",
).to(device)

# Load PhotoMaker checkpoint
pipe.load_photomaker_adapter(
    os.path.dirname(photomaker_path),
    subfolder="",
    weight_name=os.path.basename(photomaker_path),
    trigger_word="img",  # define the trigger word
)

pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

# Can also cooperate with other LoRA modules
# (optional; uncomment after defining `lora_path` and `lora_model_name`)
# pipe.load_lora_weights(os.path.dirname(lora_path), weight_name=lora_model_name, adapter_name="xl_more_art-full")
# pipe.set_adapters(["photomaker", "xl_more_art-full"], adapter_weights=[1.0, 0.5])

pipe.fuse_lora()
```
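
The adapter is loaded with `trigger_word="img"`, so prompts are expected to contain that word. A minimal pre-flight check is sketched below. It assumes, without confirmation from this section, that the trigger word should appear exactly once as a standalone word. The helpers `count_trigger_word` and `check_prompt` are hypothetical and use plain whitespace splitting rather than the pipeline's tokenizer.

```python
import string


def count_trigger_word(prompt: str, trigger_word: str = "img") -> int:
    """Count standalone occurrences of `trigger_word` in `prompt`."""
    # Strip surrounding punctuation so "img," still counts as the trigger word.
    words = [w.strip(string.punctuation) for w in prompt.split()]
    return sum(1 for w in words if w == trigger_word)


def check_prompt(prompt: str, trigger_word: str = "img") -> None:
    """Raise ValueError unless the trigger word appears exactly once (assumed rule)."""
    n = count_trigger_word(prompt, trigger_word)
    if n != 1:
        raise ValueError(
            f"expected exactly one '{trigger_word}' in the prompt, found {n}"
        )
```

For example, `check_prompt("a man img wearing a hat")` passes silently, while a prompt without `img` raises `ValueError`.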


- Input ID Images
```py
import os
from diffusers.utils import load_image

# Define the input ID images
input_folder_name = './examples/newton_man'
image_basename_list = os.listdir(input_folder_name)
image_path_list = sorted([os.path.join(input_folder_name, basename) for basename in image_basename_list])

input_id_images = []
for image_path in image_path_list:
    input_id_images.append(load_image(image_path))
```
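
`os.listdir` returns every entry in the folder, including hidden files and non-image files, which `load_image` may fail to open. As a hedged sketch, the listing step could filter by extension first. The helper name `list_image_paths` and the set of accepted extensions are assumptions for illustration, not part of PhotoMaker.

```python
import os

# Assumed set of image extensions; adjust to the formats you actually use.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def list_image_paths(folder: str) -> list:
    """Return sorted paths of image files directly inside `folder`."""
    paths = []
    for basename in os.listdir(folder):
        path = os.path.join(folder, basename)
        ext = os.path.splitext(basename)[1].lower()
        # Skip sub-directories, hidden files such as .DS_Store, and other non-images.
        if os.path.isfile(path) and ext in IMAGE_EXTENSIONS:
            paths.append(path)
    return sorted(paths)
```

With this helper, `image_path_list = list_image_paths(input_folder_name)` could replace the two listing lines before the loading loop.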

Start a local gradio demo

Run the following command:

```bash
python gradio_demo/app.py
```

You can customize the demo by editing `gradio_demo/app.py`.

To run it on a Mac, follow this instruction first, then run `app.py`.
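
Since the hardware differs between machines, including Macs, the `device` passed to `.to(device)` can be chosen at runtime. The following is a sketch only: the helper name `pick_device` is hypothetical, and it falls back to `"cpu"` when PyTorch is not installed. Whether a given dtype such as `torch.bfloat16` works on each backend is not covered here.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # Without PyTorch no accelerator backend can be queried.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend targets Apple GPUs; guard for PyTorch builds without it.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("selected device:", pick_device())
```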

Usage Tips:

Related Resources

Replicate demo of PhotoMaker:

  1. Demo link: run PhotoMaker on Replicate, provided by @yorickvP and @jd7h.
  2. Demo link (style version).

WebUI version of PhotoMaker:

  1. stable-diffusion-webui-forge: https://github.com/lllyasviel/stable-diffusion-webui-forge provided by @Lvmin Zhang
  2. Fooocus App: Fooocus-inswapper provided by @machineminded

Windows version of PhotoMaker:

  1. bmaltais/PhotoMaker by @bmaltais, easy to deploy PhotoMaker on Windows. The description can be found in this link.
  2. sdbds/PhotoMaker-for-windows by @sdbds.

ComfyUI:

  1. 🔥 Official Implementation by ComfyUI: https://github.com/comfyanonymous/ComfyUI/commit/d1533d9c0f1dde192f738ef1b745b15f49f41e02
  2. https://github.com/ZHO-ZHO-ZHO/ComfyUI-PhotoMaker
  3. https://github.com/StartHua/Comfyui-Mine-PhotoMaker
  4. https://github.com/shiimizu/ComfyUI-PhotoMaker

Purely C/C++/CUDA version of PhotoMaker:

  1. stable-diffusion.cpp by @bssrdf.

Other Applications / Web Demos

  1. Wisemodel 始智 (Easy to use in China) https://wisemodel.cn/space/gradio/photomaker
  2. OpenXLab (Easy to use in China): https://openxlab.org.cn/apps/detail/camenduru/PhotoMaker by @camenduru.
  3. Colab: https://github.com/camenduru/PhotoMaker-colab by @camenduru
  4. Monster API: https://monsterapi.ai/playground?model=photo-maker
  5. Pinokio: https://pinokio.computer/item?uri=https://github.com/cocktailpeanutlabs/photomaker

Gradio demo in 45 lines

Provided by @Gradio

🤗 Acknowledgements

Disclaimer

This project strives to impact the domain of AI-driven image generation positively. Users are granted the freedom to create images using this tool, but they are expected to comply with local laws and utilize it responsibly. The developers do not assume any responsibility for potential misuse by users.

BibTeX

If you find PhotoMaker useful for your research and applications, please cite using this BibTeX:


@inproceedings{li2023photomaker,
  title={PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding},
  author={Li, Zhen and Cao, Mingdeng and Wang, Xintao and Qi, Zhongang and Cheng, Ming-Ming and Shan, Ying},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}