Feel free to PR if you'd like :)
cc @asomoza
How do I do this? I can put up a PR if someone can guide me. Thanks.
Hi, we should probably wait until we merge the img2img pipeline before this, because we basically need to reuse it and just add the controlnet parts.
You can read how to contribute to diffusers here.
But in short: clone the repo, make a branch, work on it, and then push it back; this will open a PR.
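For reference, the flow looks roughly like this; the fork URL and branch name below are placeholders, and the ".[dev]" extra follows the diffusers contributing guide:

# Minimal sketch of the standard GitHub contribution flow for diffusers.
# <your-username> and the branch name are placeholders.
git clone https://github.com/<your-username>/diffusers.git   # clone your fork
cd diffusers
git checkout -b flux-controlnet-img2img                      # work on a branch
pip install -e ".[dev]"                                      # editable dev install
# ...edit the code, add tests...
git add -A
git commit -m "Add Flux controlnet img2img support"
git push -u origin flux-controlnet-img2img
# then open a PR from your fork on github.com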
Thank you for the update and the guidance! I noticed that with the release of diffusers==0.30.2, img2img pipeline support is now available, which is great to see!
However, I don't have expertise in working directly with packages, so I would like to kindly request that the team consider adding native support for Flux ControlNet alongside the img2img functionality. This would make it much easier for users like myself to leverage the powerful capabilities of Flux ControlNet without needing to manually integrate it.
I appreciate all the work being done to keep the library up to date, and I hope this feature can be included in the future!
Thank you for your kind words, we really appreciate them.
Native support will probably come when someone from the team has the bandwidth to do it, or maybe sooner if someone from the community contributes it.
Just have a little patience; controlnets are really important for any model, so support will eventually be added.
@asomoza feel free to open an issue so that someone from the community can pick it up
@a-r-r-o-w added img2img for Flux. Thanks a lot for this.
I just copied the code from the commit here:
import torch
from diffusers import FluxImg2ImgPipeline
from diffusers.utils import load_image

device = "cuda"
# FLUX.1-schnell is a timestep-distilled model, so it only needs a few steps
pipe = FluxImg2ImgPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)
pipe = pipe.to(device)

# load and resize the init image for img2img
url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
init_image = load_image(url).resize((1024, 1024))

prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
# strength controls how much of the init image is repainted;
# guidance_scale=0.0 because schnell was distilled without guidance
image = pipe(prompt=prompt, image=init_image, num_inference_steps=4, strength=0.95, guidance_scale=0.0).images[0]
image.save("cat.png")
It works like a charm in the following environment.
## Using PyTorch 2.4.0 and CUDA 11.8
conda create -y -n diffusers_dev python=3.10
conda activate diffusers_dev
pip3 install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu118
pip3 install xformers==0.0.27.post2 --index-url https://download.pytorch.org/whl/cu118
pip3 install "huggingface_hub[cli]" accelerate protobuf sentencepiece
pip3 install git+https://github.com/huggingface/transformers.git
# either install diffusers straight from git:
# pip3 install git+https://github.com/huggingface/diffusers.git
# or clone it for an editable install:
git clone https://github.com/huggingface/diffusers.git
cd diffusers
pip install -e ".[torch]"
python3 -c "from huggingface_hub.hf_api import HfFolder; HfFolder.save_token('your hf token')"
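A quick sanity check of the install (not part of the original recipe, just a suggested one-liner):

python3 -c "import torch, diffusers; print(torch.__version__, torch.cuda.is_available(), diffusers.__version__)"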
Though it ran without any error and gave an image output, I doubt whether it is doing what is intended; I can't tell, as I'm new to diffusion-based image generation models.
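If the doubt is about whether anything meaningful happens at strength=0.95 with only 4 steps: in diffusers img2img pipelines, strength decides how far into the noise schedule the init image is pushed, and therefore how many of the requested steps actually run. A minimal sketch of that arithmetic (an approximation of the usual get_timesteps logic, not code copied from the Flux pipeline):

# Rough sketch of how img2img pipelines in diffusers derive the number of
# denoising steps that actually run from `strength`; an approximation, not
# the exact Flux implementation.
def effective_steps(num_inference_steps: int, strength: float) -> int:
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return num_inference_steps - t_start  # steps that actually denoise

print(effective_steps(4, 0.95))  # 3 -> three of the four steps run

So the run above really does denoise, starting from a heavily noised version of the sketch; with strength that high the prompt dominates the init image, which matches the "cat wizard" output.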
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Closing this because these pipelines are now supported, I believe. Feel free to reopen if there's something remaining. Great work everyone!
Is your feature request related to a problem? Please describe.
I'm always frustrated when I can't leverage the powerful capabilities of Flux ControlNet with the img2img pipeline in the diffusers library. Flux ControlNet has recently emerged as a potential new state-of-the-art model for img2img tasks, and the lack of integration with diffusers limits the ability to explore and utilize this model within the ecosystem.

Describe the solution you'd like.
I would love to see the diffusers package include support for a FluxControlNetImg2ImgPipeline, similar to the existing StableDiffusionXLControlNetImg2ImgPipeline. This addition would enable users to fully utilize the Flux ControlNet model for img2img tasks, allowing for more flexibility and innovation in generating images.

Describe alternatives you've considered.
As an alternative, I've considered manually integrating Flux ControlNet with the existing pipelines, but this approach is complex and not as seamless or efficient as having native support within the diffusers library. Another option is waiting for updates or third-party implementations, but having official support would be the most reliable and user-friendly solution.

Additional context.
Flux ControlNet was released recently and is quickly becoming a significant player in the AI image generation space. Integrating this with diffusers would be a huge benefit to the community and help keep the library at the forefront of AI advancements. Thank you for considering this feature!
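For illustration only, usage could mirror the SDXL variant. Everything below is a hypothetical sketch: FluxControlNetImg2ImgPipeline is the requested class and does not exist in diffusers at the time of writing, the checkpoint ID InstantX/FLUX.1-dev-Controlnet-Canny is a community Flux controlnet used as an example, and the argument names are assumptions modeled on StableDiffusionXLControlNetImg2ImgPipeline:

import torch
from diffusers import FluxControlNetModel  # this class does exist for the text2img controlnet pipeline
from diffusers.utils import load_image

# Hypothetical: FluxControlNetImg2ImgPipeline is the class requested above,
# not a real diffusers API at the time of writing.
controlnet = FluxControlNetModel.from_pretrained(
    "InstantX/FLUX.1-dev-Controlnet-Canny", torch_dtype=torch.bfloat16
)
pipe = FluxControlNetImg2ImgPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", controlnet=controlnet, torch_dtype=torch.bfloat16
).to("cuda")

init_image = load_image("input.jpg").resize((1024, 1024))     # image to transform
control_image = load_image("canny.png").resize((1024, 1024))  # conditioning map

image = pipe(
    prompt="a fantasy landscape, detailed, 8k",
    image=init_image,
    control_image=control_image,
    strength=0.8,                       # how strongly to repaint the init image
    controlnet_conditioning_scale=0.6,  # how strongly the controlnet steers
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("out.png")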