Update v0.2: fixed the wrong nodes connecting to the Florence2 node.
Update 08-11-2024: After some fiddling, I found a way to reproduce the high-quality ControlNet images demonstrated on their GitHub/HF page. I also found that the two sampling methods can be combined and reorganized into a simpler, more efficient approach. I will update to v0.3 soon to include these changes.
I've created an All-in-One FluxDev workflow in ComfyUI that combines various techniques for generating images with the FluxDev model, including img-to-img and text-to-img. The workflow supports LoRAs, ControlNets, negative prompting with KSampler, dynamic thresholding, inpainting, and more. Please note that this is not the "correct" way of using these techniques, but rather my personal interpretation based on the available information.
Heavily Utilizing the USE Everywhere Node
This workflow relies heavily on the USE Everywhere node to make it as clean and efficient as possible for my daily generation needs. I'm sharing this workflow with the community to gather insights and suggestions for improvement. Feel free to experiment on your own.
- Text encoders (flux_text_encoders) go in ComfyUI/models/clip
- VAE (ae.sft / ae.safetensors) goes in ComfyUI/models/vae
- ControlNets go in ComfyUI/models/controlnet (create the folder if needed)
- LoRAs go in ComfyUI/models/loras (create the folder if needed)

Low VRAM Setup:
Launch ComfyUI with the "--lowvram" argument to offload the text encoder to the CPU, e.g. `python main.py --lowvram` (or add the flag to your .bat launch file).
At the time of creating this workflow, there are two available ControlNets and several LoRAs, including:
I've only tested the Canny ControlNet and the Realism LoRA from XLabs-AI, and here are some key takeaways:
git checkout xlabs_flux_controlnet
The KSampler workflow with dynamic thresholding is based on the official ComfyUI blog post, and I quote:
> Note for both models you can either use `SamplerCustomAdvanced` with `BasicGuider`, or if you use `KSampler`, set `CFG` to `1`. You can use the new `FluxGuidance` on the Dev model to control the distilled CFG-like value. (Setting this to 2 is recommended for realism or better style control.) These models are trained to work without real CFG. That's not to say you can never use CFG though - in fact, the community has rapidly taken advantage of ComfyUI as an experimentation platform to test out a wide variety of tricks to get the most out of the new models. (Such as using the Dynamic Thresholding custom node, or using the new `FluxGuidance` built-in node to compensate, and enable CFG and negative prompting. There's also `ModelSamplingFlux` built-in to control Flux sigma shift, though its benefits are more limited.)
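As background, the Dynamic Thresholding node is built around the Imagen-style trick of clamping the denoised prediction so that high CFG values don't blow out colors. Here is a minimal sketch of the core operation (the actual custom node adds mimic-scale and interpolation options; `dynamic_threshold` below is an illustration, not its API):

```python
import numpy as np

def dynamic_threshold(x0_pred: np.ndarray, percentile: float = 0.995) -> np.ndarray:
    """Imagen-style dynamic thresholding: clamp the predicted clean sample to
    the given percentile of its absolute values, then rescale into [-1, 1].
    This tames the oversaturation that high CFG scales otherwise cause."""
    s = np.quantile(np.abs(x0_pred), percentile)
    s = max(s, 1.0)  # never shrink values that are already in range
    return np.clip(x0_pred, -s, s) / s
```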
Keep in mind that this is my own interpretation; feel free to make changes and experiment.
Zero-shot, non-cherry-picked demo with this sampling method:
You can find the repo here.
The Pixel Resolution Calculator is a custom node I developed yesterday with the help of Llama 3.1 (yes, I have no programming skills; I'm learning from scratch along the way). It's a very simple node that generates the closest "latent-friendly" pixel resolution from your chosen megapixel count and aspect ratio. I took inspiration from the ImageScaleToTotalPixels node in the original Flux demo workflow, since everyone talks about pixel resolution in megapixels rather than width and height in pixels as in SDXL. There is also a node to convert a latent sample input to width and height in pixels.
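The core math is tiny. A minimal sketch, assuming "latent-friendly" means both sides snap to a multiple of 8 (one latent cell per 8 pixels in SD-style VAEs; the real node may round to a different multiple):

```python
import math

def closest_latent_friendly_resolution(megapixels: float, aspect_ratio: float,
                                       multiple: int = 8) -> tuple[int, int]:
    """Return (width, height) close to the requested megapixel count and
    aspect ratio, with both sides rounded to a latent-friendly multiple."""
    total_pixels = megapixels * 1_000_000
    # Solve width/height = aspect_ratio with width * height = total_pixels
    width = math.sqrt(total_pixels * aspect_ratio)
    height = width / aspect_ratio
    # Snap both sides to the nearest multiple
    width = max(multiple, round(width / multiple) * multiple)
    height = max(multiple, round(height / multiple) * multiple)
    return int(width), int(height)

print(closest_latent_friendly_resolution(1.0, 16 / 9))  # -> (1336, 752)
```

The companion latent-to-size conversion is just the inverse: multiply the latent's spatial dimensions by 8 to recover pixel width and height.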
An upscaling workflow is also included. It uses the Iterative Upscale (Image) node from the Impact Pack together with tiled diffusion to create a high-res-fix-style upscaling and detailing node group with the upscale model of your choice. You can also schedule denoise, CFG, and steps with the PK hook.
Demo image compare here.
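To make the high-res-fix idea concrete, here is a minimal sketch of the loop that Iterative Upscale automates; the two helpers are trivial stand-ins for the upscale model and the tiled diffusion pass, not the Impact Pack API:

```python
def upscale_with_model(image, scale):  # stand-in for e.g. an ESRGAN upscale node
    return {"pixels": image["pixels"], "scale": image["scale"] * scale}

def sample_img2img(image, denoise):    # stand-in for a tiled diffusion img2img pass
    print(f"re-diffusing at scale {image['scale']:.2f} with denoise {denoise:.2f}")
    return image

def iterative_upscale(image, target_scale=2.0, iterations=3,
                      start_denoise=0.45, end_denoise=0.25):
    # Upscale in several small steps, re-diffusing at low denoise each time
    # so detail is added gradually instead of in one large jump.
    scale_per_step = target_scale ** (1 / iterations)
    for i in range(iterations):
        image = upscale_with_model(image, scale_per_step)
        # Schedule denoise downward so later passes refine rather than repaint
        t = i / max(iterations - 1, 1)
        denoise = start_denoise + (end_denoise - start_denoise) * t
        image = sample_img2img(image, denoise=denoise)
    return image

iterative_upscale({"pixels": None, "scale": 1.0})
```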
Since no inpainting model has been trained for Flux yet, only the simplest form of inpainting can be achieved here. You can also try to incorporate ControlNets, but pay attention to the square-based resolution and the guidance scale (4).
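That "simplest form" boils down to masked latent blending: the sampler is free to rewrite the masked region while the rest of the latent is kept from the source image. A minimal sketch with a toy latent (names are illustrative, not ComfyUI internals):

```python
import numpy as np

def blend_masked_latent(denoised, original, mask):
    """Naive latent inpainting step: mask is 1.0 where new content is wanted
    and 0.0 where the source latent must be preserved. Without an
    inpainting-trained model this blend is all we can do, so visible seams
    can appear at the mask boundary."""
    return mask * denoised + (1.0 - mask) * original

# Toy 4x4 latent: rewrite only the upper-left quadrant
original = np.zeros((4, 4))
denoised = np.ones((4, 4))
mask = np.zeros((4, 4))
mask[:2, :2] = 1.0
print(blend_masked_latent(denoised, original, mask))
```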
Just some simple nodes to run Ollama and Florence2, using vision LLMs for detailed captioning and prompt insights. I'm using LLaVA 13B and Florence2-large in the demo. You will need Ollama, plus the Ollama ComfyUI and Florence2 ComfyUI nodes; see the links for detailed usage and installation guides.
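Outside of ComfyUI, you can reproduce the captioning step with the ollama Python client, assuming the Ollama server is running locally and the llava:13b model has been pulled ("input.png" is a placeholder path):

```python
import ollama  # pip install ollama; talks to a locally running Ollama server

# Ask LLaVA 13B to caption an image, the same job the ComfyUI node performs.
response = ollama.chat(
    model="llava:13b",
    messages=[{
        "role": "user",
        "content": "Describe this image in detail for use as a diffusion prompt.",
        "images": ["input.png"],  # path to the image to caption
    }],
)
print(response["message"]["content"])
```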
A more detailed guide will be added if people find this hard to use.
Flux is such a flexible model, and given that it's a first version, it's very impressive. Within two weeks of release, there were already ControlNets and LoRAs available, which shows how much the community loves this model. I'm now looking forward to some inpainting models. And most importantly, Matteo, please release an IPAdapter for Flux... please. It's the one missing puzzle piece, and then I'm complete...
Happy Generating!
P.S. I'm including the prompt I use with Llama 3.1 to help me do spell checks and grammar checks for this very repo, for no reason:
Act as a professional writer with a strong writing skill set and a deep understanding of writing in general. Assist users in rewriting, reformatting, and performing grammar and spell checks upon request. Your tasks should include:
Additionally, please:
Do you understand these requirements?