BlenderNeko / ComfyUI_Cutoff

cutoff implementation for ComfyUI
GNU General Public License v3.0

Cutoff for ComfyUI

screenshot of workflow

what is cutoff?

cutoff is a script/extension for the Automatic1111 webui that lets users limit the effect certain attributes have on specified subsets of the prompt. For example, when the prompt is "a cute girl, white shirt with green tie, red shoes, blue hair, yellow eyes, pink skirt", cutoff lets you specify that the word "blue" belongs to the hair and not the shoes, "green" to the tie and not the skirt, and so on. This is an implementation of cutoff in the form of 4 nodes that can be used in ComfyUI.

how does this work?

When you provide Stable Diffusion with some text, that text gets tokenized and CLIP creates a vector (embedding) for each token in the text. So if we have a prompt containing "blue hair, yellow eyes", some of the vectors coming out of CLIP will correspond to the "blue hair" part, and some to the "yellow eyes" part. When CLIP does this, it tries to take the context of the entire sentence into consideration. Unfortunately, CLIP isn't always great at figuring out that the "blue" in "blue hair" should really only modify the noun "hair" and not the noun "eyes" a bit further along in the sentence.
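To make the "some vectors correspond to this part of the prompt" idea concrete, here is a minimal sketch that tokenizes a prompt and finds which token positions belong to a phrase. The `toy_tokenize` and `find_span` helpers are hypothetical and written only for illustration. CLIP actually uses a BPE tokenizer that can split a word into several tokens, so real positions may differ.

```python
import re

def toy_tokenize(prompt):
    """Lowercase and split into words and punctuation.

    Assumption: a simple stand-in for CLIP's BPE tokenizer, not the real thing.
    """
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", prompt.lower())

def find_span(tokens, phrase_tokens):
    """Return the indices of the first occurrence of phrase_tokens in tokens, or []."""
    n = len(phrase_tokens)
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == phrase_tokens:
            return list(range(start, start + n))
    return []

tokens = toy_tokenize("blue hair, yellow eyes")
hair_span = find_span(tokens, toy_tokenize("blue hair"))
eyes_span = find_span(tokens, toy_tokenize("yellow eyes"))

print(tokens)     # ['blue', 'hair', ',', 'yellow', 'eyes']
print(hair_span)  # [0, 1]
print(eyes_span)  # [3, 4]
```

Since the encoder outputs one vector per token, these index lists are also the positions of the vectors that belong to each phrase.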

So how do we deal with this? We can mask out the tokens corresponding to "blue" and ask CLIP to create another embedding. In this new embedding, the vectors corresponding to "yellow eyes" are not affected by "blue", because "blue" wasn't among the tokens. If we then take the difference between the original vectors and these new vectors, we have a direction we can travel in to make the eyes more affected by "yellow" and less by "blue". If we do this for all the color relations in the text, we can travel to an embedding where each of these relations is more isolated. Of course, this effect isn't limited to colors.
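The mask-and-difference idea can be sketched with a toy encoder. Everything here is an assumption made for illustration: `toy_encode` is a hypothetical stand-in for CLIP that mixes each token with the prompt mean so that context matters, `"_"` is an arbitrary mask token, and `cutoff` is not the code these nodes actually run.

```python
import hashlib

DIM = 4     # toy embedding size (assumption; CLIP vectors are much larger)
MASK = "_"  # arbitrary mask token chosen for this sketch

def token_vector(token):
    """Deterministic toy embedding for one token (stand-in for a learned table)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:DIM]]

def toy_encode(tokens):
    """Context-dependent toy encoder: each output mixes the token with the prompt mean."""
    base = [token_vector(t) for t in tokens]
    mean = [sum(col) / len(base) for col in zip(*base)]
    return [[0.5 * v + 0.5 * m for v, m in zip(vec, mean)] for vec in base]

def cutoff(tokens, target, region, weight=1.0):
    """Move the vectors at the `region` indices toward an encoding with `target` masked."""
    original = toy_encode(tokens)
    masked = toy_encode([MASK if t == target else t for t in tokens])
    adjusted = [list(v) for v in original]
    for i in region:
        # direction = masked - original; weight controls how far we travel along it
        adjusted[i] = [o + weight * (m - o) for o, m in zip(original[i], masked[i])]
    return original, masked, adjusted

tokens = ["blue", "hair", ",", "yellow", "eyes"]
# Reduce the influence of "blue" on the "yellow eyes" vectors (indices 3 and 4).
original, masked, adjusted = cutoff(tokens, target="blue", region=[3, 4], weight=1.0)
```

With `weight=1.0` the "yellow eyes" vectors end up at their masked values, while the "blue hair" vectors are left untouched. Smaller weights travel only part of the way along the difference direction.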

ComfyUI nodes

To achieve all of this, the following 4 nodes are introduced:

Cutoff BasePrompt: this node takes the full original prompt

Cutoff Set Region: this node sets a "region" of influence for specific target words, and comes with the following inputs:

Cutoff Regions To Conditioning: this node converts the base prompt and regions into an actual conditioning to be used in the rest of ComfyUI, and comes with the following inputs:

Cutoff Regions To Conditioning (ADV): provides the same functionality as the above node but also provides options on how to interpret prompt weighting. More on these settings can be found here.

You can find these nodes under conditioning>cutoff

SDXL

The nodes won't throw any errors when used with SDXL, but at least for 0.9 I didn't find them to work that well.

Finally, here are some example images that you can load into ComfyUI:

first example generation of a cute girl, white shirt with green tie, red shoes, blue hair, yellow eyes, pink skirt using cutoff