klimaleksus / stable-diffusion-webui-embedding-merge

Extension for AUTOMATIC1111/stable-diffusion-webui for creating and merging Textual Inversion embeddings at runtime from string literals.
The Unlicense
106 stars 10 forks

Extend the token length if at all possible! #6

Closed: duskfallcrew closed this issue 6 months ago

duskfallcrew commented 12 months ago

Dunno if it IS POSSIBLE, and I'm aware this is sort of updated at will, but you saved me a TON OF TIME AND I MADE LIKE 10+ embeds last night with this.

Some prompts have 100-200 tokens, and if possible it'd be interesting to see if you COULD in theory extend the token length with this plugin.

<3 Much adoration. ThANK YOU!!

aleksusklim commented 12 months ago

WebUI splits long prompts into chunks of 75 tokens each. It tries to do this intelligently: for example, when it reaches the 75th token, it backtracks to find the nearest comma and splits there, rather than in the middle of a sentence. (You can control how far it may backtrack with a dedicated option in the WebUI Settings; by default it's 20.)

When you reach this limit, your xx/75 token counter becomes xx/150. That's how you can check how many "chunks" you have.
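
The chunking described above can be sketched roughly like this. It is a simplified illustration, not WebUI's actual code: `split_into_chunks` and `counter_limit` are hypothetical names, tokens are plain strings instead of CLIP token ids, and keeping the comma in the earlier chunk is an assumption of this sketch.

```python
CHUNK_SIZE = 75      # tokens per chunk, as described above
MAX_BACKTRACK = 20   # default backtrack distance mentioned above


def split_into_chunks(tokens, chunk_size=CHUNK_SIZE, max_backtrack=MAX_BACKTRACK):
    """Split a flat token list into chunks of at most chunk_size tokens.

    When a chunk fills up, look back up to max_backtrack tokens for a comma
    and cut right after it; if none is found, cut hard at chunk_size.
    """
    chunks = []
    start = 0
    while len(tokens) - start > chunk_size:
        end = start + chunk_size
        cut = end
        # Search backwards from the chunk boundary for the nearest comma.
        for i in range(end - 1, max(start, end - 1 - max_backtrack), -1):
            if tokens[i] == ",":
                cut = i + 1  # assumption: the comma stays in the earlier chunk
                break
        chunks.append(tokens[start:cut])
        start = cut
    if start < len(tokens):
        chunks.append(tokens[start:])
    return chunks


def counter_limit(chunks, chunk_size=CHUNK_SIZE):
    """Right-hand number of the xx/75, xx/150, ... counter."""
    return max(1, len(chunks)) * chunk_size


# 70 words, a comma, then 30 more words: the cut lands after the comma.
tokens = ["w"] * 70 + [","] + ["w"] * 30
chunks = split_into_chunks(tokens)
print([len(c) for c in chunks], counter_limit(chunks))
```

Note that the first chunk ends up shorter than 75 tokens here, which is why an automatic split can waste part of the budget.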

I don't recommend relying on this behavior blindly. It is much better to use the keyword BREAK where you want your prompt to be cut explicitly. For example, you prepare your main prompt (I mean the part that specifies the composition) and put BREAK after it.

This gives a much better final result, because the chunks are processed separately but work together. It is good to keep similar things in one chunk, rather than letting them be split at random.

So, for example, your long prompt can be: a photo of beautiful blue-eyed girl in long red dress at summer sunset BREAK best_quality, 1girl, blue_eyes, sunset, long_dress, blue_dress, summer BREAK symmetric face, fantastic details, by greg rutkowski, award-winning masterpiece. This separates the "text prompt" from the "tags" and from the "enhancements". Also, this is useful not only when you've reached the 75-token limit, but even when your prompts are short. Worth experimenting!
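
To make the effect of BREAK concrete, here is a minimal sketch that cuts the example prompt at each standalone BREAK keyword. `split_on_break` is a hypothetical helper for illustration, not a WebUI function, and treating BREAK as uppercase-only is an assumption here.

```python
import re


def split_on_break(prompt):
    """Split a prompt into explicit chunks at the standalone keyword BREAK."""
    parts = re.split(r"\s*\bBREAK\b\s*", prompt)
    return [part for part in parts if part]  # drop empty pieces


prompt = (
    "a photo of beautiful blue-eyed girl in long red dress at summer sunset "
    "BREAK best_quality, 1girl, blue_eyes, sunset, long_dress, blue_dress, summer "
    "BREAK symmetric face, fantastic details, by greg rutkowski, "
    "award-winning masterpiece"
)

chunks = split_on_break(prompt)
for number, chunk in enumerate(chunks, start=1):
    print(number, chunk)
```

Each of the three pieces then gets its own 75-token budget, so related things stay together instead of being cut wherever the limit happens to fall.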

Now, where can Embedding Merge help? You might try to save tokens with it when you are already slightly over the limit. For example, you could try to use <'long'+'red'> dress, <'blue'+'eyes'>, etc. Sometimes it works, sometimes not.

An embedding alone cannot be larger than 75 tokens. It is physically impossible! But you can try to fit many things into it, for example <'sky on background, clouds, fog'+'funny bear is dancing on the road'>. This might work or might not. Generally, it is better to use BREAK for this purpose, rather than juggling embeddings like that.
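
Why <'long'+'red'> can save tokens is easier to see with toy numbers. The sketch below treats an embedding as a list of per-token vectors and adds two of them element-wise; `merge_sum` is a hypothetical name, the 3-dimensional vectors are made up, and padding the shorter operand with zero vectors is an assumption of this sketch rather than a confirmed detail of the extension.

```python
def merge_sum(a, b):
    """Element-wise sum of two embeddings (lists of per-token vectors).

    Assumption for this sketch: the shorter embedding is padded with zero
    vectors so that both cover the same number of token positions.
    """
    if not a and not b:
        return []
    dim = len(a[0]) if a else len(b[0])
    zero = [0.0] * dim
    length = max(len(a), len(b))
    padded_a = a + [zero] * (length - len(a))
    padded_b = b + [zero] * (length - len(b))
    return [
        [x + y for x, y in zip(vec_a, vec_b)]
        for vec_a, vec_b in zip(padded_a, padded_b)
    ]


# Toy one-token embeddings (real ones have hundreds of dimensions).
long_vec = [[1.0, 0.0, 2.0]]
red_vec = [[0.5, 1.0, -1.0]]

merged = merge_sum(long_vec, red_vec)
print(len(merged), merged)  # one token position instead of two
```

The merged result occupies as many positions as the longer operand, so two one-token words cost one position. Whether the model still "reads" both meanings from the summed vector is exactly the part that sometimes works and sometimes does not.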

Actually, my Embedding Merge is a big failure, because I thought it would solve a different problem: binding properties to objects (some more ideas on the subject: https://github.com/hnmr293/sd-webui-cutoff/issues/12#issuecomment-1567963993 ). Since that does not work with Embedding Merge alone, the applicable use cases for EM are greatly reduced.

There are only two unique features that Embedding Merge actually gives:

  1. Ability to merge subjects (as I showed in https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/7659)
  2. Ability to zero out a part of the prompt, or lower its weight with <'term'*X> all the way down to X=0, unlike the attention mechanism, which de facto cannot drop to zero even at (term:0).
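
The second point can be illustrated with a small sketch of what <'term'*X> amounts to numerically, assuming the multiplication is applied directly to the term's embedding vectors. `scale` is a hypothetical helper and the vectors are toy values.

```python
def scale(embedding, factor):
    """Multiply every component of every token vector by factor."""
    return [[factor * x for x in vector] for vector in embedding]


# Toy two-token embedding for some 'term'.
term = [[0.5, -1.0, 2.0], [1.0, 0.25, -0.5]]

half = scale(term, 0.5)   # like <'term'*0.5>: weakened
muted = scale(term, 0.0)  # like <'term'*0>: every component becomes zero

print(half)
print(all(x == 0 for vector in muted for x in vector))
```

With X=0 the term still occupies its token positions, but the vectors themselves carry nothing, which is a different thing from asking the attention syntax for a weight of 0.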

Generally, EM is more for research than for everyday use. Personally, I keep it as a joker up my sleeve. I try never to reach 75 tokens in my prompts, because I might need BREAK later when inpainting ( https://github.com/klimaleksus/stable-diffusion-webui-disable-inpainting-overlay#proposed-workflow ), and that works better when it is the only break in the prompt. But if I really cannot fit, I know I can try merging something to keep myself under 75.

aleksusklim commented 12 months ago

Related: https://github.com/klimaleksus/stable-diffusion-webui-embedding-merge/issues/4#issuecomment-1520708277

duskfallcrew commented 11 months ago

True! And actually lmao I use EM a TON in practice - XD i'm one of those "WORK DUMBER NOT SMARTER" model makers :D

duskfallcrew commented 6 months ago

Closing this because I think I FINALLY UNDERSTAND how to get around this :D