Haoming02 / sd-forge-couple

An Extension for Forge Webui that implements Attention Couple
GNU General Public License v3.0

Usage with masks #2

Closed ramyma closed 6 months ago

ramyma commented 6 months ago

Can we add mask support to define the attention regions with finer control?

I'm looking for a good regional prompting extension to integrate with my interface A8R8

MoreColors123 commented 6 months ago

I love your new extension. I've been testing it thoroughly and I think it makes so much more possible.

But I too would love to be able to define more than just rows (horizontal) or columns (vertical). It would be great to have sections/regions, like in the original regional prompting extension, which has always been a bit of a hassle to use though (UX-wise). Something like the recently published gligen app would be nice: https://github.com/mut-ex/gligen-gui (this works only with SD 1.5, though).

Haoming02 commented 6 months ago

I used another Extension called Latent Couple before, and the UI/UX was also a hassle to use. That's why I wrote an automated approach first.

The gligen UI does indeed look better though. I'll work on it.

Haoming02 commented 6 months ago

Hello, please give the adv-mapping branch a try!

arcusmaximus commented 6 months ago

It works, but I do of course hope that the UI is just a stand-in. Manually typing the coordinates of every region is for masoch... ComfyUI users :D

Personally, I don't mind the way Regional Prompter does it: a single textbox (= no constant switching between mouse and keyboard) with only relative region sizes, meaning that if you want to make one region larger, you only need to adjust one size number - not position and size numbers for all the regions around it.
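
The "relative sizes only" idea described above can be illustrated with a minimal sketch. This is not Regional Prompter's actual code or syntax; `ratios_to_ranges` is a hypothetical helper that only shows why changing one number is enough: every region's position is derived from the full list of ratios.

```python
def ratios_to_ranges(ratios):
    """Convert relative sizes such as [1, 2, 1] into (start, end) fractions of an axis.

    Hypothetical helper for illustration; not part of any extension's API.
    """
    total = sum(ratios)
    if total <= 0:
        raise ValueError("ratios must sum to a positive number")
    ranges = []
    start = 0.0
    for ratio in ratios:
        end = start + ratio / total
        # Round only for readability of the printed result.
        ranges.append((round(start, 4), round(end, 4)))
        start = end
    return ranges


# Making the middle region larger means changing a single number;
# the neighbouring regions shift automatically.
print(ratios_to_ranges([1, 1, 1]))
print(ratios_to_ranges([1, 2, 1]))  # [(0.0, 0.25), (0.25, 0.75), (0.75, 1.0)]
```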

(Speaking of Regional Prompter: it replaced Latent Couple a long time ago and works with attention just like your Forge Couple. Given that, I'm not sure if it's still relevant to mention the speed advantage over Latent Couple on the readme page.)

Leaving that aside - the best option would indeed be boxes that you can drag around with the mouse, with a clear link between boxes and prompts (not present in the adv-mapping branch right now). FWIW, not just the Gligen GUI does this, but the Regional Prompt Control feature of the original (non-Forge) MultiDiffusion Upscaler too.

Apart from all that, though, thank you for taking the time to build Forge Couple. Regional Prompter has so far been the single most important extension for me, but with the many changes in Forge, its stability there has only been getting worse - so it's good to have an alternative.

MoreColors123 commented 6 months ago

I needed a bit to understand it, but then it worked - with SDXL Lightning (yay!) Thanks for your effort! Have you got an idea for a UI? The gligen one is awesome for example. Just being able to draw rectangles which are labeled with the corresponding prompt line number might be one way, I think? Edit: Also taking into account a global prompt of course, if used.

Prompt: [image]

Mapping Preview: [image]

Result: [image: 00023-1912865131]

MoreColors123 commented 6 months ago

So yeah this works really great. I'm thrilled!

Prompt:
photograph of a landscape, sun, dawn
photograph of a landscape, snow white mountains
photograph of a landscape, house
photograph of a landscape, children playing

[image]

[image: 00036-1618033805]

Haoming02 commented 6 months ago

Welp, will have to think of a good way to create and position masks. (btw, I've actually never used Regional Prompter, only Latent Couple XD)

Though, it will have to wait for next next week, as I'll be on vaca next week...

Glad to see it works at least~

MoreColors123 commented 6 months ago

I've been thinking a bit about this. I think most use cases will not use more than 2 or maybe 3 rows. Maybe make a button "create row" and a keyword like SEP for line separation. So suppose I want 3 rows, I could do this:

Sky SEP Man Woman Door SEP Couch Carpet

Would create this raster: IMG_20240403_070313_edit_963902826996143.jpg

And by doubling a line, which i already successfully used in the existing extension for extending a region, you can adjust proportions. So this:

Sky Sky Sun SEP House Lake Lake

would do this: IMG_20240403_071303_edit_964237543531508.jpg

I think (hope) it would be easy to implement and definitely easy to use.
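
A rough sketch of how the SEP proposal above might be parsed, assuming one prompt per line, SEP on its own line between rows, equal-height rows, and consecutive duplicate lines merged into one wider cell. None of this is in the extension; `parse_grid` and its output format are hypothetical.

```python
from itertools import groupby


def parse_grid(lines, sep="SEP"):
    """Turn prompt lines into (prompt, x_range, y_range) tuples with 0..1 coordinates.

    Hypothetical parser for the proposed SEP syntax; rows get equal height.
    """
    # Split the lines into rows at each separator keyword.
    rows, current = [], []
    for line in lines:
        line = line.strip()
        if line == sep:
            rows.append(current)
            current = []
        elif line:
            current.append(line)
    rows.append(current)
    rows = [row for row in rows if row]  # ignore empty rows

    regions = []
    for row_index, row in enumerate(rows):
        y_range = (row_index / len(rows), (row_index + 1) / len(rows))
        column = 0
        # Consecutive duplicate prompts merge into one wider cell.
        for prompt, group in groupby(row):
            span = len(list(group))
            x_range = (column / len(row), (column + span) / len(row))
            regions.append((prompt, x_range, y_range))
            column += span
    return regions


for region in parse_grid(["Sky", "Sky", "Sun", "SEP", "House", "Lake", "Lake"]):
    print(region)
```

With the second example from the comment, "Sky" covers the left two thirds of the top row and "Lake" the right two thirds of the bottom row.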

arcusmaximus commented 6 months ago

Now try to recreate the Gligen frog picture with that setup. Not ideal to say the least.

Drawing boxes would be the perfect combination between ease of use and flexibility. Even better if we could set a background image (such as the Canny line art that we're also putting in ControlNet).

Later on, the author could maybe also look into emulating Regional Prompter's prompt-based regions, which remove the need to specify any regions at all for the above color bleeding example.

MoreColors123 commented 6 months ago

You're right, that would be hard to recreate. It all depends on how much effort the author wants to put into it. I just thought of a first step towards something actually usable, compared to the mind-bending matrix we have available now. :) Drawing resizable, draggable boxes would indeed be the best. Every box having its own ControlNet input would be even more mind-blowing.

ramyma commented 6 months ago

I believe adding boxes that can be edited directly on the canvas, or using masks on the same canvas, would be the most straightforward solution in terms of usability. However, in terms of implementation it's not an easy feat, especially with Gradio.

I'm working on my custom standalone interface (A8R8) with a forked version of the extension to add region masks on a unified canvas.

Here's my experimentation with it so far: https://www.reddit.com/r/StableDiffusion/comments/1btrf4p/part_2_experimenting_with_regional_prompting_with/

Haoming02 commented 5 months ago

Hello, please give the better-ui branch a try!

MoreColors123 commented 5 months ago

what does row index mean exactly? and: is the first row of the prompt a global effect here or is it connected to the first box? because i can't quite figure it out right now. i mean is the first drawn rectangle connected to the first line or the second line?

Haoming02 commented 5 months ago

> is the first drawn rectangle connected to the first line or the second line?

Yes. 1 entry (row) corresponds to 1 chunk (prompts separated by Separator).

> is the first row of the prompt a global effect

Depends on what you set for the entry. To have a global effect, simply set both x and y to 0.0:1.0.

> row index

That is specifically for the Click & Drag feature, that selects the row to be edited using the mouse.
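
To make the x and y ranges above concrete, here is a minimal sketch of how a "start:end" entry could map to a pixel box. The "0.0:1.0" notation comes from the reply; how the extension actually parses it is not shown in this thread, so `parse_range`, `region_to_pixels`, and `is_global` are hypothetical names used only for illustration.

```python
def parse_range(text):
    """Parse a 'start:end' string such as '0.0:1.0' into a pair of floats."""
    start_text, end_text = text.split(":")
    start, end = float(start_text), float(end_text)
    if not 0.0 <= start < end <= 1.0:
        raise ValueError(f"invalid range: {text!r}")
    return start, end


def region_to_pixels(x, y, width, height):
    """Convert normalized x and y ranges into a (left, top, right, bottom) pixel box."""
    x0, x1 = parse_range(x)
    y0, y1 = parse_range(y)
    return (round(x0 * width), round(y0 * height),
            round(x1 * width), round(y1 * height))


def is_global(x, y):
    """An entry covers the whole image when both ranges span 0.0 to 1.0."""
    return parse_range(x) == (0.0, 1.0) and parse_range(y) == (0.0, 1.0)


print(region_to_pixels("0.0:1.0", "0.0:1.0", 1024, 768))  # (0, 0, 1024, 768)
print(region_to_pixels("0.0:0.5", "0.0:1.0", 1024, 768))  # (0, 0, 512, 768)
print(is_global("0.0:1.0", "0.0:1.0"))                    # True
```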

arcusmaximus commented 5 months ago

It... works, but it's still a bit jank, isn't it 😅

It's always going to be kind of jank if you stick close to Gradio, so I made a version that does everything itself with JavaScript. Or should I say TypeScript, because weakly typed languages were a mistake.

[image]

See https://github.com/arcusmaximus/sd-forge-couple/tree/draggable-box-ui I also made a pull request.

MoreColors123 commented 5 months ago

This is great, will test it soon and give you all more feedback. Hope to see it implemented in this or a similar way.

MoreColors123 commented 5 months ago

I ran some tests and it does well what it should - if you don't overlap the rectangles too much. but it also just doesn't work on every other seed. the ui is great in your version, @arcusmaximus, but sometimes it doesn't apply the regions at all, and i think you need to click once somewhere outside the region prompt boxes to prevent it from occurring. will come back with more feedback and tests.