mcmonkeyprojects / SwarmUI

SwarmUI (formerly StableSwarmUI), A Modular Stable Diffusion Web-User-Interface, with an emphasis on making powertools easily accessible, high performance, and extensibility.
MIT License

Separate inpaint prompt from general prompt in UI and metadata #303

Open Michoko92 opened 1 week ago

Michoko92 commented 1 week ago

Feature Idea

Hi,

Thank you for this great UI. I have a suggestion that would improve the inpaint refining process. Generally, we generate an image with a global prompt, then use the EDIT feature to inpaint and fix some details. Then we send the edited image back to Init Image and run the initial prompt on it again to get a refined, coherent picture. Works wonders with Flux.

However, while editing the image, we currently lose the initial prompt (as we use simpler prompts for inpainting, like "detailed hands", "detailed feet", and so on). In the generated images, the original prompt is also lost from the metadata and replaced by the shorter inpainting prompt. I think it would be great if the inpaint prompt were separate from the regular prompt, both in the UI and in the metadata. When we exit the EDIT UI, the prompt would be set back to the original full prompt instead of keeping the inpainting prompt, which is generally useless outside the editing process. That would make going back and forth between generating images and inpainting them MUCH handier, and it would preserve metadata better, as each image would carry both the original full prompt and the prompt of the specific inpainting work done on that generation.
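A minimal sketch of the metadata side of this request, in Python purely for illustration (SwarmUI itself is not written in Python). The function name `build_metadata` and the key `inpaint_prompt` are hypothetical and are not taken from SwarmUI's actual metadata format.

```python
import json


def build_metadata(original_prompt, inpaint_prompt=None):
    """Keep the full prompt, and record the inpaint prompt alongside it if one was used."""
    metadata = {"prompt": original_prompt}
    if inpaint_prompt:
        # Hypothetical key name: the real metadata field may be called something else.
        metadata["inpaint_prompt"] = inpaint_prompt
    return metadata


# An image that was generated with a full prompt, then inpainted with a short one.
meta = build_metadata("a knight resting by a campfire, detailed armor", "detailed hands")
print(json.dumps(meta, indent=2))
```

With this shape, the short inpainting prompt never overwrites the original one; an image that was never inpainted simply has no second entry.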

Hope I made sense. 😉 Thank you!


aimerib commented 1 week ago

If you're savvy with git, you can test the changes in this PR: https://github.com/mcmonkeyprojects/SwarmUI/pull/307

Any help testing the changes, to validate that they work and don't break anything else, would be appreciated.

Michoko92 commented 1 week ago

Hi, and thank you so much for having a look at it.

I'm not very good with GitHub, but I tried to check out your PR locally with:

git fetch upstream pull/307/head && git checkout FETCH_HEAD

The .cs files seem to be updated, and the SwarmUI DLL was recompiled. However, when I go to the Edit UI, nothing seems to have changed: the edit prompt and the regular prompt are still the same. Not sure what I'm doing wrong...

aimerib commented 1 week ago

Check the Init Image params. If you expand that section, you should see an edit prompt there. You can leave your main prompt untouched: if you add anything to the edit prompt, it will be used instead.
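The fallback described here can be sketched as follows, in Python for illustration only. The function name `effective_prompt` is hypothetical, and treating a whitespace-only edit prompt as empty is an assumption, not something confirmed by the PR.

```python
def effective_prompt(main_prompt: str, edit_prompt: str = "") -> str:
    """Pick the prompt to generate with: the edit prompt if it has content, else the main prompt."""
    # Assumption: an edit prompt containing only whitespace counts as "not set".
    if edit_prompt.strip():
        return edit_prompt
    return main_prompt


main = "a knight resting by a campfire, detailed armor"
with_edit = effective_prompt(main, "detailed hands")
without_edit = effective_prompt(main)
print(with_edit)
print(without_edit)
```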

Michoko92 commented 1 week ago

Oh I see, interesting approach. Since some parameters like "Mask shrink grow" are already there and are used for editing, I suppose this extra "Edit Prompt" parameter can be there too. It may be a bit confusing for a newcomer though, as the current editing flow is already not super intuitive in my opinion. For me, it would have been simpler to keep only one prompt box: if you are in the Edit UI, the prompt is the edit prompt, and when you leave the Edit UI, the text box is set back to the regular full prompt.

For me, it feels pretty obvious that when you go to the Edit space, it is a different section of the UI (like the INPAINT section in A1111, for example). I think people would understand that there is a prompt specific to this section, and that it switches back to the full prompt when we exit the editing space.
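The single-prompt-box behavior suggested above can be sketched as a small state swap. This is a Python illustration under stated assumptions, not SwarmUI's UI code: the class `PromptBox` and its methods `enter_edit` and `exit_edit` are hypothetical names, and remembering the last edit prompt between sessions is an extra assumption.

```python
class PromptBox:
    """One text box whose content swaps between the main prompt and the edit prompt."""

    def __init__(self, text=""):
        self.text = text          # what the user currently sees and types into
        self._saved_main = None   # main prompt stashed while the Edit UI is open
        self._edit_text = ""      # last edit prompt, kept for the next edit session

    @property
    def editing(self):
        return self._saved_main is not None

    def enter_edit(self):
        """Stash the main prompt and show the edit prompt instead."""
        if self.editing:
            return
        self._saved_main = self.text
        self.text = self._edit_text

    def exit_edit(self):
        """Remember the edit prompt and restore the full main prompt."""
        if not self.editing:
            return
        self._edit_text = self.text
        self.text = self._saved_main
        self._saved_main = None


box = PromptBox("a knight resting by a campfire, detailed armor")
box.enter_edit()
box.text = "detailed hands"   # user types a short inpainting prompt
box.exit_edit()
print(box.text)               # the full prompt is back in the box
```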

Anyway, this is already a good start. Thank you again for providing this solution!

aimerib commented 1 week ago

Oh yeah, those are really good points. The main reason for going this route is that it required zero fiddling with JavaScript, whereas a more comprehensive solution would require more deliberation. I think that for a first pass this might be sufficient with appropriate documentation, at least while we figure out the sharp edges of this workflow and which changes in the UI would really benefit from being in the main prompt area. Maybe a good compromise for right now is to add that documentation to my PR, so users at least have the information available.

For example, I've found that I need to adjust the params for the edit/init image almost every time, since I'm trying to make more fine-grained fixes on images that are already pushing the boundaries of what the SDXL models can do. Having the prompt there keeps all my necessary context in one place. Where that place is doesn't matter much from a context point of view, whether it's the params area, a new tab, or something else.

If this PR gets approved and merged (I still need to work on the feedback provided) you could open another feature request to improve the image editor workflow, as that would be slightly different (and more involved) than just making Swarm accept a new prompt for editing images.