Acly / krita-ai-diffusion

Streamlined interface for generating images with AI in Krita. Inpaint and outpaint with optional text prompt, no tweaking required.
GNU General Public License v3.0
5.55k stars 245 forks source link

Fix issue 838: Make the Strength spinbox snap to full steps, and show steps. #858

Closed FeepingCreature closed 3 days ago

FeepingCreature commented 1 week ago

Fixes https://github.com/Acly/krita-ai-diffusion/issues/838

You can still enter percent, but scroll/inc/dec actions will snap to the next full stepsize. That is, if a style with 20 steps is selected, it will snap to 5, 10, 15... etc. Changing the style does not change the percent (but does change the suffix ofc). Only a deliberate input makes any change to the actual value. Screenshot_20240622_005030

Significant credit to Claude 3.5 Sonnet to pointing me in the right direction. If I just remove a few files (websocket and two more), I can actually feed it the whole remaining Krita AI Diffusion source, basically fill up half its context window, and still only pay 30c per query.

FeepingCreature commented 1 week ago

Hm, how do I ping an issue? Does it have to be in the commit description? edit: got it

Acly commented 1 week ago

While I like the idea of snap to full steps...

... I'm afraid it gets more complicated: there is a "minimum step count" (typically 4, but custom sampler presets can overwrite it). If strength% is such that resulting step count would be below the minimum, total step count is rescaled to allow matching the strength% while executing the minimum steps.

Example: total steps = 8, min steps = 4

https://github.com/Acly/krita-ai-diffusion/blob/main/ai_diffusion/workflow.py#L52

FeepingCreature commented 1 week ago

Oooh. Wow, yeah that is evil. So if we go below 4, the total steps invisibly scale up to keep the minimum step number at 4? Does this also apply if we've configured less steps than 4 total?

Acly commented 1 week ago

With total steps less than minimum steps, the number of steps executed will always equal total steps.

FeepingCreature commented 1 week ago

That seems doable. Gonna take a look in the afternoon.

arcusmaximus commented 1 week ago

For me personally, "lying" to the user (e.g. showing their selected 60% in the history even though the plugin used 62% under the hood) would honestly be fine. Snapping wise, I can imagine users would want to keep the ability to select a nice round number (purely psychologically). And showing the step count is of course useful to explain why the rounding happens, but at the same time, you could say it clutters the UI with technical details which the user doesn't really need to know about.

In any case, thanks for looking into this even though it's something so minor :)

FeepingCreature commented 1 week ago

I mean even just for myself, I like my inputs to have a meaningful effect. Why offer a UI if the elements you control only have a tangential relationship to the eventual effect? This is less "I should fix this bug" and more "oh so THAT'S why sometimes weird stuff happens that makes no sense!" A UI should not gaslight its user, but that's only a side effect of the idea that the model of the UI should represent the reality of the domain that it operates in. Percentages are a lie, albeit a useful lie; steps are what's real.

arcusmaximus commented 1 week ago

You have a point, of course, but it's also useful to keep the target audience in mind - and this plugin is clearly more focused on artists who just want to make pictures, rather than technical users who want to know how everything works.

Whereas other UIs show the step count very prominently, this one hides it away in a collapsed panel somewhere in the settings. Same thing for the inpaint resolution, or the choice of whether to use an inpaint model: other UIs make you configure those manually, but this one does it all behind the scenes without telling you anything about it.

If the overarching trend is to expose less technical details than other UIs, why expose more details here? I've used A1111, Comfy, Swarm and Invoke, and none of those told me my denoise strength is getting rounded - and it was fine, because the difference was so small that it didn't matter.

FeepingCreature commented 1 week ago

All abstractions leak. I happen to think this one leaks quite badly, as per the original bug report. It's still very much useful to have percentage, of course, but as I straight up didn't realize this corresponded to steps (obvious in hindsight), this actually clears up a lot of things that confused me at the time: like how sometimes I could change the strength in live mode and not trigger any recomputation. (Because I didn't cross a step limit...) So I do like the simplification, but I don't think it's any less simplified if it actually informs me what it's doing behind the scenes, especially since the percentage input, honestly, writes checks that the backend cannot cash.

arcusmaximus commented 1 week ago

sometimes I could change the strength in live mode and not trigger any recomputation

So have it trigger on every strength change, easy.

I don't think it's any less simplified if it actually informs me what it's doing behind the scenes

Having one number is quicker to read for users than three (as above, no need to clutter the UI with information that not even Comfy shows you).

Anyway, we've both made our points, so let's have Acly decide which way to go.

Acly commented 1 week ago

I view strength% as a way to indicate "this is how much I would like the image to change". I think it's okay for the implementation to then figure out the actual step count that best matches the intent. Knowing the actual step count and denoise% might be interesting to some, but I don't see how it changes what you might do with it.

Strength% is a high-level abstraction, it is not meant to accuractly reflect how much noise is added, there is no "lying".

I'd also like to avoid having two numbers in the UI which report the same thing. Steps are less intuitive and require additional explanation for those not familiar with diffusion process.

The original bug report is easily fixed in that spirit, by showing in the history the strength% that was used to generate the result. Showing generation parameters is what the history is meant to do, that it currently exposes an implementation detail is by accident.

like how sometimes I could change the strength in live mode and not trigger any recomputation

This is the point that bothers me, and why I support snapping (or at least think it's worth a try). It has confused users before, and even though I'm aware of what's happening I often wish there would be a way to go the next meaningful strength.

So have it trigger on every strength change, easy.

If you don't cross a step threshold the image will be the same. Though if there was a way to smoothly add noise independent of the step count, that might be the best solution,

FeepingCreature commented 1 week ago

Alright, sounds sensible. Is it okay if I keep the layer display behind an advanced setting, like negative prompt? I'm already getting used to it.

FeepingCreature commented 1 week ago

Alright, scrolling now works correctly below min_steps. I rewrote the code; it's a bit longer but hopefully easier to follow. It now precomputes a list of spinbox values with unique step state and just continues to the next one on scroll input. This also fixes some bugs in the computation that were previously in there; the code is now more "logical" rather than a mess of math. I've left the labels in pending a reply regarding hidden setting or if I should take it out entirely.

Acly commented 1 week ago

I don't mind the setting as such, but it's currently quite a lot of code and connections for what it does. It would be nice if it could compute the next (or previous) step locally from current data directly in stepBy. That should work for snapping. But you'd need all the events for the step display - including some that are still missing I think, like changing sampler preset or total steps in style settings. Feels like it might not be worth it.

Let's remember the goal of snapping: to provide the next value with "noticeable change". This doesn't necessarily mean the next technically smallest value. There is a reason it was fixed to 5 or 10, not 1 - going from 28% to 26% isn't really useful, even if it does produce a tiny bit different image. That's why I think it can be simplified a lot by using flat 5 if below min_steps, and only computing next threshold on the fly otherwise.

I do like the way it's computed now because it allows reusing the same function that will then be used for the workflow, instead of having to invert the math. You can just start at current value and increment (or decrement) in a loop until you hit a threshold though, no need to precompute the array.

FeepingCreature commented 1 week ago

I did notice the changing settings issue. I honestly don't even know how to hook that up.

If there's a helper function that calculates the current steps, max_steps, that should work for both stepping and displaying "for free". I'll try that this afternoon.

FeepingCreature commented 6 days ago

Done, stepBy now steps until it hits a change in ... steps. Also the weight display is hidden behind an interface setting, Show Steps. Lmk what else is needed.

edit: Oh right, flat 5 step at minimum. Hm, honestly I'm not sure I agree with that... when the display is active, it would be weird for mouse wheel to go from (14/30) to (12/30). With 30 steps already the "reach the next step" stepsize is below 5, and it would be weird to have it go up to 5 below min_steps. Maybe rather have the whole feature only active if step display is on? Since that signals "I am interested in steps rather than percent."

edit: Also you're right, it looks a lot better by reusing the computation. Hm, should I use _apply_strength outright? The issue I have is that, by using round, the cutoff for say 15/30 is before 50%, and having it at 50% looks a lot neater. That's why I have that pretty flag now.

FeepingCreature commented 6 days ago

I just noticed this is only for spin box and I always use the slider :)

Can snap be done for slider too? Maybe by applying the logic in valueChanged event, I think it might just work.

Absolutely, but I didn't want to overload the PR. If it's fine, I'll have a look later.

FeepingCreature commented 5 days ago

Aigh, slider should snap now.

Originally I tried to override onMouseMove/onMouseClick, but that was too slow to be practical with pyqt because it triggers per mouse event. That's why it's attached to the value change instead, which I can do with QSlider because as opposed to the strength field there's no need to worry about manual inputs.

That's also why I pulled StrengthSnapping out into its own class: originally that was passed to StrengthSlider as well. I can revert that now I guess..? Honestly kinda like having it separate.

FeepingCreature commented 5 days ago

I've hit an issue. Sometimes, I can get a signal loop because the value change propagates to the widgets in every workspace, and depending on liveness they can have different snap settings, which can cycle.

Hm. I guess this is a forcing case, actually, for just doing what you said and just having the model, not the is_live flag, determine the is_live state in every widget. Ie. the Generation widget is also treated as is_live if the Animation widget is active and in fast mode.

Acly commented 5 days ago

I think a safer way to break the signal loop is to do it explicitly somewhere. Instead of connecting model.strength_changed to slider.setValue directly, use a function, and avoid snapping model updates with SignalBlocker.

FeepingCreature commented 5 days ago

Yeah but I'm actually just confused again, like I'm not understanding what's happening. Looking at the code more, I don't actually think it's supposed to propagate value changes between the widgets? I have no idea where the second widget object that's seeing value changes there is coming from.

Acly commented 5 days ago

You have multiple widgets linked to the same model with bi-directional links. If the widget changes so does the model, if the model changes so do all the widgets. Usually this is broken by checking if the value actually changed, and not emitting a signal if it didn't. If you have inconsistent snapping that would cause a loop.

FeepingCreature commented 5 days ago

OH RIGHT, STRENGTH IS IN THE MODEL. Thanks.

FeepingCreature commented 5 days ago

I think I got it. It's even simpler now.

StrengthSnapping just stores the model and checks for bounds and is_live on the fly. That way every widget sees the same bounds always. Loop seems to be gone, no more weird behaviors I can see offhand.

Signal breaking still a good idea, probably, but I'd say it's now out of scope for this pr as it's no longer necessary. (I don't actually speak pyqt lol.) And honestly since it's working I'm reticient to touch the signals again.

FeepingCreature commented 5 days ago

Okay, I've been trying to apply your review to use steps/max_steps to make the center search easier.

Problem: under that definition of weight, apply_strength is unstable.

Example:

Say we have settings: min steps 3, steps 8

We start with 31%: a strength of 0.31. In apply_strength with round(5.52) this comes out to start at step 6, which violates min_steps.

So we recalc steps as 3/0.31 = 9.67, floor to 9, start at 6.

But now with 3/9, we reconstruct a strength of 0.33, which is more than we started at and no longer even violates min_steps!

That means if I try using apply_strength to get the middle of the current range, I end up with a pair that lies outside of the current range.

Acly commented 5 days ago

If it's just the floor causing issues, it could be turned into a round (will more closely approximate the desired strength anyway).

FeepingCreature commented 5 days ago

It's weirder.

I've had Claude 3.5 Sonnet whip me up a Python graph tester.

https://gist.github.com/FeepingCreature/1a1ad917a717a891ac2d280947003882

As far as I can tell from poking around, this instability happens only with steps 8, min_steps 3.

edit: Hang on that's a completely different instability. edit 2: No, it is our instability. 0.31.

edit: With round, round there's an instability in steps 8, min 4 at 0.43, which ends up at 4/9, then back at 4/8.

edit: I've had Sonnet blow this up into a proper tool for me:

https://gist.github.com/FeepingCreature/97217536585d88d371cd254e3980b551

Now there's sliders for steps, min_steps, so you can easily explore the space for different implementations of apply_strength.

edit: math.ceil does not seem to have any instabilities. Sadly, this tool does not account for the rounding of the QSpinBox. If I add that back in, I get instabilities even with math.ceil.

FeepingCreature commented 5 days ago

https://gist.github.com/FeepingCreature/9923065a35129624b27c8b033dca0575 I asked Sonnet for a separate tool to just log all instabilities. Apparently (experimentally), the set of functions with the lowest errors is round twice in apply_strength, then round again in setValue. However, it still has configurations where it isn't stable. :/

Acly commented 5 days ago

I don't think investing work to make it stable is a good idea either, it makes tweaking apply_strength too complicated.

I still think snap should just not do any snapping if steps==min_steps, I don't see any value in it

FeepingCreature commented 5 days ago

Ah, if all the instability is at or below min_steps, that would fix it, right? Yeah then it makes sense to use a flat step size below that. And it'll let me use the simpler computation for the middle instead of stepping.

Okay, I've run it through the tester and it looks like so long as steps stay limited to 100, there are no stability errors if we ignore the region below min_steps.

Gonna build in flat 5-snapping in the afternoon.

Should I snap to flat increments of 5, offset of 5 from the last value, or have no snapping at all? I guess the consistent thing would be effectively just ... wait, yeah if I turn it off entirely I get natural 5-stepping in the ... hm. I guess the closest thing is just treat stepBy as offset * 5, so that single-stepping above min_steps still works by setting a step size of 1, but below min_steps it behaves like the widget has a step size of 5.

Okay, current plan:

FeepingCreature commented 4 days ago

ALRIGHT! I feel good about this version!

Seems to match the plan as described.

edit: Did a bit more cleanup.