wcde / sd-webui-refiner

Webui Extension for integration refiner in generation process
161 stars 12 forks source link

distortions with refiner steps > 20% of total steps #18

Open lisanet opened 11 months ago

lisanet commented 11 months ago

If using the refiner with steps > 20% of the total steps, the resulting image gets distored. The distortions increase even more if the steps are increased.

This may be related to the refiner being trained with timestamps 0 - 199 (I know that the extension uses steps, not timestamps)

Using the same seed as in #14 with total steps 30 and refiner steps 6, the image looks fine. With refiner steps 7 or above, the image gets distorted. You can see this very clearly at her shoulder. Here are the two images.

00015-467582933

and with distrotions

00016-467582933

IMO the step slider should be omitted and the if condition in denoiser_callback() should be changed to alwas use 80% of total steps

if params.sampling_step > params.total_sampling_steps * 0.8 - 2):

This would also clear some misunderstandings in some youtube videos about the use of the steps slider as mentioned in #13 As these distortions don't exist in the extension prior to the fix of #14, many early users of the extension may complain, why suddenly the extension produces distorted images although they didn't change the steps setting.

I'll write a patch for this proposal.

wcde commented 11 months ago

Just 0.8 will break Restart sampler. In coming days, I will rework slider to make it impossible to break image with an incorrect value.

lisanet commented 11 months ago

well, I've already tested the patch and it works fine, no errors at all.

wcde commented 11 months ago

Restart sampler makes restarts in different places depending on total number of steps, with fixed place for turning on Refiner we have inconsistent result. But I did not do many experiments, maybe it not bad as I think.

dhwz commented 11 months ago

I don't like a fixed 80% value there are reasons you've to modify it slightly (thats why there is a changable value even in ComfyUI, even if it's default set to 0.8).

lisanet commented 11 months ago

IMO distortions which might be introduced by a wrong setting, here to many steps, should be avoided as much as possible.

dhwz commented 11 months ago

Yeah as said but sometimes there are reasons you want slightly higher or even much lower refining steps, e.g. if I do 100 steps I don't want 20 refining steps, so I agree it should be somehow limited but not fixed to 0.8. maybe a slider which only allows 0.7 to 0.9 or something like that.

lisanet commented 11 months ago

A slider with some range is an alternative to a fixed value but values above 0.8 might introduce distortions.

And while one can say, that it's up to the user, to decrease the value, I'm quite sure that most users won't know, that this refiner value might be the reason for the distortions.

Edit: as you mentioned, even now, many youtube videos misunderstand the relation between total steps and refiner steps.

dhwz commented 11 months ago

I still think better than fixing it we could still add some note to the slider telling that you may get distortions if you change it too much. But still gives some advanced users the possibility to change it (as said like in ComfyUI, I don't want users switching to to ComfyUI just because we're limiting them too much in their freedom).

lisanet commented 11 months ago

hhmm, if the refiner was trained only for denoising timestamps 0 - 199, why should an advanced user go higher?

IMO advanced users will know this limitation the developers from stability ai have used. OTOH new users to this topic will try all possible settings the ui allows, which of course is legit use. But does it make sense, to allow those critical values? What about users who switch to some other image ai because the follow some advice on youtube about 'higher is better'?

Anyway, I know how to use it, so let's see what time will tell.

dhwz commented 11 months ago

@lisanet does it really matter what stability ai wrote?

See results on your own example with 20 steps, for me the one with 30% refiner steps is even the best result (no distortions?)

20 steps, 0.9 (2 refiner steps) 20steps(2_refiner_steps_0 9) 20 steps, 0.8 (4 refiner steps) 20steps(4_refiner_steps_0 8) 20 steps, 0.7 (6 refiner steps) 20steps(6_refiner_steps_0 7) 20 steps, 0.6 (8 refiner steps) 20steps(8_refiner_steps_0 6)

here it's starting to get a bit weired but still tolerable result 20 steps, 0.5 (10 refiner steps) 20steps(10_refiner_steps_0 5)

wcde commented 11 months ago

@dhwz StabilityAI wrote about timesteps, not steps (it's around 7 steps with 20 in total). But I'm also not very happy with fixed 0.8, even with timesteps, because of Restart.

lisanet commented 11 months ago

yes, I think it matters what developers write about their software, how it works and what parameter ranges it was intentionally designed and trained with.

And about distortions. It is as in most situations. It depends on what you define as distortion. IMO at 0.7 her neck and collarbone starts looking a bit unnatural. Her cleavage too.

But as I already said, Iet's see what users out there will say. I bet that most of them will never come to the conclusion, that distortions may be introduced by this setting but think they depend on some bad training data of the base model.

About timestamps:

Yes I know and already mentioned it, that this extension uses steps, not timestamps. My proposal doesn't change that. And if the current code doesn't have any issues with Restart, than my patch also does have no one.

Why? Although step * 0.8 might be a fractional value, the comparison in the if condition is always based on steps as integer values. So, if there's no Restart issue yet, there will be no issue with the patch.