LEv145 / --sd-webui-ar-plus

Select img aspect ratio from presets in sd-webui

The Calculated Dimensions are not ideal #21

Open altoiddealer opened 5 months ago

altoiddealer commented 5 months ago

Hello, thank you for this great extension.

I should probably revise this to say "the calculated dimension" in the singular, because it seems to only calculate one dimension based on the other.

Screenshot 2024-03-10 101708

Ideally, though, both dimensions would be calculated together, using the same method that Stable Swarm uses - see ResToModelFit()

The values that your extension sets right now are not the sizes that the base models and most finetunes are trained on.

In a nutshell:

-The model's 'base res' (e.g. 1024) is considered the median value, and the height/width are offset proportionately.

-The values are rounded to the nearest multiple of 64.

I spent a bit of time implementing the identical logic in my discord bot (see the update_size_option() section) so that it calculates all the different sizes for my /image command.

Screenshot 2024-03-10 100718

They are calculated dynamically using the same calculation. However, all the models in Stable Swarm correctly include their base resolution in the metadata, so their function is always predictable and successful. The majority of models on Civitai do not include the intended base resolution in the metadata.

My script takes the "base resolution" to be the average of the current values, (Height + Width) / 2 - whatever these user settings may be.

This is what I recommend your extension use as the input value: the sum of the current Width and Height values, divided by 2.

I recommend that you consider implementing this approach, if possible!

LEv145 commented 5 months ago

I seem to understand what you want

You want to calculate the dimension values dynamically based on the model's "base resolution", not just the ratios. This is useful, but I do not know the best way to build it. The purpose of this extension is to conveniently calculate the second dimension of the size without a calculator, not to adjust it to the dimensions that suit the model.

You are welcome to help implement this feature, but we need to decide exactly how to add it to the UI so that it does not interfere with the existing behavior.

altoiddealer commented 5 months ago

What I was suggesting is that pressing one of the Ratio buttons would do the following:

-Whatever the current Height and Width values are in the window would be averaged together: (W+H)/2

-The new Width and Height would be set based on the calculation I shared.

For instance, if the current width is 1024 and height is 1024, then pressing 4:3 would result in: 1152 x 896

Then pressing 1:1 would give you 1024 x 1024. And so on (favorable values).

Anyone using favorable dimensions initially will get favorable dimensions after pressing the button.

Currently, anyone pressing the buttons will, more often than not, get unfavorable dimensions - all they will get is a correct aspect ratio and nothing more.
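
For contrast, here is a minimal sketch of a ratio-only calculation of the kind being criticized, where one dimension is kept and the other is derived from the aspect ratio. The function name `second_dimension_only`, the rule for which side is kept, and the `step=4` rounding are assumptions made for illustration; this is not the extension's actual code.

```python
def second_dimension_only(w, h, ratio, step=4):
    """Hypothetical ratio-only resize: keep one side, derive the other.

    ratio is width / height. For landscape ratios the height is kept,
    for portrait ratios the width is kept (an assumption of this sketch).
    """
    if ratio >= 1:
        w = h * ratio
    else:
        h = w / ratio
    # Snap both sides to the nearest multiple of `step`.
    return tuple(int(round(v / step) * step) for v in (w, h))


w, h = second_dimension_only(1024, 1024, 4 / 3)
print(w, h)            # 1364 1024: the aspect ratio is right...
print(w % 64, h % 64)  # 20 0: ...but the width is not a multiple of 64
```

Under these assumptions, 1024 x 1024 becomes 1364 x 1024 for 4:3, rather than the 1152 x 896 given in the example above.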

LEv145 commented 5 months ago

Such a feature is useful

But still it is a new feature that we need to figure out how to implement without replacing the old ones. It's probably much easier to just have a fixed list of sizes in resolutions.txt and use it

You can share your own resolutions.txt here: https://github.com/LEv145/--sd-webui-ar-plus/discussions

altoiddealer commented 5 months ago

I disagree - the current Ratios feature should be replaced. Users clicking the current Ratio buttons blissfully believe that the returned resolutions are ideal values. I may do the community (and myself) a service by forking, fixing, and mentioning this issue to A1111/Forge, because image resolution is quite fundamental to image generation.

LEv145 commented 5 months ago

I don't think users expect this behavior from the extension, or that they will be happy with such a fundamental change. The extension simply wasn't created for this.

LEv145 commented 5 months ago

This change is useful, but it's worth adding it to the current functionality, not replacing it

altoiddealer commented 5 months ago

Heya - so, it does not work quite as elegantly as I had assumed.

What I had not considered was how the "mean" value of the width/height gets skewed due to the rounding to multiples of 64.

-This method does successfully apply the ideal resolution on the first click - assuming it is starting from 1:1.

-It will go back to the original resolution when pressing a "1:1" button.

-However, since the values are rounded to the nearest multiple of 64, the mean value changes, so it will not work perfectly when changing from resolutions like 2:3 to 3:4.

-Additionally, this will not work for values like "1.85:1" unless that can somehow be converted to a standard value...

Here is the py code if you want to consider adding this as either a primary or secondary method. Note that sqrt needs to be imported from math, and also that this method needs the 'label' (e.g. 2/3) rather than the 'ar' (e.g. 1.333)

from math import gcd, sqrt

def round_to_precision(val, prec):
    # Snap val to the nearest multiple of prec
    return round(val / prec) * prec

def res_to_model_fit(w, h, mp_target):
    # Scale w and h so that w * h is close to mp_target (a pixel count),
    # then snap both to multiples of 64
    mp = w * h
    scale = sqrt(mp_target / mp)
    w = int(round_to_precision(w * scale, 64))
    h = int(round_to_precision(h * scale, 64))
    return w, h

def dims_from_ar(avg, ar):
    # avg: mean of the current width and height
    # ar: ratio label with integer parts, such as '4:3' or '4/3'
    mp_target = avg*avg
    doubleavg = avg*2

    ar_parts = tuple(map(int, ar.replace(':', '/').split('/')))
    ar_sum = ar_parts[0]+ar_parts[1]
    # calculate width and height by factoring average with aspect ratio
    w = round((ar_parts[0]/ar_sum)*doubleavg)
    h = round((ar_parts[1]/ar_sum)*doubleavg)
    # Round to correct megapixel precision
    w, h = res_to_model_fit(w, h, mp_target)
    return w, h

# ToolButton is the button class the extension already uses
class ARButton_v2(ToolButton):
    def __init__(self, ar='1/1', **kwargs):
        super().__init__(**kwargs)

        self.ar = ar

    def apply(self, w, h):
        # Average of the current width and height, rounded up
        avg = (w + h) // 2
        if (w + h) % 2 != 0:
            avg += 1
        w, h = dims_from_ar(avg, self.ar)
        return list(map(round, [w, h]))

    def reset(self, w, h):
        # NOTE: self.res is not set anywhere in this class as posted,
        # so it has to be defined before reset() can be used
        return [self.res, self.res]

###

                # Aspect Ratio buttons
                ar_btns = [
                    ARButton_v2(ar=label, value=label)
                    for ar, label in zip(
                        self.aspect_ratios,
                        self.aspect_ratio_labels,
                    )
                ]
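
The drift described in the bullet points above can be reproduced with a condensed version of the posted functions. This is only a sketch: `fit_to_average` is a hypothetical name, and it takes the two ratio parts as separate integers instead of a label string.

```python
from math import sqrt


def fit_to_average(w, h, ar_w, ar_h, precision=64):
    """Re-fit (w, h) to the ratio ar_w:ar_h around their average.

    Condensed from the functions posted above; the name and signature
    are hypothetical and exist only for this demonstration.
    """
    avg = (w + h + 1) // 2  # average, rounded up as in ARButton_v2.apply
    target_pixels = avg * avg
    # Split twice the average between width and height by the ratio.
    new_w = round(ar_w / (ar_w + ar_h) * 2 * avg)
    new_h = round(ar_h / (ar_w + ar_h) * 2 * avg)
    # Rescale to the target pixel count, then snap to the precision.
    scale = sqrt(target_pixels / (new_w * new_h))
    new_w = int(round(new_w * scale / precision) * precision)
    new_h = int(round(new_h * scale / precision) * precision)
    return new_w, new_h


square = (1024, 1024)
portrait = fit_to_average(*square, 2, 3)
print(portrait, sum(portrait) / 2)      # the average is no longer 1024
print(fit_to_average(*portrait, 3, 4))  # 3:4 reached via 2:3
print(fit_to_average(*square, 3, 4))    # 3:4 straight from 1:1
```

Starting from 1024 x 1024, pressing 2:3 gives 832 x 1280, whose average is 1056. Pressing 3:4 from there gives 896 x 1216, while 3:4 straight from 1024 x 1024 gives 896 x 1152.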

Edit

Actually, the ratios like 1.85:1 will work with these changes:

from fractions import Fraction
from math import gcd, sqrt

and updating the one line with this one...

ar_parts = tuple(map(Fraction, ar.split(':')))
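
One thing to watch with this line: it splits only on ':', so a label written as '2/3' would no longer be split into two parts the way the earlier `replace(':', '/')` version did. Below is a minimal sketch of a parser that accepts both separators as well as decimal parts. The helper name `parse_ratio_label` is hypothetical and not part of the extension.

```python
from fractions import Fraction


def parse_ratio_label(label):
    """Split a label such as '4:3', '2/3' or '1.85:1' into two Fractions."""
    # Accept either separator, then let Fraction parse decimals exactly.
    parts = label.replace('/', ':').split(':')
    if len(parts) != 2:
        raise ValueError(f"expected 'W:H' or 'W/H', got {label!r}")
    return tuple(Fraction(p) for p in parts)


w_part, h_part = parse_ratio_label('1.85:1')
print(w_part, h_part)                     # 37/20 1
print(float(w_part / (w_part + h_part)))  # width's share of W + H
```

Because Fraction supports the same arithmetic and rounding as int, the two parts can be used in place of the integer `ar_parts` in `dims_from_ar`.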

xhoxye commented 5 months ago

Maybe that's what you want? QQ screenshot 20240314214346

altoiddealer commented 5 months ago

More like this actually

altoiddealer/--sd-webui-ar-plusplus

Screenshot 2024-03-13 111328

Hindsight is 20/20: should’ve called it “plus-minus”

Next commit will be replacing the "Reverse Logic" button with a "Lock Average" button

xhoxye commented 5 months ago

Your style can display shorter text and fit the UI better, but be aware that your formula (W+H) /2 may be problematic, especially since SDXL uses Positional Encoding, which has a specified size.

https://github.com/Stability-AI/generative-models/blob/477d8b9a7730d9b2e92b326a770c0420d00308c9/scripts/demo/sampling.py#L7

https://github.com/lllyasviel/Fooocus/discussions/117#discussioncomment-7364981

Take a look at the Aspect Ratio Bucket https://civitai.com/articles/2056#heading-425

https://github.com/NovelAI/novelai-aspect-ratio-bucketing https://gigazine.net/gsc_news/en/20221108-novelai-aspect-ratio-bucketing/

altoiddealer commented 5 months ago

The values of this fork (ar-plus) are rounded to the nearest increment of 4px by default, which is nowhere near those bucket resolutions.

My fork (ar-plusplus) takes any value and provides the closest possible resolution to the target aspect ratio, while maintaining 64px precision. The (W+H)/2 simply assumes what that input value should be. I sought advice on Stability's Discord, and a few very knowledgeable folks showed me the way and provided the calculations, which yield exactly the same values that their public img gen bots use.

altoiddealer commented 5 months ago

Reporting that I finished up my fork of this project not too long ago.

https://github.com/altoiddealer/--sd-webui-ar-plusplus

LEv145 commented 5 months ago

As I see it, you kept the previous functionality of --sd-webui-ar-plus. Perhaps some of your changes can be implemented in --sd-webui-ar-plus itself.

altoiddealer commented 5 months ago

I think this was the only real bug I fixed in the process…

I think a few of the named .js titles were mislabeled, too. If I remember any other bugs, I’ll let you know.