pkuliyi2015 / multidiffusion-upscaler-for-automatic1111

Tiled Diffusion and VAE optimize, licensed under CC BY-NC-SA 4.0

This extension breaks seamless tiling #197

Open GitHub1712 opened 1 year ago

GitHub1712 commented 1 year ago

As you know, Stable Diffusion webui can create seamless tiles out of the box when the Tiling option is checked:

Unbenannt-1

Using multidiffusion-upscaler partially breaks this feature. Here is a screenshot of borders that do not match perfectly with both tiling and multidiffusion-upscaler enabled:

222

Is there any way to fix this? It seems the decoding ignores the tiling because it does not overlap the opposite edges.

pkuliyi2015 commented 1 year ago

Hello, I seldom use the tiling function. How can I get your example image? Would you mind providing me the corresponding params?

GitHub1712 commented 1 year ago

You can create such an image in the txt2img tab with any default SD model, using the prompt "brick pattern" with tiling enabled. You can easily check the tiling with, for example, https://www.pycheung.com/checker/. You should see perfect tiling without seams. If you create bigger images with multidiffusion tiling active, you will see visible, imperfect seams like in the picture I posted, which is just a small part of 2 repeated images to show the broken seam. This happens on all images generated with Tiled VAE and tiling active. The main diffusion respects the tiling, so it is not completely broken. It is only the final VAE step that half breaks it: the seams almost repeat, but they are far from as perfect as without Tiled VAE.
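For anyone who prefers a number over an eyeball check, here is a minimal sketch of a seam test. The function name `seam_score` and the idea of comparing the wrap-around edge against the average neighbor difference are my own assumptions, not something the checker site or the extension provides.

```python
import numpy as np

def seam_score(img: np.ndarray, axis: int = 1) -> float:
    """Hypothetical helper: how rough is the wrap-around seam?

    axis=1 checks the left/right seam, axis=0 the top/bottom seam.
    The result is the mean difference across the seam divided by the
    mean difference between neighboring rows/columns inside the image.
    A value close to 1 suggests the seam is no rougher than the interior.
    """
    a = img.astype(np.float64)
    # Typical change between adjacent rows/columns inside the image.
    interior = np.abs(np.diff(a, axis=axis)).mean()
    # Change between the last and the first row/column, i.e. across the seam.
    first = np.take(a, 0, axis=axis)
    last = np.take(a, -1, axis=axis)
    seam = np.abs(first - last).mean()
    return seam / max(interior, 1e-12)
```

The image can come from any loader that yields an H x W or H x W x C array. A score that jumps when Tiled VAE is enabled would point at the seam problem described here.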

AugmentedRealityCat commented 1 year ago

I confirm that I have observed that problem and that I haven't found any solution to make seamless tiling work as it should.

The Asymmetric Tiling extension, which is very similar to the standard tiling feature but with the option to tile only on the X or the Y axis instead of both, suffers from the exact same problem.

Like @GitHub1712, I have observed that the main diffusion function still seems to work well with tiling and asymmetric tiling, and that it's the last steps with the VAE that somehow mess things up and prevent the tiling from being seamless.

To make it work properly, the Tiled VAE process would have to actually repeat some of the tiles it is using, so that the last column on the right connects with copies of the tiles from the first column on the left, and vice versa, and likewise for the top and bottom rows if you are tiling on both the X and Y axes.

If you think this might be useful, I can make a proof of concept by manually extending an existing image in this way before sending it to Tiled VAE, and then checking whether it tiles properly once I cut those extra rows off after the process.

GitHub1712 commented 1 year ago

That would be great, any solution to fix tiling would help.

AugmentedRealityCat commented 1 year ago

> That would be great, any solution to fix tiling would help.

Ask and you shall receive! So the little trick I had in mind works perfectly, but it requires extra canvas space, which means more VRAM. I hope someone with proper coding skills will be able to replicate what I'm doing here "by hand".

First we will generate a lower resolution image in 1024x1024 with the asymmetric tiling extension set to tile on both the X and the Y axis. I'm using the popular Photon v1 model for this demonstration, which is based on Stable Diffusion 1.5. This technique should work with any model once it has been turned into code, but for now it works better with the Tile ControlNet model, and that one only exists for 1.5 checkpoints.

So here is the prompt, followed by the resulting image in 1024x1024.

Steps: 68, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 85340, Size: 1024x1024, Model hash: ec41bd2a82, Model: photon_v1, Tile X: True, Tile Y: True, Start Tiling From Step: 0, Stop Tiling After Step: -1, SGM noise multiplier: True, Version: v1.6.0-127-g102b6617

1024_tiling_base

After generating the 1024x1024 image above, which is tiling perfectly as seen below... Screenshot 2023-09-28 at 05-09-04 Stable Diffusion

... we will move to Photoshop to create a 128-pixel-wide seamless border all around our picture, which will become a 1280x1280 picture. In my case, I copy-pasted the 1024x1024 image in Photoshop 9 times: one in the center, and the other 8 all around, which will only be partly visible. This gives you something like this with just the first central image: breadx9_1280_center_tile_only

And something like this when you add 8 images all around, for a total of 9 images covering the whole 1280x1280 canvas. breadx9_1280

What we're doing here is pre-tiling the border of our image so that Tiled VAE can see what lies beyond the edge of the picture and keep its seamless tiling properties after upscaling.
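The Photoshop step above can probably be done in a few lines of code. This is only a sketch, assuming the image is already loaded as an H x W x C NumPy array. The function name `pad_wrap` is made up for illustration. `np.pad` with `mode="wrap"` fills the border with pixels from the opposite edge, which is the same result as surrounding the tile with 8 copies and cropping.

```python
import numpy as np

def pad_wrap(img: np.ndarray, border: int = 128) -> np.ndarray:
    """Hypothetical helper: surround an H x W x C tile with a wrapped border.

    The left border is filled from the right edge of the tile, the top
    border from the bottom edge, and so on, so a 1024x1024 tile with a
    128-pixel border becomes a 1280x1280 canvas.
    """
    return np.pad(
        img,
        ((border, border), (border, border), (0, 0)),  # no padding on channels
        mode="wrap",
    )
```

The padded array can then be saved and used as the img2img source in place of the hand-built canvas.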

The next step is to reuse the prompt and parameters that generated your 1024x1024 image and to bring the 1280x1280 image we just built (from 9 identical and seamlessly tiling 1024x1024 images) into Stable Diffusion, but in img2img mode this time. After selecting this 1280x1280 image as the source, set the output resolution to twice the source's: here that would be 2 x (1280x1280), so 2560x2560.

To guide the upscaling process we will also apply ControlNet here, in Tile mode.

Finally, since this will be too demanding even for a 4090 if we run it "raw" like this, we will fire up our beloved Tiled VAE extension. In addition to the checkbox to turn it on, make sure you set the tile sizes to values that work for your system. I might be overly cautious, but I use 1024 for the encoder and 96 for the decoder and it has been working well so far.

We press generate and we obtain something like this:

2048_tiling_properly_from_2560crop

But this 2560x2560 image is larger than what we actually want: it has this extra border, and this border has also been upscaled. So now we have to remove it. We load the large 2560x2560 image in Photoshop and center-crop it to 2048x2048, which is exactly twice the resolution of the original (and perfectly tiling) source image we have been working from. And we get this (here with a black border to indicate the cropped-out area):
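The crop is simple arithmetic: the 128-pixel border was upscaled by 2 along with everything else, so 256 pixels come off each side, and 2560 - 2 x 256 = 2048. As a sketch, again assuming a NumPy array and a made-up function name (`crop_scaled_border`):

```python
import numpy as np

def crop_scaled_border(img: np.ndarray, border: int = 128, scale: int = 2) -> np.ndarray:
    """Hypothetical helper: remove a border that was upscaled with the image.

    `border` is the border width before upscaling, `scale` the upscale factor.
    """
    b = border * scale  # border width in the upscaled image
    h, w = img.shape[:2]
    # Explicit end indices so that a zero border returns the full image.
    return img[b:h - b, b:w - b]
```

With the defaults, a 2560x2560 input comes back as 2048x2048.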

2048_tiling_properly_from_2560crop_on_black_canvas

And this 2048x2048 picture, once tiled, looks like this - perfectly seamless, exactly like we wanted!

2048x9_tile_proof

Now, the big question that remains is how we can turn this little hack into code that will make this work automatically, because doing it by hand like in the example above, while it does work, is rather time-consuming.

@pkuliyi2015 is there anything else I can do to help you implement this feature that would fix the incompatibility between Tiled VAE and seamless tiling?

@GitHub1712, does this solution help you?

pkuliyi2015 commented 10 months ago

Thank you for bringing this problem to my attention. I'm thinking about why it happens, but I first need to look into the implementation of webui's tiling feature. I will fix this in my spare time!

AugmentedRealityCat commented 10 months ago

Here is some more information that might help you.

The same problem also happens with the asymmetric tiling extension. This is the extension I'm using for creating panoramic content.

https://github.com/tjm35/asymmetric-tiling-sd-webui

To create the extra image canvas area around the main image you need to add 8 instances of your initial tile to make it tile both on the X and Y axis.

But you only need 2 instances to make it tile on a single axis (one on the left side, and one on the right side), so it might make the development process easier to manage if you support a single tiling axis first, and then extend that solution to the other axis once it has been proven to work.
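To illustrate the single-axis case, here is a sketch that builds the extended canvas for X-only tiling from exactly those two extra instances. It assumes an H x W x C NumPy array. The name `extend_x` is hypothetical. Only a `border`-wide strip of each neighbor is needed, so the strips are sliced out directly.

```python
import numpy as np

def extend_x(img: np.ndarray, border: int = 128) -> np.ndarray:
    """Hypothetical helper: extend a tile on the X axis only.

    The left neighbor contributes its right edge and the right neighbor
    its left edge, so a 1024x1024 tile becomes 1024 high and 1280 wide.
    """
    left_neighbor_strip = img[:, -border:]   # visible part of the copy on the left
    right_neighbor_strip = img[:, :border]   # visible part of the copy on the right
    return np.concatenate([left_neighbor_strip, img, right_neighbor_strip], axis=1)
```

The Y-only case is the same idea with rows instead of columns and `axis=0`.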

Thanks a lot for looking at this in your spare time!

philz1337x commented 3 months ago

Hi @AugmentedRealityCat, we are trying to make this work as a script, but at the moment I am struggling to get it running as a proof of concept.

I took your extended canvas image and upscaled it, but the image I get is not seamless. Am I missing some settings? Could you post your full parameters?

Would you be available to hop on a Telegram or Twitter/X DM chat? My X is: https://x.com/philz1337x

your image as an input: 271234090-8347e912-0490-40f8-948e-61b419ff2a94-2

upscaled output: replicate-prediction-k2vfc0j759rgm0cg3dttg20aag

GitHub1712 commented 3 months ago

Hi all, thank you! Workarounds like generating at half resolution and extending the canvas before upscaling lead to different (possibly similar) results and are too complicated. A solution could be to extend the latent, which repeats perfectly, before passing it to Tiled VAE.
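A rough sketch of that idea, with several assumptions stated up front: the latent is a C x h x w array, each latent cell decodes to an 8x8 pixel block (true for Stable Diffusion 1.x VAEs), and `decode` stands for whatever decoder is in use. `decode_seamless` and `fake_decode` are made-up names, and `fake_decode` is only a nearest-neighbor stand-in so the example runs. It is not the Tiled VAE decoder, and this is not how the extension is implemented.

```python
import numpy as np

LATENT_SCALE = 8  # assumed pixels per latent cell (Stable Diffusion 1.x)

def decode_seamless(latent: np.ndarray, decode, pad: int = 4) -> np.ndarray:
    """Hypothetical wrapper: wrap-pad the latent, decode, crop the extra border.

    latent: C x h x w array that already tiles seamlessly.
    decode: callable mapping C x h x w to 3 x 8h x 8w (stand-in for the VAE).
    pad:    border width in latent cells, taken from the opposite edges.
    """
    padded = np.pad(latent, ((0, 0), (pad, pad), (pad, pad)), mode="wrap")
    image = decode(padded)
    b = pad * LATENT_SCALE  # border width in pixels after decoding
    return image[:, b:image.shape[1] - b, b:image.shape[2] - b]

def fake_decode(z: np.ndarray) -> np.ndarray:
    """Stand-in decoder for the demo: nearest-neighbor 8x upsample of 3 channels."""
    up = np.repeat(z[:3], LATENT_SCALE, axis=1)
    return np.repeat(up, LATENT_SCALE, axis=2)
```

The padding costs extra memory, as with the hand-made canvas, but it skips the separate low-resolution generation and the img2img pass. How large `pad` must be for a real decoder is an open question here; it would need to cover the decoder's receptive field at the edges.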