huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0
25.43k stars 5.27k forks

Image output tiling for seamless textures with Stable Diffusion #556

Open torrinworx opened 2 years ago

torrinworx commented 2 years ago

Is your feature request related to a problem? Please describe.
Currently there is no way to create seamless textures with Stable Diffusion, a crucial feature that is missing.

Describe the solution you'd like
Something similar to this pull request on the sd-webui repo: https://github.com/sd-webui/stable-diffusion-webui/pull/911

A simple argument in the StableDiffusionPipeline that would enable seamless texture generation for 3D applications.

anton-l commented 2 years ago

Hi @torrinworx! As the https://github.com/sd-webui/stable-diffusion-webui/pull/911 PR suggests, you can make the Stable Diffusion models tile-able by patching the torch.nn.Conv2d before loading the pipeline:

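import torch

# Patch torch.nn.Conv2d.__init__ so that every conv layer created afterwards
# receives the given keyword arguments (global options for the models)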
def patch_conv(**patch):
    cls = torch.nn.Conv2d
    init = cls.__init__
    def __init__(self, *args, **kwargs):
        return init(self, *args, **kwargs, **patch)
    cls.__init__ = __init__

patch_conv(padding_mode='circular')
print("patched for tiling")

Native support for tiling in diffusers is unlikely to come, as it would either be hacky or complicate the model design with additional arguments :)

shirayu commented 2 years ago

FYI: In my app, I toggle tiling on or off after loading the pipeline by changing padding_mode on each layer.

  1. First, save conv layers and their original padding modes
  2. Change padding mode on each generation
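
The two steps above could look roughly like the following sketch. This is not shirayu's actual code (it is not reproduced in this thread): the helper names `remember_padding_modes` and `set_tiling` are hypothetical, and the sketch assumes that changing `padding_mode` on an already-constructed `torch.nn.Conv2d` is honored at forward time, which relies on PyTorch internals and may not hold for every version.

```python
import torch
from torch import nn


def remember_padding_modes(model: nn.Module) -> dict:
    """Step 1: record every Conv2d layer together with its original padding mode."""
    return {m: m.padding_mode for m in model.modules() if isinstance(m, nn.Conv2d)}


def set_tiling(saved: dict, enabled: bool) -> None:
    """Step 2: switch the recorded layers to circular padding, or restore them."""
    for layer, original_mode in saved.items():
        layer.padding_mode = "circular" if enabled else original_mode


# Tiny stand-in model (assumption: a real pipeline would pass its own modules instead).
torch.manual_seed(0)
model = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3, padding=1))
saved = remember_padding_modes(model)

set_tiling(saved, enabled=True)
x = torch.randn(1, 3, 8, 8)
with torch.no_grad():
    shifted_in = model(torch.roll(x, shifts=3, dims=-1))
    shifted_out = torch.roll(model(x), shifts=3, dims=-1)
# With circular padding, shifting the input with wrap-around only shifts the output.
wraps = torch.allclose(shifted_in, shifted_out, atol=1e-5)

set_tiling(saved, enabled=False)
restored_modes = [layer.padding_mode for layer in saved]
```

Keeping the original modes in a dictionary means the toggle can be flipped before each generation without reloading anything. With a real pipeline, the same helpers would presumably be applied to each module that contains convolutions.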

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

patrickvonplaten commented 1 year ago

Could we make this a community pipeline maybe? :-)

camenduru commented 1 year ago

Is it possible for Flax, maybe like this: https://flax.readthedocs.io/en/latest/api_reference/_autosummary/flax.linen.Conv.html?highlight=circular

Update: I tested it and it works.

uyo9ko commented 1 year ago

Hello, I followed your advice and added this snippet before loading the pipeline, but there is an error when loading the pipeline:

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    model_path,
    revision="fp16", 
    torch_dtype=torch.float16,
    use_auth_token=True
)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-29-796abcd4f547> in <module>
      6     revision="fp16",
      7     torch_dtype=torch.float16,
----> 8     use_auth_token=True
      9 )
     10 pipe = pipe.to(device)

8 frames
<ipython-input-19-17c9e93293f0> in __init__(self, *args, **kwargs)
      4     init = cls.__init__
      5     def __init__(self, *args, **kwargs):
----> 6         return init(self, *args, **kwargs, **patch)
      7     cls.__init__ = __init__
      8 

TypeError: __init__() got multiple values for keyword argument 'padding_mode'

can you tell me why?
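
A likely explanation, though it is not confirmed in this thread: something constructed while the pipeline loads already passes `padding_mode` as a keyword argument to `Conv2d`, so the wrapper's `**kwargs` and `**patch` both contain that key, and Python refuses the duplicate. The sketch below reproduces the error with a stand-in class (`FakeConv` and `patch_init` are hypothetical names, used so the example runs without PyTorch) and shows one possible fix: merge the two dictionaries so the patched value overrides the caller's.

```python
class FakeConv:
    """Stand-in for torch.nn.Conv2d so the example runs without PyTorch."""

    def __init__(self, channels, padding_mode="zeros"):
        self.channels = channels
        self.padding_mode = padding_mode


def patch_init(cls, merge, **patch):
    """Wrap cls.__init__ so that `patch` is injected into every constructor call."""
    init = cls.__init__

    def __init__(self, *args, **kwargs):
        if merge:
            # Patched values override whatever the caller passed.
            return init(self, *args, **{**kwargs, **patch})
        # Original approach: fails if the caller already passed a patched key.
        return init(self, *args, **kwargs, **patch)

    cls.__init__ = __init__
    return init  # returned so the patch can be undone


original = patch_init(FakeConv, merge=False, padding_mode="circular")
try:
    FakeConv(8, padding_mode="zeros")
    error = None
except TypeError as exc:
    error = str(exc)
FakeConv.__init__ = original  # undo the first patch

patch_init(FakeConv, merge=True, padding_mode="circular")
layer = FakeConv(8, padding_mode="zeros")
print(error)
print(layer.padding_mode)
```

Note that the merged version silently overrides a padding mode the caller chose on purpose, which may or may not be what is wanted for every layer.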

patrickvonplaten commented 1 year ago

@anton-l, I think we could allow adapting the conv padding mode with a nice diffusers API here, no?

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

patrickvonplaten commented 1 year ago

Linking some first PRs here:

patrickvonplaten commented 1 year ago

cc @patil-suraj @anton-l does any of you have time to look into tiling?

keturn commented 1 year ago

Note this must be applied to both the diffusion model and the decoder (VAE).

patil-suraj commented 1 year ago

That's the plan @keturn, still experimenting with it.

keturn commented 1 year ago

begone, stalebot!

patrickvonplaten commented 1 year ago

OK @patil-suraj, it doesn't look like you'll have time for this anytime soon, no? Is someone else maybe interested in picking this one up (it might be a bit difficult as a first PR, though): @pcuenca @williamberman @yiyixuxu maybe?

camenduru commented 1 year ago

up

patil-suraj commented 1 year ago

I don't have much bandwidth for it this week, but I will try to look into https://github.com/huggingface/diffusers/pull/1521 next week

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

keturn commented 1 year ago

it's still an open issue, stalebot

AmanKishore commented 1 year ago

This seems like an interesting add-on to the WebUI: https://github.com/tjm35/asymmetric-tiling-sd-webui Any insight on how to add X and Y padding into the current diffusers implementation?
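
One way to tile along only one axis, written in the spirit of the linked extension but not copied from it, is to pad each axis separately and then run the convolution without padding. The helper name `make_asymmetric` is hypothetical, and the sketch leans on the private PyTorch attributes `_conv_forward` and `_reversed_padding_repeated_twice`, which are an assumption about current internals and may change between versions.

```python
import torch
import torch.nn.functional as F
from torch import nn


def make_asymmetric(conv: nn.Conv2d, tile_x: bool, tile_y: bool) -> None:
    """Hypothetical helper: wrap-pad only the chosen axes of one Conv2d layer."""
    # Private attribute, laid out as (left, right, top, bottom).
    pad = conv._reversed_padding_repeated_twice
    pad_x = (pad[0], pad[1], 0, 0)
    pad_y = (0, 0, pad[2], pad[3])
    mode_x = "circular" if tile_x else "constant"
    mode_y = "circular" if tile_y else "constant"

    def _conv_forward(x, weight, bias):
        x = F.pad(x, pad_x, mode=mode_x)  # width axis
        x = F.pad(x, pad_y, mode=mode_y)  # height axis
        # Padding was applied by hand, so the convolution itself uses none.
        return F.conv2d(x, weight, bias, conv.stride, 0, conv.dilation, conv.groups)

    # The instance attribute shadows the class method that Conv2d.forward calls.
    conv._conv_forward = _conv_forward


torch.manual_seed(0)
conv = nn.Conv2d(1, 1, kernel_size=3, padding=1)
make_asymmetric(conv, tile_x=True, tile_y=False)  # seamless left-right only
```

For a whole model, the helper would be called on every `Conv2d` found via `model.modules()`.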

patrickvonplaten commented 1 year ago

Since https://github.com/huggingface/diffusers/pull/1441 has now been added, I think we can close this one, no?

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

jtoy commented 1 year ago

I’m interested in this as well. Is there at least a workable hack? Then we can look into making it into usable code.

shamelesslyAI commented 1 year ago

Implementation in colab notebook: (Inspect the setup cell and copy paste) https://colab.research.google.com/drive/1E5Fa2Tu04g3kb443WnrhbWNhoMzcijoj?authuser=3#scrollTo=Z_pZ9zpFJvbY

psychedelicious commented 1 year ago

@patrickvonplaten I think this one was closed prematurely - this is to create seamless, tiling outputs - not tiled VAE decode. Can we re-open?

patrickvonplaten commented 1 year ago

Sure!

pietrobolcato commented 1 year ago

this is to create seamless, tiling outputs

Would also be super interested in this one!

icech commented 1 year ago

Is there any progress on creating seamless, tiling outputs in diffusers? @patrickvonplaten

patrickvonplaten commented 1 year ago

Are there any pointers of a working implementation?

jtoy commented 1 year ago

Automatic1111 has had it for a long time.

icech commented 1 year ago

Are there any pointers of a working implementation?

I want to know whether it's possible to implement a tiling-mode control flag, similar to enable_vae_tiling(), so that everyone can use tiling mode more conveniently. Shirayu provided a feasible approach, but it's not elegant enough. Link to the discussion

Mystfit commented 1 year ago

I had the first implementation (https://github.com/huggingface/diffusers/issues/556#issuecomment-1253772217) suggested in this thread working for quite a while but I've noticed that it doesn't seem to work any more with newer diffusers versions and torch 2.0+ - not quite sure when it stopped working.

I've also tried some alternative implementations like how InvokeAI handles seamless textures by patching the unet, text_encoder and vae and also the alternative suggestion in this thread https://github.com/huggingface/diffusers/issues/556#issuecomment-1253974962 but I'm still getting images like the following.

image

MikeHanKK commented 1 year ago

It seems diffusers 0.19 is causing the problem; diffusers 0.18 is fine. I am having the same issue. Torch is still the same version, so I don't know what has changed in diffusers.

jiuzixue09 commented 1 year ago

same problem!!!

alexisrolland commented 1 year ago

I confirm there is a regression in diffusers 0.19.3. I have tried the solution provided by Anton in the first comment and it works well with 0.18.0.

Example:

texture

It tiles perfectly:

tiled

The same prompt, seed and resolution with 0.19.3 does not tile:

texture_no_tiling

It would be really nice to have a native feature in diffusers to enable seamless tiling. In particular one that would allow to do it without reloading the pipeline.

jiuzixue09 commented 1 year ago

I found a solution to this problem.

image

Replace this code:

return F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)

with:

return super().forward(x)
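
Why this replacement can matter, shown in isolation: calling `F.conv2d` directly with `self.padding` always zero-pads and never looks at `padding_mode`, whereas `torch.nn.Conv2d.forward` applies the configured padding mode first. The two subclasses below are made-up stand-ins for illustration, not diffusers code, and the sketch says nothing about whether the change is sufficient inside a full pipeline.

```python
import torch
import torch.nn.functional as F
from torch import nn


class DirectConv(nn.Conv2d):
    """Made-up stand-in: calls F.conv2d directly, so padding_mode is never consulted."""

    def forward(self, x):
        return F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)


class DelegatingConv(nn.Conv2d):
    """Made-up stand-in: defers to nn.Conv2d.forward, which applies padding_mode."""

    def forward(self, x):
        return super().forward(x)


def wraps_around(conv: nn.Conv2d) -> bool:
    """True if shifting the input with wrap-around merely shifts the output."""
    x = torch.randn(1, 3, 8, 8)
    with torch.no_grad():
        shifted_in = conv(torch.roll(x, 3, dims=-1))
        shifted_out = torch.roll(conv(x), 3, dims=-1)
    return torch.allclose(shifted_in, shifted_out, atol=1e-5)


torch.manual_seed(0)
direct = DirectConv(3, 4, kernel_size=3, padding=1, padding_mode="circular")
delegating = DelegatingConv(3, 4, kernel_size=3, padding=1, padding_mode="circular")
direct_tiles = wraps_around(direct)          # circular mode is ignored here
delegating_tiles = wraps_around(delegating)  # circular mode is applied here
```

Both layers are constructed with `padding_mode="circular"`, so any difference between them comes only from which forward path is taken.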

alexisrolland commented 1 year ago

@jiuzixue09 maybe do a PR to fix it if you are sure about this? :)

Mystfit commented 1 year ago

I found a solution to this problem: replace return F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups) with return super().forward(x)

This didn't seem to work. This is the error that was returned from SDXL.

[2023.08.15-01.46.52:745][238]LogPython: Error: Exception in thread ImageThread:
[2023.08.15-01.46.52:756][238]LogPython: Error: Traceback (most recent call last):
[2023.08.15-01.46.52:767][239]LogPython: Error:   File "F:\UnrealProjects/StableDiffusion51/Plugins/StableDiffusionTools/StableDiffusionTools/Content/Python\bridges\DiffusersBridge.py", line 102, in run
[2023.08.15-01.46.52:779][240]LogPython: Error:     self.result = self.func()
[2023.08.15-01.46.52:790][240]LogPython: Error:   File "F:\UnrealProjects/StableDiffusion51/Plugins/StableDiffusionTools/StableDiffusionTools/Content/Python\bridges\DiffusersBridge.py", line 675, in <lambda>
[2023.08.15-01.46.52:801][241]LogPython: Error:     self.executor = AbortableExecutor("ImageThread", lambda generation_args=generation_args: self.pipe(**generation_args))
[2023.08.15-01.46.52:812][242]LogPython: Error:   File "F:\UnrealProjects\StableDiffusion51\Plugins\StableDiffusionTools\StableDiffusionTools\FrozenPythonDependencies\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
[2023.08.15-01.46.52:824][242]LogPython: Error:     return func(*args, **kwargs)
[2023.08.15-01.46.52:835][243]LogPython: Error:   File "F:\UnrealProjects\StableDiffusion51\Plugins\StableDiffusionTools\StableDiffusionTools\FrozenPythonDependencies\Lib\site-packages\diffusers\pipelines\stable_diffusion_xl\pipeline_stable_diffusion_xl.py", line 812, in __call__
[2023.08.15-01.46.52:847][244]LogPython: Error:     noise_pred = self.unet(
[2023.08.15-01.46.52:857][244]LogPython: Error:   File "F:\UnrealProjects\StableDiffusion51\Plugins\StableDiffusionTools\StableDiffusionTools\FrozenPythonDependencies\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
[2023.08.15-01.46.52:870][245]LogPython: Error:     return forward_call(*args, **kwargs)
[2023.08.15-01.46.52:880][246]LogPython: Error:   File "F:\UnrealProjects\StableDiffusion51\Plugins\StableDiffusionTools\StableDiffusionTools\FrozenPythonDependencies\Lib\site-packages\accelerate\hooks.py", line 165, in new_forward
[2023.08.15-01.46.52:892][247]LogPython: Error:     output = old_forward(*args, **kwargs)
[2023.08.15-01.46.52:902][247]LogPython: Error:   File "F:\UnrealProjects\StableDiffusion51\Plugins\StableDiffusionTools\StableDiffusionTools\FrozenPythonDependencies\Lib\site-packages\diffusers\models\unet_2d_condition.py", line 921, in forward
[2023.08.15-01.46.52:914][248]LogPython: Error:     sample, res_samples = downsample_block(hidden_states=sample, temb=emb)
[2023.08.15-01.46.52:924][249]LogPython: Error:   File "F:\UnrealProjects\StableDiffusion51\Plugins\StableDiffusionTools\StableDiffusionTools\FrozenPythonDependencies\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
[2023.08.15-01.46.52:936][249]LogPython: Error:     return forward_call(*args, **kwargs)
[2023.08.15-01.46.52:947][250]LogPython: Error:   File "F:\UnrealProjects\StableDiffusion51\Plugins\StableDiffusionTools\StableDiffusionTools\FrozenPythonDependencies\Lib\site-packages\diffusers\models\unet_2d_blocks.py", line 1142, in forward
[2023.08.15-01.46.52:959][251]LogPython: Error:     hidden_states = resnet(hidden_states, temb)
[2023.08.15-01.46.52:969][251]LogPython: Error:   File "F:\UnrealProjects\StableDiffusion51\Plugins\StableDiffusionTools\StableDiffusionTools\FrozenPythonDependencies\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
[2023.08.15-01.46.52:981][252]LogPython: Error:     return forward_call(*args, **kwargs)
[2023.08.15-01.46.52:991][253]LogPython: Error:   File "F:\UnrealProjects\StableDiffusion51\Plugins\StableDiffusionTools\StableDiffusionTools\FrozenPythonDependencies\Lib\site-packages\diffusers\models\resnet.py", line 620, in forward
[2023.08.15-01.46.53:004][253]LogPython: Error:     hidden_states = hidden_states + temb
[2023.08.15-01.46.53:014][254]LogPython: Error: TypeError: unsupported operand type(s) for +: 'NoneType' and 'Tensor'

alexisrolland commented 1 year ago

Hello, is it possible to know whether this feature is planned, hopefully for the next release? Thanks!

cmdr2 commented 1 year ago

Yep, this changed with the recent change in diffusers 0.19.2 - https://github.com/huggingface/diffusers/commit/b1e52794a28a66b575943fdfeca16809b605cc3c#diff-528ec2d3100f053b2b75ba808cb0e8b983545b721b1c42dc1fd47d76522d7faeR130

This workaround works for me. Change the last for-loop from https://github.com/huggingface/diffusers/issues/2633#issuecomment-1676629872 to:

for cl in conv_layers:
    if isinstance(cl, diffusers.models.lora.LoRACompatibleConv) and cl.lora_layer is None:
        cl.lora_layer = lambda *x: 0

    cl._conv_forward = asymmetricConv2DConvForward.__get__(cl, torch.nn.Conv2d)

This forces it to use the _conv_forward wrapper when a lora isn't set.

This workaround will work with LoRAs as well.

As a tip, disable VAE Tiling before using this. That is, set pipe.vae.use_tiling to False before running this code.

alexisrolland commented 1 year ago

Hello @cmdr2

I'm looking at your code snippet and I think I'm lacking a bit of context, I have no idea where to place it / how to use it. Could you please elaborate a bit?

Questions that come up when I'm looking at your code:

cmdr2 commented 1 year ago

@alexisrolland I basically took the code mentioned in https://github.com/huggingface/diffusers/issues/2633#issuecomment-1676629872 and changed the last two lines.

github-actions[bot] commented 11 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

DJviolin commented 11 months ago

Not so stale...

github-actions[bot] commented 10 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

psychedelicious commented 10 months ago

Not stale pls

pcuenca commented 10 months ago

This seems to be a popular feature. Is there anyone working on a PR?

weberhen commented 10 months ago

Hi!

Also, the implementations currently proposed here do not seem to work with the latest diffusers (I changed from diffusers==0.21.4 to 0.24.0 and it does not work anymore).

patrickvonplaten commented 10 months ago

Hey @weberhen could you maybe open a new issue with a reproducible code snippet?

github-actions[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

psychedelicious commented 9 months ago

Bump stalebot