lllyasviel / stable-diffusion-webui-forge

[Bug]: Tiled Diffusion + Tiled VAE + Controlnet Tile = Error #130

Closed: Vigilence closed this issue 9 months ago

Vigilence commented 9 months ago

What happened?

This combination normally works fine, but when I try to enlarge a previously resized image I get the error below. If I turn off ControlNet, the error goes away and I can resize the image without issue.

Steps to reproduce the problem

Use Tiled Diffusion, Tiled VAE (scale factor 2), and ControlNet with the kohya_controllllite_xl_blur tile model, with an image size of 5120x2880 px.

What should have happened?

I should be able to use ControlNet together with Tiled Diffusion and Tiled VAE to resize the image, since the same setup worked fine for the image's previous resize.

[Screenshots attached: Screenshot 2024-02-08 111533, Screenshot 2024-02-08 111454]

What browsers do you use to access the UI?

Brave

Sysinfo

sysinfo-2024-02-08-16-51.json

Console logs

venv "I:\stable-diffusion-webui-forge\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: f0.0.10-latest-76-g291ec743
Commit hash: 291ec743b603fdcd9c58e60dc5ed3d866c53bc4c
Launching Web UI with arguments: --ckpt-dir I:/stable-diffusion-webui/models/Stable-diffusion --hypernetwork-dir I:/stable-diffusion-webui/models/hypernetworks --esrgan-models-path I:/stable-diffusion-webui/models/esrgan --vae-dir I:/stable-diffusion-webui/models/vae --embeddings-dir I:/stable-diffusion-webui/embeddings --lora-dir I:/stable-diffusion-webui/models/Lora --always-offload-from-vram
Total VRAM 24564 MB, total RAM 31960 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 NVIDIA GeForce RTX 4090 : native
VAE dtype: torch.bfloat16
Using pytorch cross attention
ControlNet preprocessor location: I:\stable-diffusion-webui-forge\models\ControlNetPreprocessor
Loading weights [31e35c80fc] from I:/stable-diffusion-webui/models/Stable-diffusion\SDXL\Multi Style\Stable Diffusion XL (Base) 1.0 - SDXL - StabilityAI.safetensors
2024-02-08 11:13:54,632 - ControlNet - INFO - ControlNet UI callback registered.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
model_type EPS
UNet ADM Dimension 2816
Startup time: 16.9s (prepare environment: 5.9s, import torch: 4.6s, import gradio: 1.2s, setup paths: 1.1s, initialize shared: 0.2s, other imports: 0.8s, list SD models: 0.3s, load scripts: 1.6s, create ui: 0.7s, gradio launch: 0.4s).
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale'}
To load target model SDXLClipModel
Begin to load 1 model
Moving model(s) has taken 2.24 seconds
Model loaded in 12.0s (load weights from disk: 0.9s, forge load real models: 7.9s, load textual inversion embeddings: 0.4s, calculate empty prompt: 2.7s).
Token merging is under construction now and the setting will not take effect.
2024-02-08 11:48:12,096 - ControlNet - INFO - ControlNet Input Mode: InputMode.SIMPLE
2024-02-08 11:48:12,623 - ControlNet - INFO - Using preprocessor: tile_resample
2024-02-08 11:48:12,623 - ControlNet - INFO - preprocessor resolution = 0.5
2024-02-08 11:48:13,961 - ControlNet - INFO - Current ControlNet ControlLLLitePatcher: I:\stable-diffusion-webui-forge\models\ControlNet\XL Models\sd_control_collection\kohya_controllllite_xl_blur.safetensors
[Tiled Diffusion] upscaling image with ESRGAN-UltraSharp-4x...
Upscale script freed memory successfully.
tiled upscale: 100%|█████████████████████████████████████████████████████████████████| 448/448 [00:32<00:00, 13.82it/s]
*** Error running process: I:\stable-diffusion-webui-forge\extensions\multidiffusion-upscaler-for-automatic1111\scripts\tilevae.py
    Traceback (most recent call last):
      File "I:\stable-diffusion-webui-forge\modules\scripts.py", line 798, in process
        script.process(p, *script_args)
      File "I:\stable-diffusion-webui-forge\extensions\multidiffusion-upscaler-for-automatic1111\scripts\tilevae.py", line 716, in process
        if devices.get_optimal_device_name().startswith('cuda') and vae.device == devices.cpu and not vae_to_gpu:
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1695, in __getattr__
        raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
    AttributeError: 'AutoencoderKL' object has no attribute 'device'

---
MixtureOfDiffusers Sampling: : 0it [00:00, ?it/s]Mixture of Diffusers hooked into 'DPM++ 2M Karras' sampler, Tile size: 96x96, Tile count: 364, Batch size: 4, Tile batches: 91
To load target model AutoencoderKL
Begin to load 1 model
Moving model(s) has taken 1.06 seconds
Warning: Ran out of memory when regular VAE encoding, retrying with tiled VAE encoding.
To load target model SDXLClipModel
Begin to load 1 model
Moving model(s) has taken 0.44 seconds
136 modules
2024-02-08 11:49:22,768 - ControlNet - INFO - ControlNet Method tile_resample patched.
To load target model SDXL
Begin to load 1 model
Moving model(s) has taken 35.12 seconds
  0%|                                                                                           | 0/34 [00:00<?, ?it/s]
*** Error completing request                                                                    | 0/34 [00:00<?, ?it/s]
*** Arguments: ('task(a0hglksvwxytshq)', 0, '(thick impasto painting:2), a vibrant textured impasto painting of a wave crashing against against the ocean in the foreground, with a colorful sky in the background, The wave is depicted in shades of blue, white, and turquoise with the foamy crest contrasting against the deep blue of the ocean, The sky is painted in hues of pink, orange, and yellow suggesting a sunrise, (oil painting:1.5), (masterpiece:1.25), 8k, cinematic lighting, (best quality:1.5), (detailed:1.5), (thick brushstrokes:1.5), (detailed brushstrokes:1.75), very high resolution, palette knife painting,   <lora:Etremely Detailed Sliders (Detail Improvement Effect) - V1.0 - SDXL- ntc:1>\n', 'ugly, (worst quality, normal quality, low quality:2.5), out of focus, bad painting, bad drawing, blurry, low resolution, (logo, text, signature, name, artist name, artist signature:2.5),  NegativeXL - A -Standard - gsdf, Pallets, wood, wood pallets, watermark, rocks, stones, (beach:1.5), sand, planks, log, (anime:2.5), (cartoon:2.5), manga, living room, bedroom, house, mountain, hill, beach, (noise:1.5)\n', [], <PIL.Image.Image image mode=RGBA size=5120x2880 at 0x23A29DE5600>, None, None, None, None, None, None, 60, 'DPM++ 2M Karras', 4, 0, 1, 1, 1, 7, 1.5, 0.55, 0.0, 2880, 5120, 1, 0, 0, 32, 0, '', '', '', [], False, [], '', <gradio.routes.Request object at 0x0000023A29E9ED40>, 0, False, 1, 0.5, 4, 0, 0.5, 2, False, '', 0.8, -1, False, -1, 0, 0, 0, UiControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=[], batch_mask_gallery=[], generated_image=None, mask_image=None, hr_option='Both', enabled=True, module='tile_resample', model='kohya_controllllite_xl_blur [22117d11]', weight=0.15, image=None, resize_mode='Crop and Resize', processor_res=0.5, threshold_a=0.5, threshold_b=0.5, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced'), UiControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=[], batch_mask_gallery=[], generated_image=None, mask_image=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced'), UiControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=[], batch_mask_gallery=[], generated_image=None, mask_image=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced'), False, 1.01, 1.02, 0.99, 0.95, False, 256, 2, 0, False, False, 3, 2, 0, 0.35, True, 'bicubic', 'bicubic', False, 0.5, 2, False, True, 'Mixture of Diffusers', False, True, 1024, 1024, 96, 96, 48, 4, 'ESRGAN-UltraSharp-4x', 2, False, 10, 1, 1, 64, True, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', 
'', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, True, 2816, 192, False, True, True, False, '* `CFG Scale` should be 2 or lower.', True, True, '', '', True, 50, True, 1, 0, False, 4, 0.5, 'Linear', 'None', '<p style="margin-bottom:0.75em">Recommended settings: Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0.8</p>', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, False, 'positive', 'comma', 0, False, False, 'start', '', '<p style="margin-bottom:0.75em">Will upscale the image by the selected scale factor; use width and height sliders to set tile size</p>', 64, 0, 2, 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
    Traceback (most recent call last):
      File "I:\stable-diffusion-webui-forge\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "I:\stable-diffusion-webui-forge\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\modules\img2img.py", line 235, in img2img
        processed = process_images(p)
      File "I:\stable-diffusion-webui-forge\modules\processing.py", line 749, in process_images
        res = process_images_inner(p)
      File "I:\stable-diffusion-webui-forge\modules\processing.py", line 920, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "I:\stable-diffusion-webui-forge\modules\processing.py", line 1703, in sample
        samples = self.sampler.sample_img2img(self, self.init_latent, x, conditioning, unconditional_conditioning, image_conditioning=self.image_conditioning)
      File "I:\stable-diffusion-webui-forge\modules\sd_samplers_kdiffusion.py", line 197, in sample_img2img
        samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "I:\stable-diffusion-webui-forge\modules\sd_samplers_common.py", line 260, in launch_sampling
        return func()
      File "I:\stable-diffusion-webui-forge\modules\sd_samplers_kdiffusion.py", line 197, in <lambda>
        samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\repositories\k-diffusion\k_diffusion\sampling.py", line 594, in sample_dpmpp_2m
        denoised = model(x, sigmas[i] * s_in, **extra_args)
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\modules\sd_samplers_cfg_denoiser.py", line 182, in forward
        denoised = forge_sampler.forge_sample(self, denoiser_params=denoiser_params,
      File "I:\stable-diffusion-webui-forge\modules_forge\forge_sampler.py", line 82, in forge_sample
        denoised = sampling_function(model, x, timestep, uncond, cond, cond_scale, model_options, seed)
      File "I:\stable-diffusion-webui-forge\ldm_patched\modules\samplers.py", line 282, in sampling_function
        cond_pred, uncond_pred = calc_cond_uncond_batch(model, cond, uncond_, x, timestep, model_options)
      File "I:\stable-diffusion-webui-forge\ldm_patched\modules\samplers.py", line 253, in calc_cond_uncond_batch
        output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
      File "I:\stable-diffusion-webui-forge\ldm_patched\modules\model_base.py", line 85, in apply_model
        model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\ldm_patched\ldm\modules\diffusionmodules\openaimodel.py", line 860, in forward
        h = forward_timestep_embed(module, h, emb, context, transformer_options, time_context=time_context, num_video_frames=num_video_frames, image_only_indicator=image_only_indicator)
      File "I:\stable-diffusion-webui-forge\ldm_patched\ldm\modules\diffusionmodules\openaimodel.py", line 48, in forward_timestep_embed
        x = layer(x, context, transformer_options)
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\ldm_patched\ldm\modules\attention.py", line 613, in forward
        x = block(x, context=context[i], transformer_options=transformer_options)
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\ldm_patched\ldm\modules\attention.py", line 440, in forward
        return checkpoint(self._forward, (x, context, transformer_options), self.parameters(), self.checkpoint)
      File "I:\stable-diffusion-webui-forge\ldm_patched\ldm\modules\diffusionmodules\util.py", line 189, in checkpoint
        return func(*inputs)
      File "I:\stable-diffusion-webui-forge\ldm_patched\ldm\modules\attention.py", line 479, in _forward
        n, context_attn1, value_attn1 = p(n, context_attn1, value_attn1, extra_options)
      File "I:\stable-diffusion-webui-forge\extensions-builtin\sd_forge_controlllite\lib_controllllite\lib_controllllite.py", line 102, in __call__
        q = q + self.modules[module_pfx_to_q](q)
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "I:\stable-diffusion-webui-forge\extensions-builtin\sd_forge_controlllite\lib_controllllite\lib_controllllite.py", line 234, in forward
        cx = torch.cat([cx, self.down(x)], dim=1 if self.is_conv2d else 2)
    RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 57600 but got size 230400 for tensor number 1 in the list.

---

Additional information

Vigilence commented 9 months ago

I believe I found part of the issue.

The "Resize to" or "Resize by" size/dimensions must match the same scale factor size in the tiled diffusion section.

So if I want to resize a 1000x1000 image by 2 using Tiled Diffusion, then "Resize to" must be set to 2000x2000, or "Resize by" must be set to 2.
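
As a quick sanity check, the numbers in the RuntimeError above line up with exactly this kind of mismatch. This is a hedged sketch: the 8x VAE downscale is standard for SDXL, but reading 57600 as a half-resolution ControlNet hint is my interpretation of the log, not something confirmed in the thread.

    # Latent token counts for the two resolutions involved (SDXL VAE downscales by 8).
    full = (5120 // 8) * (2880 // 8)   # 640 * 360 = 230400 (the actual latent)
    half = (2560 // 8) * (1440 // 8)   # 320 * 180 = 57600  (half the linear size)
    assert full == 230400 and half == 57600
    # "Expected size 57600 but got size 230400" is therefore a 2x linear mismatch:
    # the ControlLLLite hint appears to have been prepared for the wrong resolution.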

However, an issue that I believe is tied to Forge persists; it is not present in regular automatic1111:

AttributeError: 'AutoencoderKL' object has no attribute 'device'

*** Error running process: I:\stable-diffusion-webui-forge\extensions\multidiffusion-upscaler-for-automatic1111\scripts\tilevae.py
    Traceback (most recent call last):
      File "I:\stable-diffusion-webui-forge\modules\scripts.py", line 798, in process
        script.process(p, *script_args)
      File "I:\stable-diffusion-webui-forge\extensions\multidiffusion-upscaler-for-automatic1111\scripts\tilevae.py", line 716, in process
        if devices.get_optimal_device_name().startswith('cuda') and vae.device == devices.cpu and not vae_to_gpu:
      File "I:\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1695, in __getattr__
        raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
    AttributeError: 'AutoencoderKL' object has no attribute 'device'
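
For context, the AutoencoderKL that Forge hands to extensions is a plain torch.nn.Module, and nn.Module defines no .device attribute, which is exactly what the extension's check trips over. Below is a minimal defensive sketch of the kind of lookup an extension could use instead; module_device is a hypothetical helper, not the actual fix that shipped.

    import torch

    def module_device(m: torch.nn.Module) -> torch.device:
        # nn.Module has no .device attribute; infer the device from the first
        # parameter, falling back to CPU for parameterless modules.
        try:
            return next(m.parameters()).device
        except StopIteration:
            return torch.device('cpu')

    # e.g. in tilevae.py, `vae.device == devices.cpu` could become:
    #     module_device(vae) == devices.cpu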

---
lllyasviel commented 9 months ago

Multi-Diffusion is integrated now; feel free to open other issues if any problems remain.