lllyasviel / stable-diffusion-webui-forge

GNU Affero General Public License v3.0

[Bug]: Kohya HRFix and Self Attention Guidance do not work together when Downscale Factor is not exactly 2 #331

Open gg8345 opened 4 months ago

gg8345 commented 4 months ago

What happened?

With only Kohya HRFix and SelfAttentionGuidance enabled (all other extensions unloaded), I can set the Kohya HRFix downsampling scale to a value other than 2, for example 1.5. Generating an image then fails with this error:

 File "D:\sd-webui-forge\webui\ldm_patched\contrib\external_sag.py", line 68, in create_blur_map
    mask.reshape(b, *mid_shape)
RuntimeError: shape '[1, 24, 24]' is invalid for input of size 1024

The only ways to proceed are to set the downsampling scale to 2 or to disable SelfAttentionGuidance.
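
The arithmetic behind the failure can be sketched in plain Python. This is only an illustration, not the code in external_sag.py: the function names are hypothetical, and the rule used here (derive the mask shape from the latent size through a ratio rounded up to a power of two) is an assumption chosen because it reproduces the numbers in the full console log, where a 512x512 request (a 64x64 latent) with downscale 1.7 yields 100 attention tokens but a target shape of [1, 8, 8].

```python
import math


def expected_mask_shape(latent_h, latent_w, num_tokens):
    """Guess the 2D mask shape from the latent size.

    ASSUMPTION: the latent-to-attention ratio is rounded up to a power of
    two. This rule is hypothetical and only meant to match the log.
    """
    linear_ratio = math.ceil(math.sqrt(latent_h * latent_w / num_tokens))
    ratio = 2 ** (linear_ratio - 1).bit_length()  # next power of two >= linear_ratio
    return math.ceil(latent_h / ratio), math.ceil(latent_w / ratio)


def reshape_is_valid(num_tokens, shape):
    """A reshape to (h, w) only works if h * w equals the element count."""
    return num_tokens == shape[0] * shape[1]


# Downscale 2: the 64x64 latent is halved evenly, leaving 8x8 = 64 tokens.
shape_ok = expected_mask_shape(64, 64, 64)
print(shape_ok, reshape_is_valid(64, shape_ok))

# Downscale 1.7: the log reports 100 tokens (10x10), which is not a
# power-of-two fraction of 64, so the guessed shape no longer matches.
shape_bad = expected_mask_shape(64, 64, 100)
print(shape_bad, reshape_is_valid(100, shape_bad))
```

Under this assumption, any downscale factor that leaves the attention map at a size other than the latent size divided by a power of two would produce the same kind of shape mismatch.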

Steps to reproduce the problem

1. Run WebUI Forge.
2. In the Text2Img tab, enable Kohya HRFix and SelfAttentionGuidance.
3. Set the Kohya HRFix downsampling scale to a value other than 2 (for example 1.5 or 1.7).
4. Click Generate.
5. The error appears in the console.

What should have happened?

An image should have been generated instead of an error.

What browsers do you use to access the UI?

Google Chrome

Sysinfo

{ "Platform": "Windows-10-10.0.22631-SP0", "Python": "3.10.6", "Version": "f0.0.12-latest-155-gd81e353d", "Commit": "d81e353d8928147bbd973068d0efbb2802affe0f", "Script path": "D:\sd-webui-forge\webui", "Data path": "D:\sd-webui-forge\webui", "Extensions dir": "D:\sd-webui-forge\webui\extensions", "Checksum": "7b1c3916c86237f9fba29bb7d54712c65b015caee41ec25bcda3b52bf469a782", "Commandline": [ "launch.py" ], "Torch env info": { "torch_version": "2.1.2+cu121", "is_debug_build": "False", "cuda_compiled_version": "12.1", "gcc_version": null, "clang_version": null, "cmake_version": "version 3.28.1", "os": "Microsoft Windows 11 Enterprise", "libc_version": "N/A", "python_version": "3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)] (64-bit runtime)", "python_platform": "Windows-10-10.0.22631-SP0", "is_cuda_available": "True", "cuda_runtime_version": null, "cuda_module_loading": "LAZY", "nvidia_driver_version": "526.98", "nvidia_gpu_models": "GPU 0: NVIDIA GeForce RTX 2080", "cudnn_version": null, "pip_version": "pip3", "pip_packages": [ "numpy==1.26.2", "open-clip-torch==2.20.0", "pytorch-lightning==1.9.4", "torch==2.1.2+cu121", "torchdiffeq==0.2.3", "torchmetrics==1.3.0.post0", "torchsde==0.2.6", "torchvision==0.16.2+cu121" ], "conda_packages": null, "hip_compiled_version": "N/A", "hip_runtime_version": "N/A", "miopen_runtime_version": "N/A", "caching_allocator_config": "", "is_xnnpack_available": "True", "cpu_info": [ "Architecture=9", "CurrentClockSpeed=3801", "DeviceID=CPU0", "Family=107", "L2CacheSize=4096", "L2CacheSpeed=", "Manufacturer=AuthenticAMD", "MaxClockSpeed=3801", "Name=AMD Ryzen 7 5800X 8-Core Processor ", "ProcessorType=3", "Revision=8448" ] }, "Exceptions": [ { "exception": "'NoneType' object is not iterable", "traceback": [ [ "D:\sd-webui-forge\webui\modules\call_queue.py, line 57, f", "res = list(func(*args, **kwargs))" ] ] } ], "CPU": { "model": "AMD64 Family 25 Model 33 Stepping 0, AuthenticAMD", "count logical": 16, 
"count physical": 8 }, "RAM": { "total": "32GB", "used": "18GB", "free": "14GB" }, "Extensions": [], "Inactive extensions": [ { "name": "adetailer", "path": "D:\sd-webui-forge\webui\extensions\adetailer", "version": "8f01dfda", "branch": "main", "remote": "https://github.com/Bing-su/adetailer.git" }, { "name": "openpose-editor", "path": "D:\sd-webui-forge\webui\extensions\openpose-editor", "version": "c9357715", "branch": "master", "remote": "https://github.com/fkunn1326/openpose-editor.git" }, { "name": "sd-webui-regional-prompter", "path": "D:\sd-webui-forge\webui\extensions\sd-webui-regional-prompter", "version": "59d68e6e", "branch": "main", "remote": "https://github.com/hako-mikan/sd-webui-regional-prompter.git" } ], "Environment": { "GRADIO_ANALYTICS_ENABLED": "False" }, "Config": { "ldsr_steps": 100, "ldsr_cached": false, "SCUNET_tile": 256, "SCUNET_tile_overlap": 8, "SWIN_tile": 192, "SWIN_tile_overlap": 8, "SWIN_torch_compile": false, "control_net_detectedmap_dir": "detected_maps", "control_net_models_path": "", "control_net_modules_path": "", "control_net_unit_count": 3, "control_net_model_cache_size": 5, "control_net_no_detectmap": false, "control_net_detectmap_autosaving": false, "control_net_allow_script_control": false, "control_net_sync_field_args": true, "controlnet_show_batch_images_in_ui": false, "controlnet_increment_seed_during_batch": false, "controlnet_disable_openpose_edit": false, "controlnet_disable_photopea_edit": false, "controlnet_photopea_warning": true, "controlnet_input_thumbnail": true, "sd_checkpoint_hash": "d8fd60692a589f3be4a4c205ae4fa5a1d686b44a1cc20c7953715a95ab5070cf", "sd_model_checkpoint": "leosamsHelloworldXL_helloworldXL50GPT4V.safetensors [d8fd60692a]", "CLIP_stop_at_last_layers": 1, "disabled_extensions": [ "LDSR", "Lora", "ScuNET", "SwinIR", "canvas-zoom-and-pan", "extra-options-section", "forge_legacy_preprocessors", "forge_preprocessor_inpaint", "forge_preprocessor_marigold", "forge_preprocessor_normalbae", 
"forge_preprocessor_recolor", "forge_preprocessor_reference", "forge_preprocessor_revision", "forge_preprocessor_tile", "mobile", "prompt-bracket-checker", "sd_forge_controlllite", "sd_forge_controlnet", "sd_forge_controlnet_example", "sd_forge_dynamic_thresholding", "sd_forge_fooocus_inpaint", "sd_forge_freeu", "sd_forge_hypertile", "sd_forge_ipadapter", "sd_forge_latent_modifier", "sd_forge_multidiffusion", "sd_forge_photomaker", "sd_forge_stylealign", "sd_forge_svd", "sd_forge_z123", "soft-inpainting", "adetailer", "openpose-editor", "sd-webui-regional-prompter" ], "disable_all_extensions": "none", "ad_max_models": 2, "ad_extra_models_dir": "", "ad_save_previews": false, "ad_save_images_before": false, "ad_only_seleted_scripts": true, "ad_script_names": "dynamic_prompting,dynamic_thresholding,wildcard_recursive,wildcards,lora_block_weight,negpip", "ad_bbox_sortby": "None", "ad_same_seed_for_each_tap": false, "multiple_tqdm": true, "regprp_debug": false, "regprp_hidepmask": false, "batch_cond_uncond": true, "lora_functional": false, "save_images_before_highres_fix": false }, "Startup": { "total": 9.306849241256714, "records": { "initial startup": 0.024300813674926758, "prepare environment/checks": 0.062056779861450195, "prepare environment/git version info": 0.10259628295898438, "prepare environment/torch GPU test": 1.8199667930603027, "prepare environment/clone repositores": 0.46443963050842285, "prepare environment/run extensions installers": 0.0, "prepare environment/run extensions_builtin installers/sd_forge_kohya_hrfix": 0.001001119613647461, "prepare environment/run extensions_builtin installers/sd_forge_sag": 0.0, "prepare environment/run extensions_builtin installers": 0.001001119613647461, "prepare environment": 2.4810893535614014, "launcher": 0.002505064010620117, "import torch": 4.6677610874176025, "import gradio": 0.0, "setup paths": 0.0, "import ldm": 0.002002239227294922, "import sgm": 0.0, "initialize shared": 0.18117070198059082, "other imports": 
0.44494080543518066, "opts onchange": 0.0010008811950683594, "setup SD model": 0.0, "setup codeformer": 0.0, "setup gfpgan": 0.009511709213256836, "set samplers": 0.0, "list extensions": 0.004003763198852539, "restore config state file": 0.0, "list SD models": 0.0020017623901367188, "list localizations": 0.0010008811950683594, "load scripts/custom_code.py": 0.0010008811950683594, "load scripts/img2imgalt.py": 0.001001119613647461, "load scripts/loopback.py": 0.0, "load scripts/outpainting_mk_2.py": 0.0, "load scripts/poor_mans_outpainting.py": 0.0, "load scripts/postprocessing_caption.py": 0.0, "load scripts/postprocessing_codeformer.py": 0.0010008811950683594, "load scripts/postprocessing_create_flipped_copies.py": 0.0, "load scripts/postprocessing_focal_crop.py": 0.0010006427764892578, "load scripts/postprocessing_gfpgan.py": 0.001001119613647461, "load scripts/postprocessing_split_oversized.py": 0.0, "load scripts/postprocessing_upscale.py": 0.0, "load scripts/processing_autosized_crop.py": 0.0010008811950683594, "load scripts/prompt_matrix.py": 0.0, "load scripts/prompts_from_file.py": 0.0, "load scripts/sd_upscale.py": 0.0, "load scripts/xyz_grid.py": 0.11260390281677246, "load scripts/kohya_hrfix.py": 0.003002643585205078, "load scripts/forge_sag.py": 0.0010008811950683594, "load scripts/comments.py": 0.7321884632110596, "load scripts/refiner.py": 0.0, "load scripts/seed.py": 0.0010008811950683594, "load scripts": 0.8558022975921631, "load upscalers": 0.0020020008087158203, "refresh VAE": 0.0010006427764892578, "refresh textual inversion templates": 0.0, "scripts list_optimizers": 0.023021697998046875, "scripts list_unets": 0.0, "reload hypernetworks": 0.0, "initialize extra networks": 0.008007049560546875, "scripts before_ui_callback": 0.0, "create ui": 0.24624967575073242, "gradio launch": 0.36649227142333984, "add APIs": 0.014013290405273438, "app_started_callback": 0.0 } }, "Packages": [ "absl-py==2.1.0", "accelerate==0.21.0", "addict==2.4.0", 
"aenum==3.1.15", "aiofiles==23.2.1", "aiohttp==3.9.3", "aiosignal==1.3.1", "albumentations==1.3.1", "altair==5.2.0", "antlr4-python3-runtime==4.9.3", "anyio==3.7.1", "async-timeout==4.0.3", "attrs==23.2.0", "basicsr==1.4.2", "blendmodes==2022", "certifi==2024.2.2", "cffi==1.16.0", "chardet==5.2.0", "charset-normalizer==3.3.2", "clean-fid==0.1.35", "click==8.1.7", "clip==1.0", "colorama==0.4.6", "coloredlogs==15.0.1", "colorlog==6.8.2", "contourpy==1.2.0", "cssselect2==0.7.0", "cycler==0.12.1", "cython==3.0.8", "deprecation==2.1.0", "depth-anything==2024.1.22.0", "diffusers==0.25.0", "easydict==1.11", "einops==0.4.1", "embreex==2.17.7.post4", "exceptiongroup==1.2.0", "facexlib==0.3.0", "fastapi==0.94.0", "ffmpy==0.3.1", "filelock==3.13.1", "filterpy==1.4.5", "flatbuffers==23.5.26", "fonttools==4.47.2", "frozenlist==1.4.1", "fsspec==2024.2.0", "ftfy==6.1.3", "future==0.18.3", "fvcore==0.1.5.post20221221", "gitdb==4.0.11", "gitpython==3.1.32", "gradio-client==0.5.0", "gradio==3.41.2", "grpcio==1.60.1", "h11==0.12.0", "handrefinerportable==2024.2.12.0", "httpcore==0.15.0", "httpx==0.24.1", "huggingface-hub==0.20.3", "humanfriendly==10.0", "idna==3.6", "imageio==2.33.1", "importlib-metadata==7.0.1", "importlib-resources==6.1.1", "inflection==0.5.1", "insightface==0.7.3", "iopath==0.1.9", "jinja2==3.1.3", "joblib==1.3.2", "jsonmerge==1.8.0", "jsonschema-specifications==2023.12.1", "jsonschema==4.21.1", "kiwisolver==1.4.5", "kornia==0.6.7", "lark==1.1.2", "lazy-loader==0.3", "lightning-utilities==0.10.1", "llvmlite==0.42.0", "lmdb==1.4.1", "lxml==5.1.0", "mapbox-earcut==1.0.1", "markdown-it-py==3.0.0", "markdown==3.5.2", "markupsafe==2.1.5", "matplotlib==3.8.2", "mdurl==0.1.2", "mediapipe==0.10.9", "mpmath==1.3.0", "multidict==6.0.5", "networkx==3.2.1", "numba==0.59.0", "numpy==1.26.2", "omegaconf==2.2.3", "onnx==1.15.0", "onnxruntime==1.17.0", "open-clip-torch==2.20.0", "opencv-contrib-python==4.9.0.80", "opencv-python-headless==4.9.0.80", "opencv-python==4.9.0.80", 
"orjson==3.9.13", "packaging==23.2", "pandas==2.2.0", "piexif==1.1.3", "pillow==9.5.0", "pip==23.2.1", "platformdirs==4.2.0", "portalocker==2.8.2", "prettytable==3.9.0", "protobuf==3.20.0", "psutil==5.9.5", "py-cpuinfo==9.0.0", "pycollada==0.8", "pycparser==2.21", "pydantic==1.10.14", "pydub==0.25.1", "pygments==2.17.2", "pyparsing==3.1.1", "pyreadline3==3.4.1", "python-dateutil==2.8.2", "python-multipart==0.0.7", "pytorch-lightning==1.9.4", "pytz==2024.1", "pywavelets==1.5.0", "pywin32==306", "pyyaml==6.0.1", "qudida==0.0.4", "referencing==0.33.0", "regex==2023.12.25", "reportlab==4.0.9", "requests==2.31.0", "resize-right==0.0.2", "rich==13.7.0", "rpds-py==0.17.1", "rtree==1.2.0", "safetensors==0.4.2", "scikit-image==0.21.0", "scikit-learn==1.4.0", "scipy==1.12.0", "seaborn==0.13.2", "semantic-version==2.10.0", "sentencepiece==0.1.99", "setuptools==69.0.3", "shapely==2.0.2", "six==1.16.0", "smmap==5.0.1", "sniffio==1.3.0", "sounddevice==0.4.6", "spandrel==0.1.6", "starlette==0.26.1", "svg.path==6.3", "svglib==1.5.1", "sympy==1.12", "tabulate==0.9.0", "tb-nightly==2.16.0a20240204", "tensorboard-data-server==0.7.2", "termcolor==2.4.0", "tf-keras-nightly==2.16.0.dev2024020410", "thop==0.1.1.post2209072238", "threadpoolctl==3.2.0", "tifffile==2024.1.30", "timm==0.9.12", "tinycss2==1.2.1", "tokenizers==0.13.3", "tomesd==0.1.3", "tomli==2.0.1", "toolz==0.12.1", "torch==2.1.2+cu121", "torchdiffeq==0.2.3", "torchmetrics==1.3.0.post0", "torchsde==0.2.6", "torchvision==0.16.2+cu121", "tqdm==4.66.1", "trampoline==0.1.2", "transformers==4.30.2", "trimesh==4.1.3", "typing-extensions==4.9.0", "tzdata==2023.4", "ultralytics==8.1.10", "urllib3==2.2.0", "uvicorn==0.27.0.post1", "vhacdx==0.0.5", "wcwidth==0.2.13", "webencodings==0.5.1", "websockets==11.0.3", "werkzeug==3.0.1", "wheel==0.42.0", "xxhash==3.4.1", "yacs==0.1.8", "yapf==0.40.2", "yarl==1.9.4", "zipp==3.17.0" ] }

Console logs

Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: f0.0.12-latest-155-gd81e353d
Commit hash: d81e353d8928147bbd973068d0efbb2802affe0f
Launching Web UI with arguments:
Total VRAM 8192 MB, total RAM 32694 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 2080 : native
VAE dtype: torch.float32
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
D:\sd-webui-forge\system\python\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
  rank_zero_deprecation(
Using pytorch cross attention
Loading weights [d8fd60692a] from D:\sd-webui-forge\webui\models\Stable-diffusion\leosamsHelloworldXL_helloworldXL50GPT4V.safetensors
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 9.3s (prepare environment: 2.5s, import torch: 4.7s, initialize shared: 0.2s, other imports: 0.4s, load scripts: 0.9s, create ui: 0.2s, gradio launch: 0.4s).
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
To load target model SDXLClipModel
Begin to load 1 model
Moving model(s) has taken 0.48 seconds
Model loaded in 5.8s (load weights from disk: 0.3s, forge instantiate config: 1.1s, forge load real models: 3.4s, calculate empty prompt: 0.8s).
To load target model SDXL
Begin to load 1 model
Moving model(s) has taken 1.18 seconds
  0%|                                                                                           | 0/20 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:\sd-webui-forge\webui\modules_forge\main_thread.py", line 37, in loop
    task.work()
  File "D:\sd-webui-forge\webui\modules_forge\main_thread.py", line 26, in work
    self.result = self.func(*self.args, **self.kwargs)
  File "D:\sd-webui-forge\webui\modules\txt2img.py", line 111, in txt2img_function
    processed = processing.process_images(p)
  File "D:\sd-webui-forge\webui\modules\processing.py", line 750, in process_images
    res = process_images_inner(p)
  File "D:\sd-webui-forge\webui\modules\processing.py", line 921, in process_images_inner
    samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
  File "D:\sd-webui-forge\webui\modules\processing.py", line 1276, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "D:\sd-webui-forge\webui\modules\sd_samplers_kdiffusion.py", line 251, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "D:\sd-webui-forge\webui\modules\sd_samplers_common.py", line 263, in launch_sampling
    return func()
  File "D:\sd-webui-forge\webui\modules\sd_samplers_kdiffusion.py", line 251, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "D:\sd-webui-forge\system\python\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\sd-webui-forge\webui\repositories\k-diffusion\k_diffusion\sampling.py", line 594, in sample_dpmpp_2m
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "D:\sd-webui-forge\system\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\sd-webui-forge\system\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\sd-webui-forge\webui\modules\sd_samplers_cfg_denoiser.py", line 182, in forward
    denoised = forge_sampler.forge_sample(self, denoiser_params=denoiser_params,
  File "D:\sd-webui-forge\webui\modules_forge\forge_sampler.py", line 82, in forge_sample
    denoised = sampling_function(model, x, timestep, uncond, cond, cond_scale, model_options, seed)
  File "D:\sd-webui-forge\webui\ldm_patched\modules\samplers.py", line 303, in sampling_function
    cfg_result = fn(args)
  File "D:\sd-webui-forge\webui\ldm_patched\contrib\external_sag.py", line 152, in post_cfg_function
    degraded = create_blur_map(uncond_pred, uncond_attn, sag_sigma, sag_threshold)
  File "D:\sd-webui-forge\webui\ldm_patched\contrib\external_sag.py", line 68, in create_blur_map
    mask.reshape(b, *mid_shape)
RuntimeError: shape '[1, 8, 8]' is invalid for input of size 100
shape '[1, 8, 8]' is invalid for input of size 100
*** Error completing request
*** Arguments: ('task(007gsxrd3l42i0x)', <gradio.routes.Request object at 0x00000210A019E770>, '', '', [], 20, 'DPM++ 2M Karras', 1, 1, 7, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], 0, False, '', 0.8, -1, False, -1, 0, 0, 0, True, 3, 1.7, 0, 0.35, True, 'bicubic', 'bicubic', True, 0.5, 2, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
    Traceback (most recent call last):
      File "D:\sd-webui-forge\webui\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
    TypeError: 'NoneType' object is not iterable

---

Additional information

No response

Kanareika commented 4 months ago

Can confirm. In practice, though, you can still adjust the Downscale Factor slightly: down by 0.1 or less, and up by somewhat more.