lllyasviel / stable-diffusion-webui-forge


[Bug]: UI Crashing when running XYZ plot comparing 3 XL checkpoints #202

Closed: lurkerthrowaway closed this issue 3 weeks ago

lurkerthrowaway commented 9 months ago

What happened?

When running an XYZ plot to compare between different XL checkpoints, there is some kind of nondescript failure upon switching to the third checkpoint, and the terminal crashes. This doesn't happen with 1.5 checkpoints, and it doesn't happen with the original Auto1111 either; on the base webUI, the plot finishes generating without issue, even with an extension like adetailer activated.

I have verified that this doesn't happen if I disable third-party extensions on Forge, but given that A1111 doesn't have this problem either and my settings on both are as similar as possible, I am very curious as to what is wrong. This may very well be user error, but if that's the case, I would love to know what to do to resolve the problem.

Steps to reproduce the problem

  1. Select Script: X/Y/Z Plot
  2. Select 3 XL checkpoints for the X axis
  3. (optional; issue occurs with or without a Y axis) Choose whatever for the Y axis, such as a prompt S/R
  4. Watch as the first two checkpoints perform without issue and then, when trying to switch to the third one, the UI hangs with a message that reads `Press any key to continue . . .`
  5. Upon pressing any key, the terminal closes.

What should have happened?

I mean, I'd at least appreciate some kind of error message or whatnot, but ideally, I would like to be able to complete the X/Y/Z plot.
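
For illustration, here is a minimal sketch of the behaviour being asked for, assuming nothing about Forge's internals (`swap_checkpoint` and `load_fn` are hypothetical names): release the outgoing model before loading the next one, and report a CUDA OOM instead of letting the process die.

```python
# Hypothetical sketch, not Forge's actual code: swap checkpoints defensively,
# releasing the outgoing model and reporting a CUDA OOM instead of crashing.
import gc
import torch

def swap_checkpoint(current_model, load_fn, checkpoint_path):
    del current_model          # release our reference; caller is assumed to have dropped its own
    gc.collect()               # collect any reference cycles still holding tensors
    torch.cuda.empty_cache()   # hand the freed VRAM back before the next load
    try:
        return load_fn(checkpoint_path)
    except torch.cuda.OutOfMemoryError as err:
        free, total = torch.cuda.mem_get_info()
        print(f"OOM while loading {checkpoint_path}: "
              f"{free / 2**30:.2f} of {total / 2**30:.2f} GiB free ({err})")
        raise
```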

What browsers do you use to access the UI?

Mozilla Firefox

Sysinfo

sysinfo-2024-02-11-18-53.json

Console logs

Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: f0.0.12-latest-114-g8316773c
Commit hash: 8316773caaa3327ab205ae3b6250bbe07374d247
loading WD14-tagger reqs from E:\webui_forge_cu121_torch21\webui\extensions\stable-diffusion-webui-wd14-tagger\requirements.txt
Checking WD14-tagger requirements.
Launching Web UI with arguments: --ckpt-dir=G:\stable-diffusion-webui/models/Stable-diffusion
Total VRAM 12287 MB, total RAM 16296 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 3060 : native
VAE dtype: torch.bfloat16
2024-02-11 17:58:40.228176: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
WARNING:tensorflow:From E:\webui_forge_cu121_torch21\system\python\lib\site-packages\keras\src\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.

Using pytorch cross attention
ControlNet preprocessor location: E:\webui_forge_cu121_torch21\webui\models\ControlNetPreprocessor
Tag Autocomplete: Could not locate model-keyword extension, Lora trigger word completion will be limited to those added through the extra networks menu.
[-] ADetailer initialized. version: 24.1.2, num models: 9
== WD14 tagger /gpu:0, uname_result(system='Windows', node='Tozé-II', release='10', version='10.0.19045', machine='AMD64') ==
Loading weights [67ab2fd8ec] from G:\stable-diffusion-webui/models/Stable-diffusion\ponyDiffusionV6XL_v6.safetensors
2024-02-11 17:59:21,523 - ControlNet - INFO - ControlNet UI callback registered.
model_type EPS
UNet ADM Dimension 2816
Scanning <DirEntry 'deepdanbooru-v3-20211112-sgd-e28'> as deepdanbooru project
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 99.9s (initial startup: 0.3s, prepare environment: 24.5s, import torch: 17.4s, import gradio: 7.9s, setup paths: 28.5s, initialize shared: 1.1s, other imports: 4.8s, setup gfpgan: 0.3s, list SD models: 0.6s, load scripts: 10.1s, cleanup temp dir: 1.1s, create ui: 3.8s, gradio launch: 0.5s).
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale'}
loaded straight to GPU
To load target model SDXL
Begin to load 1 model
To load target model SDXLClipModel
Begin to load 1 model
Moving model(s) has taken 0.71 seconds
Model loaded in 30.3s (load weights from disk: 0.5s, forge instantiate config: 2.4s, forge load real models: 23.9s, load textual inversion embeddings: 0.2s, calculate empty prompt: 3.2s).
X/Y/Z plot will create 6 images on 1 3x2 grid. (Total steps to process: 150)
100%|██████████████████████████████████████████████████████████████████████████████████| 25/25 [00:20<00:00,  1.24it/s]
To load target model AutoencoderKL
Begin to load 1 model
Moving model(s) has taken 0.11 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 25/25 [00:16<00:00,  1.51it/s]
Loading weights [821aa5537f] from G:\stable-diffusion-webui/models/Stable-diffusion\testing testing\autismmixSDXL_autismmixPony.safetensors
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale'}
loaded straight to GPU
To load target model SDXL
Begin to load 1 model
Moving model(s) has taken 0.15 seconds
To load target model SDXLClipModel
Begin to load 1 model
Moving model(s) has taken 0.78 seconds
Model loaded in 31.6s (unload existing model: 3.8s, load weights from disk: 0.3s, forge instantiate config: 1.0s, forge load real models: 23.3s, forge finalize: 0.4s, load textual inversion embeddings: 0.3s, calculate empty prompt: 2.4s).
100%|██████████████████████████████████████████████████████████████████████████████████| 25/25 [00:17<00:00,  1.42it/s]
To load target model AutoencoderKL
Begin to load 1 model
Moving model(s) has taken 0.11 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 25/25 [00:16<00:00,  1.52it/s]
Loading weights [ac006fdd7e] from G:\stable-diffusion-webui/models/Stable-diffusion\testing testing\autismmixSDXL_autismmixConfetti.safetensors
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
Press any key to continue . . .

Additional information

No response

lurkerthrowaway commented 9 months ago

A small update: I'm not sure what I did differently, but this time I got an actual error before the UI crashed. Again, this is upon trying to switch to the third XL model in a plot.

It seems pretty clear to me that this comes down to a difference in memory management between Forge and base A1111, but I'm frankly a techlet, so if there's a clear way to solve this on my end I'd love some input (and if the issue is on Forge's end, well, let it be known there's an issue).
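
As an aside, the memory figures quoted in the OOM message at the bottom of the traceback below come straight from PyTorch's CUDA caching allocator, and can be reproduced with standard queries:

```python
import torch

# Each query maps onto a figure in the OOM message below:
free, total = torch.cuda.mem_get_info(0)     # "7.37 GiB is free" of "12.00 GiB" total
allocated = torch.cuda.memory_allocated(0)   # "3.37 GiB is allocated by PyTorch"
reserved = torch.cuda.memory_reserved(0)     # reserved - allocated = "reserved but unallocated"
print(f"free {free / 2**30:.2f} GiB, allocated {allocated / 2**30:.2f} GiB, "
      f"reserved {reserved / 2**30:.2f} GiB, total {total / 2**30:.2f} GiB")
```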


generating image for xyz plot: OutOfMemoryError
Traceback (most recent call last):
  File "E:\webui_forge_cu121_torch21\webui\scripts\xyz_grid.py", line 723, in cell
    res = process_images(pc)
  File "E:\webui_forge_cu121_torch21\webui\modules\processing.py", line 743, in process_images
    sd_models.reload_model_weights()
  File "E:\webui_forge_cu121_torch21\webui\modules\sd_models.py", line 628, in reload_model_weights
    return load_model(info)
  File "E:\webui_forge_cu121_torch21\webui\modules\sd_models.py", line 585, in load_model
    sd_model = forge_loader.load_model_for_a1111(timer=timer, checkpoint_info=checkpoint_info, state_dict=state_dict)
  File "E:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\webui_forge_cu121_torch21\webui\modules_forge\forge_loader.py", line 154, in load_model_for_a1111
    forge_objects = load_checkpoint_guess_config(
  File "E:\webui_forge_cu121_torch21\webui\modules_forge\forge_loader.py", line 104, in load_checkpoint_guess_config
    model = model_config.get_model(sd, "model.diffusion_model.", device=inital_load_device)
  File "E:\webui_forge_cu121_torch21\webui\ldm_patched\modules\supported_models.py", line 176, in get_model
    out = model_base.SDXL(self, model_type=self.model_type(state_dict, prefix), device=device)
  File "E:\webui_forge_cu121_torch21\webui\ldm_patched\modules\model_base.py", line 297, in __init__
    super().__init__(model_config, model_type, device=device)
  File "E:\webui_forge_cu121_torch21\webui\ldm_patched\modules\model_base.py", line 55, in __init__
    self.diffusion_model = UNetModel(**unet_config, device=device, operations=operations)
  File "E:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\diffusionmodules\openaimodel.py", line 788, in __init__
    get_attention_layer(
  File "E:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\diffusionmodules\openaimodel.py", line 571, in get_attention_layer
    return SpatialTransformer(
  File "E:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\attention.py", line 593, in __init__
    [BasicTransformerBlock(inner_dim, n_heads, d_head, dropout=dropout, context_dim=context_dim[d],
  File "E:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\attention.py", line 593, in <listcomp>
    [BasicTransformerBlock(inner_dim, n_heads, d_head, dropout=dropout, context_dim=context_dim[d],
  File "E:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\attention.py", line 423, in __init__
    self.ff = FeedForward(inner_dim, dim_out=dim, dropout=dropout, glu=gated_ff, dtype=dtype, device=device, operations=operations)
  File "E:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\attention.py", line 80, in __init__
    ) if not glu else GEGLU(dim, inner_dim, dtype=dtype, device=device, operations=operations)
  File "E:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\attention.py", line 65, in __init__
    self.proj = operations.Linear(dim_in, dim_out * 2, dtype=dtype, device=device)
  File "E:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\linear.py", line 96, in __init__
    self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total capacty of 12.00 GiB of which 7.37 GiB is free. Of the allocated memory 3.37 GiB is allocated by PyTorch, and 180.69 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Loading weights [ac006fdd7e] from G:\stable-diffusion-webui/models/Stable-diffusion\testing testing\autismmixSDXL_autismmixConfetti.safetensors
Press any key to continue . . .
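
As the traceback itself suggests, fragmentation-related OOMs can sometimes be mitigated by capping the allocator's block splitting via `PYTORCH_CUDA_ALLOC_CONF`. On a Windows install this would typically be set in the launcher batch file before Python starts (e.g. `set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512`); as a Python-level sketch, with 512 MiB as an arbitrary example value:

```python
import os

# Must be in the environment before torch creates its CUDA allocator
# (i.e. before the first CUDA allocation); 512 MiB is an arbitrary example.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:512")

import torch  # imported only after the allocator config is in place
```
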
DenOfEquity commented 3 weeks ago

Closing due to age and lack of relevance to Forge2 (error log shows ldm_patched modules, which are no longer part of Forge).