vladmandic / automatic

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0
5.58k stars 408 forks

[Issue]: Custom text encoder with Flux fails #3470

Closed SAC020 closed 1 week ago

SAC020 commented 1 week ago

Issue Description

I am trying to use a custom text encoder with Flux, and it fails with:

RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: struct c10::BFloat16 key.dtype: struct c10::BFloat16 and value.dtype: float instead.

I am running in BF16 with no upcasting.

[screenshot attached]

I tried multiple text encoders; the error is the same each time.
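For reference, the failure mode can be illustrated without a GPU: PyTorch's `scaled_dot_product_attention` requires query, key, and value to share a single dtype. A minimal plain-Python sketch of that guard (`check_sdpa_dtypes` is a hypothetical helper for illustration, not SD.Next or PyTorch code):

```python
def check_sdpa_dtypes(query_dtype: str, key_dtype: str, value_dtype: str) -> None:
    """Mimic the dtype guard enforced by torch's scaled_dot_product_attention:
    all three input tensors must have the same dtype."""
    if not (query_dtype == key_dtype == value_dtype):
        raise RuntimeError(
            "Expected query, key, and value to have the same dtype, "
            f"but got query.dtype: {query_dtype} key.dtype: {key_dtype} "
            f"and value.dtype: {value_dtype} instead."
        )

# A BF16 query/key paired with a float32 value raises exactly this error:
# check_sdpa_dtypes("bfloat16", "bfloat16", "float32")  -> RuntimeError
```

This matches the traceback below: the transformer runs in `bfloat16`, but the embeddings fed in as `value` arrive as `float32`.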

Version Platform Description

SD.Next: app=sd.next updated=2024-10-01 hash=bd6e689b branch=dev ui=dev
Python: 3.11.9, venv="C:\ai\automatic\venv"
Platform: arch=AMD64 system=Windows release=Windows-10-10.0.22631-SP0

Relevant log output

PS C:\ai\automatic> .\webui.bat --debug
Using VENV: C:\ai\automatic\venv
08:48:46-962516 INFO     Starting SD.Next
08:48:46-965989 INFO     Logger: file="C:\ai\automatic\sdnext.log" level=DEBUG size=65 mode=create
08:48:46-967476 INFO     Python: version=3.11.9 platform=Windows bin="C:\ai\automatic\venv\Scripts\python.exe"
                         venv="C:\ai\automatic\venv"
08:48:47-196181 INFO     Version: app=sd.next updated=2024-10-01 hash=bd6e689b branch=dev
                         url=https://github.com/vladmandic/automatic/tree/dev ui=dev
08:48:48-030975 INFO     Repository latest available e7ec07f9783701629ca1411ad82aec87232501b9 2024-09-13T16:51:56Z
08:48:48-044672 INFO     Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 5, GenuineIntel system=Windows
                         release=Windows-10-10.0.22631-SP0 python=3.11.9
08:48:48-046656 DEBUG    Setting environment tuning
08:48:48-047648 DEBUG    Torch allocator: "garbage_collection_threshold:0.80,max_split_size_mb:512"
08:48:48-059056 DEBUG    Torch overrides: cuda=False rocm=False ipex=False diml=False openvino=False zluda=False
08:48:48-070500 INFO     CUDA: nVidia toolkit detected
08:48:48-161340 WARNING  Modified files: ['models/Reference/playgroundai--playground-v2-1024px-aesthetic.jpg']
08:48:48-256076 INFO     Verifying requirements
08:48:48-260044 INFO     Verifying packages
08:48:48-308377 DEBUG    Timestamp repository update time: Wed Oct  2 04:26:21 2024
08:48:48-309369 INFO     Startup: standard
08:48:48-310361 INFO     Verifying submodules
08:48:50-942042 DEBUG    Git detached head detected: folder="extensions-builtin/sd-extension-chainner" reattach=main
08:48:50-943034 DEBUG    Git submodule: extensions-builtin/sd-extension-chainner / main
08:48:51-079970 DEBUG    Git detached head detected: folder="extensions-builtin/sd-extension-system-info" reattach=main
08:48:51-080964 DEBUG    Git submodule: extensions-builtin/sd-extension-system-info / main
08:48:51-219371 DEBUG    Git detached head detected: folder="extensions-builtin/sd-webui-agent-scheduler" reattach=main
08:48:51-220365 DEBUG    Git submodule: extensions-builtin/sd-webui-agent-scheduler / main
08:48:51-410791 DEBUG    Git detached head detected: folder="extensions-builtin/sdnext-modernui" reattach=dev
08:48:51-411783 DEBUG    Git submodule: extensions-builtin/sdnext-modernui / dev
08:48:51-592829 DEBUG    Git detached head detected: folder="extensions-builtin/stable-diffusion-webui-rembg"
                         reattach=master
08:48:51-594317 DEBUG    Git submodule: extensions-builtin/stable-diffusion-webui-rembg / master
08:48:51-736915 DEBUG    Git detached head detected: folder="modules/k-diffusion" reattach=master
08:48:51-738404 DEBUG    Git submodule: modules/k-diffusion / master
08:48:51-875515 DEBUG    Git detached head detected: folder="wiki" reattach=master
08:48:51-876507 DEBUG    Git submodule: wiki / master
08:48:51-955868 DEBUG    Register paths
08:48:52-047130 DEBUG    Installed packages: 217
08:48:52-049145 DEBUG    Extensions all: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info',
                         'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'sdnext-modernui',
                         'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg']
08:48:52-231176 DEBUG    Extension installer: C:\ai\automatic\extensions-builtin\sd-extension-system-info\install.py
08:48:52-428088 DEBUG    Extension installer: C:\ai\automatic\extensions-builtin\sd-webui-agent-scheduler\install.py
08:48:55-111079 DEBUG    Extension installer: C:\ai\automatic\extensions-builtin\sd-webui-controlnet\install.py
08:49:03-785916 DEBUG    Extension installer:
                         C:\ai\automatic\extensions-builtin\stable-diffusion-webui-images-browser\install.py
08:49:06-895450 DEBUG    Extension installer: C:\ai\automatic\extensions-builtin\stable-diffusion-webui-rembg\install.py
08:49:16-644546 DEBUG    Extensions all: []
08:49:16-645539 INFO     Extensions enabled: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info',
                         'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'sdnext-modernui',
                         'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg']
08:49:16-647026 INFO     Verifying requirements
08:49:16-648018 DEBUG    Setup complete without errors: 1727848157
08:49:16-654962 DEBUG    Extension preload: {'extensions-builtin': 0.0, 'extensions': 0.0}
08:49:16-656450 DEBUG    Starting module: <module 'webui' from 'C:\\ai\\automatic\\webui.py'>
08:49:16-657938 INFO     Command line args: ['--debug'] debug=True
08:49:16-658434 DEBUG    Env flags: []
08:49:23-464806 INFO     System packages: {'torch': '2.4.0+cu124', 'diffusers': '0.31.0.dev0', 'gradio': '3.43.2',
                         'transformers': '4.45.1', 'accelerate': '0.33.0'}
08:49:24-021256 DEBUG    Huggingface cache: folder="C:\Users\sebas\.cache\huggingface\hub"
08:49:24-135880 INFO     Device detect: memory=24.0 optimization=none
08:49:24-137864 DEBUG    Read: file="config.json" json=37 bytes=1566 time=0.000
08:49:24-139353 DEBUG    Unknown settings: ['cross_attention_options']
08:49:24-141336 INFO     Engine: backend=Backend.DIFFUSERS compute=None device=cuda attention="Scaled-Dot-Product"
                         mode=no_grad
08:49:24-142824 DEBUG    Read: file="html\reference.json" json=52 bytes=29118 time=0.000
08:49:24-194432 INFO     Torch parameters: backend=cuda device=cuda config=BF16 dtype=torch.bfloat16 vae=torch.bfloat16
                         unet=torch.bfloat16 context=no_grad nohalf=False nohalfvae=False upscast=False
                         deterministic=False test-fp16=True test-bf16=True optimization="Scaled-Dot-Product"
08:49:24-479761 DEBUG    ONNX: version=1.19.2 provider=CPUExecutionProvider, available=['AzureExecutionProvider',
                         'CPUExecutionProvider']
08:49:24-642178 INFO     Device: device=NVIDIA GeForce RTX 4090 n=1 arch=sm_90 capability=(8, 9) cuda=12.4 cudnn=90100
                         driver=561.09
08:49:24-728978 DEBUG    Importing LDM
08:49:24-749810 DEBUG    Entering start sequence
08:49:24-752290 DEBUG    Initializing
08:49:24-782050 INFO     Available VAEs: path="models\VAE" items=0
08:49:24-783538 INFO     Available UNets: path="models\UNET" items=0
08:49:24-785026 INFO     Available TEs: path="models\Text-encoder" items=4
08:49:24-787506 INFO     Disabled extensions: ['sd-webui-controlnet', 'sdnext-modernui']
08:49:24-789490 DEBUG    Read: file="cache.json" json=2 bytes=11134 time=0.000
08:49:24-798914 DEBUG    Read: file="metadata.json" json=708 bytes=2521081 time=0.008
08:49:24-803874 DEBUG    Scanning diffusers cache: folder="models\Diffusers" items=3 time=0.00
08:49:24-805363 INFO     Available Models: path="models\Stable-diffusion" items=15 time=0.02
08:49:24-863394 DEBUG    Load extensions
08:49:24-935810 INFO     Extension: script='extensions-builtin\Lora\scripts\lora_script.py'
                         [2;36m08:49:24-932834[0m[2;36m [0m[34mINFO    [0m Available LoRAs: [33mitems[0m=[1;36m146[0m
                         [33mfolders[0m=[1;36m4[0m
08:49:25-330586 INFO     Extension: script='extensions-builtin\sd-webui-agent-scheduler\scripts\task_scheduler.py' Using
                         sqlite file: extensions-builtin\sd-webui-agent-scheduler\task_scheduler.sqlite3
08:49:25-535358 DEBUG    Extensions init time: 0.67 sd-webui-agent-scheduler=0.35
                         stable-diffusion-webui-images-browser=0.20
08:49:25-548254 DEBUG    Read: file="html/upscalers.json" json=4 bytes=2672 time=0.000
08:49:25-549744 DEBUG    Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719 time=0.000
08:49:25-552849 DEBUG    chaiNNer models: path="models\chaiNNer" defined=24 discovered=0 downloaded=8
08:49:25-554834 DEBUG    Upscaler type=ESRGAN folder="models\ESRGAN" model="1x-ITF-SkinDiffDetail-Lite-v1"
                         path="models\ESRGAN\1x-ITF-SkinDiffDetail-Lite-v1.pth"
08:49:25-556322 DEBUG    Upscaler type=ESRGAN folder="models\ESRGAN" model="4xNMKDSuperscale_4xNMKDSuperscale"
                         path="models\ESRGAN\4xNMKDSuperscale_4xNMKDSuperscale.pth"
08:49:25-557314 DEBUG    Upscaler type=ESRGAN folder="models\ESRGAN" model="4x_NMKD-Siax_200k"
                         path="models\ESRGAN\4x_NMKD-Siax_200k.pth"
08:49:25-560290 INFO     Available Upscalers: items=56 downloaded=11 user=3 time=0.02 types=['None', 'Lanczos',
                         'Nearest', 'ChaiNNer', 'AuraSR', 'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
08:49:25-577426 INFO     Available Styles: folder="models\styles" items=288 time=0.02
08:49:25-580430 DEBUG    Creating UI
08:49:25-581890 DEBUG    UI themes available: type=Standard themes=12
08:49:25-582882 INFO     UI theme: type=Standard name="black-teal"
08:49:25-590322 DEBUG    UI theme: css="C:\ai\automatic\javascript\black-teal.css" base="sdnext.css" user="None"
08:49:25-592802 DEBUG    UI initialize: txt2img
08:49:25-675333 DEBUG    Networks: page='model' items=66 subfolders=2 tab=txt2img folders=['models\\Stable-diffusion',
                         'models\\Diffusers', 'models\\Reference'] list=0.06 thumb=0.01 desc=0.01 info=0.00 workers=8
                         sort=Default
08:49:25-691205 DEBUG    Networks: page='lora' items=146 subfolders=0 tab=txt2img folders=['models\\Lora',
                         'models\\LyCORIS'] list=0.06 thumb=0.02 desc=0.06 info=0.03 workers=8 sort=Default
08:49:25-722454 DEBUG    Networks: page='style' items=288 subfolders=1 tab=txt2img folders=['models\\styles', 'html']
                         list=0.06 thumb=0.00 desc=0.00 info=0.00 workers=8 sort=Default
08:49:25-727910 DEBUG    Networks: page='embedding' items=13 subfolders=0 tab=txt2img folders=['models\\embeddings']
                         list=0.04 thumb=0.02 desc=0.01 info=0.00 workers=8 sort=Default
08:49:25-729893 DEBUG    Networks: page='vae' items=0 subfolders=0 tab=txt2img folders=['models\\VAE'] list=0.00
                         thumb=0.00 desc=0.00 info=0.00 workers=8 sort=Default
08:49:25-984838 DEBUG    UI initialize: img2img
08:49:26-097858 DEBUG    UI initialize: control models=models\control
08:49:26-426245 DEBUG    Read: file="ui-config.json" json=0 bytes=2 time=0.000
08:49:26-537569 DEBUG    UI themes available: type=Standard themes=12
08:49:27-129792 DEBUG    Reading failed: C:\ai\automatic\html\extensions.json [Errno 2] No such file or directory:
                         'C:\\ai\\automatic\\html\\extensions.json'
08:49:27-131526 INFO     Extension list is empty: refresh required
08:49:27-799541 DEBUG    Extension list: processed=8 installed=8 enabled=6 disabled=2 visible=8 hidden=0
08:49:28-153684 DEBUG    Root paths: ['C:\\ai\\automatic']
08:49:28-232577 INFO     Local URL: http://127.0.0.1:7860/
08:49:28-233988 DEBUG    Gradio functions: registered=2453
08:49:28-235475 DEBUG    FastAPI middleware: ['Middleware', 'Middleware']
08:49:28-238451 DEBUG    Creating API
08:49:28-415551 INFO     [AgentScheduler] Task queue is empty
08:49:28-417011 INFO     [AgentScheduler] Registering APIs
08:49:28-869362 DEBUG    Scripts setup: ['IP Adapters:0.026', 'XYZ Grid:0.014', 'Face:0.013', 'AnimateDiff:0.007',
                         'CogVideoX:0.008', 'Ctrl-X:0.006', 'LUT Color grading:0.006', 'X/Y/Z Grid:0.013',
                         'Image-to-Video:0.007', 'Stable Video Diffusion:0.005']
08:49:28-870851 DEBUG    Model metadata: file="metadata.json" no changes
08:49:28-872863 DEBUG    Model requested: fn=<lambda>
08:49:28-873828 INFO     Load model: select="Diffusers\sayakpaul/flux.1-dev-nf4 [b054bc66ae]"
08:49:28-875316 DEBUG    Load model:
                         target="models\Diffusers\models--sayakpaul--flux.1-dev-nf4\snapshots\b054bc66ae1097b811848c3739
                         ecd673a864bda1" existing=False info=None
08:49:28-876804 DEBUG    Load model:
                         path="models\Diffusers\models--sayakpaul--flux.1-dev-nf4\snapshots\b054bc66ae1097b811848c3739ec
                         d673a864bda1"
08:49:28-877795 INFO     Autodetect model: detect="FLUX" class=FluxPipeline
                         file="models\Diffusers\models--sayakpaul--flux.1-dev-nf4\snapshots\b054bc66ae1097b811848c3739ec
                         d673a864bda1" size=0MB
08:49:28-879808 DEBUG    Load model: type=FLUX model="Diffusers\sayakpaul/flux.1-dev-nf4"
                         repo="sayakpaul/flux.1-dev-nf4" unet="None" t5="None" vae="None" quant=nf4 offload=model
                         dtype=torch.bfloat16
08:49:28-880800 DEBUG    HF login: no token provided
08:49:31-855655 DEBUG    GC: utilization={'gpu': 34, 'ram': 11, 'threshold': 80} gc={'collected': 2455, 'saved': 0.0}
                         before={'gpu': 8.12, 'ram': 7.27} after={'gpu': 8.12, 'ram': 7.27, 'retries': 0, 'oom': 0}
                         device=cuda fn=load_flux_nf4 time=0.21
08:49:32-407206 DEBUG    Load model: type=FLUX preloaded=['transformer']
Diffusers  4.51it/s █████████████ 100% 2/2 00:00 00:00 Loading checkpoint shards
Diffusers  6.90it/s ████████ 100% 7/7 00:01 00:00 Loading pipeline components...
08:49:33-771261 INFO     Load network: type=embeddings loaded=0 skipped=13 time=0.01
08:49:33-772749 DEBUG    Setting model: component=VAE slicing=True
08:49:33-773742 DEBUG    Setting model: component=VAE tiling=True
08:49:33-775229 DEBUG    Setting model: attention="Scaled-Dot-Product"
08:49:33-790110 DEBUG    Setting model: offload=model
08:49:35-936398 DEBUG    GC: utilization={'gpu': 7, 'ram': 11, 'threshold': 80} gc={'collected': 231, 'saved': 0.0}
                         before={'gpu': 1.64, 'ram': 7.33} after={'gpu': 1.64, 'ram': 7.33, 'retries': 0, 'oom': 0}
                         device=cuda fn=load_diffuser time=0.21
08:49:35-943159 INFO     Load model: time=6.85 load=4.88 move=1.94 native=1024 memory={'ram': {'used': 7.33, 'total':
                         63.92}, 'gpu': {'used': 1.64, 'total': 23.99}, 'retries': 0, 'oom': 0}
08:49:35-945611 DEBUG    Script callback init time: image_browser.py:ui_tabs=0.44 system-info.py:app_started=0.07
                         task_scheduler.py:app_started=0.47
08:49:35-947100 INFO     Startup time: 19.28 torch=4.55 gradio=1.31 diffusers=0.45 libraries=1.76 extensions=0.67
                         face-restore=0.06 ui-networks=0.41 ui-txt2img=0.08 ui-img2img=0.08 ui-control=0.15
                         ui-settings=0.25 ui-extensions=1.17 ui-defaults=0.27 launch=0.14 api=0.09 app-started=0.54
                         checkpoint=7.08
08:49:35-949083 DEBUG    Save: file="config.json" json=37 bytes=1517 time=0.004
08:49:35-952060 DEBUG    Unused settings: ['cross_attention_options']
08:49:59-957977 DEBUG    Server: alive=True jobs=1 requests=7 uptime=36 memory=7.33/63.92 backend=Backend.DIFFUSERS
                         state=idle
08:50:03-047256 WARNING  Ctrl-X: pipeline=FluxPipeline required=StableDiffusionXLPipeline
08:50:03-049736 INFO     XYZ grid start: images=3 grid=1 shape=3x1 cells=1 steps=60
08:50:03-052216 DEBUG    Load module: type=t5 path="Long-ViT-L-14-GmP-ft" module="text_encoder_2"
08:50:03-053704 DEBUG    HF login: no token provided
08:50:34-382146 DEBUG    XYZ grid apply text-encoder: "Long-ViT-L-14-GmP-ft"
08:50:34-388099 INFO     Base: class=FluxPipeline
08:50:34-389587 DEBUG    Sampler default FlowMatchEulerDiscreteScheduler: {'num_train_timesteps': 1000, 'shift': 3.0,
                         'use_dynamic_shifting': True, 'base_shift': 0.5, 'max_shift': 1.15, 'base_image_seq_len': 256,
                         'max_image_seq_len': 4096}
08:50:34-412404 DEBUG    Torch generator: device=cuda seeds=[3852281433]
08:50:34-413485 DEBUG    Diffuser pipeline: FluxPipeline task=DiffusersTaskType.TEXT_2_IMAGE batch=1/1x1 set={'prompt':
                         1, 'guidance_scale': 6, 'num_inference_steps': 20, 'output_type': 'latent', 'width': 1024,
                         'height': 1024, 'parser': 'Fixed attention'}
Progress ?it/s                                              0% 0/20 00:06 ? Base
08:50:45-418002 ERROR    Processing: args={'prompt': ["This is a photograph"], 'guidance_scale': 6, 'generator': [<torch._C.Generator
                         object at 0x0000024121BA21B0>], 'callback_on_step_end': <function diffusers_callback at
                         0x0000023F8960EC00>, 'callback_on_step_end_tensor_inputs': ['latents'], 'num_inference_steps':
                         20, 'output_type': 'latent', 'width': 1024, 'height': 1024} Expected query, key, and value to
                         have the same dtype, but got query.dtype: struct c10::BFloat16 key.dtype: struct c10::BFloat16
                         and value.dtype: float instead.
08:50:45-421474 ERROR    Processing: RuntimeError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\processing_diffusers.py:95 in process_base                                                   │
│                                                                                                                      │
│    94 │   │   else:                                                                                                  │
│ ❱  95 │   │   │   output = shared.sd_model(**base_args)                                                              │
│    96 │   │   if isinstance(output, dict):                                                                           │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\torch\utils\_contextlib.py:116 in decorate_context                            │
│                                                                                                                      │
│   115 │   │   with ctx_factory():                                                                                    │
│ ❱ 116 │   │   │   return func(*args, **kwargs)                                                                       │
│   117                                                                                                                │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\diffusers\pipelines\flux\pipeline_flux.py:719 in __call__                     │
│                                                                                                                      │
│   718 │   │   │   │                                                                                                  │
│ ❱ 719 │   │   │   │   noise_pred = self.transformer(                                                                 │
│   720 │   │   │   │   │   hidden_states=latents,                                                                     │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1553 in _wrapped_call_impl                         │
│                                                                                                                      │
│   1552 │   │   else:                                                                                                 │
│ ❱ 1553 │   │   │   return self._call_impl(*args, **kwargs)                                                           │
│   1554                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1562 in _call_impl                                 │
│                                                                                                                      │
│   1561 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                       │
│ ❱ 1562 │   │   │   return forward_call(*args, **kwargs)                                                              │
│   1563                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\accelerate\hooks.py:169 in new_forward                                        │
│                                                                                                                      │
│   168 │   │   else:                                                                                                  │
│ ❱ 169 │   │   │   output = module._old_forward(*args, **kwargs)                                                      │
│   170 │   │   return module._hf_hook.post_forward(module, output)                                                    │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\diffusers\models\transformers\transformer_flux.py:495 in forward              │
│                                                                                                                      │
│   494 │   │   │   else:                                                                                              │
│ ❱ 495 │   │   │   │   encoder_hidden_states, hidden_states = block(                                                  │
│   496 │   │   │   │   │   hidden_states=hidden_states,                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1553 in _wrapped_call_impl                         │
│                                                                                                                      │
│   1552 │   │   else:                                                                                                 │
│ ❱ 1553 │   │   │   return self._call_impl(*args, **kwargs)                                                           │
│   1554                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1562 in _call_impl                                 │
│                                                                                                                      │
│   1561 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                       │
│ ❱ 1562 │   │   │   return forward_call(*args, **kwargs)                                                              │
│   1563                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\diffusers\models\transformers\transformer_flux.py:172 in forward              │
│                                                                                                                      │
│   171 │   │   # Attention.                                                                                           │
│ ❱ 172 │   │   attn_output, context_attn_output = self.attn(                                                          │
│   173 │   │   │   hidden_states=norm_hidden_states,                                                                  │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1553 in _wrapped_call_impl                         │
│                                                                                                                      │
│   1552 │   │   else:                                                                                                 │
│ ❱ 1553 │   │   │   return self._call_impl(*args, **kwargs)                                                           │
│   1554                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1562 in _call_impl                                 │
│                                                                                                                      │
│   1561 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                       │
│ ❱ 1562 │   │   │   return forward_call(*args, **kwargs)                                                              │
│   1563                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\diffusers\models\attention_processor.py:490 in forward                        │
│                                                                                                                      │
│    489 │   │                                                                                                         │
│ ❱  490 │   │   return self.processor(                                                                                │
│    491 │   │   │   self,                                                                                             │
│                                                                                                                      │
│ C:\ai\automatic\venv\Lib\site-packages\diffusers\models\attention_processor.py:1765 in __call__                      │
│                                                                                                                      │
│   1764 │   │                                                                                                         │
│ ❱ 1765 │   │   hidden_states = F.scaled_dot_product_attention(query, key, value, dropout_p=0.0, is_causal=False)     │
│   1766 │   │   hidden_states = hidden_states.transpose(1, 2).reshape(batch_size, -1, attn.heads * head_dim)          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: struct c10::BFloat16 key.dtype: struct c10::BFloat16 and value.dtype: float instead.
08:50:46-453169 DEBUG    GC: utilization={'gpu': 91, 'ram': 41, 'threshold': 80} gc={'collected': 139, 'saved': 13.36}
                         before={'gpu': 21.76, 'ram': 25.89} after={'gpu': 8.4, 'ram': 25.89, 'retries': 0, 'oom': 0}
                         device=cuda fn=nextjob time=0.31
08:50:46-465570 INFO     Processed: images=0 its=0.00 time=12.07 timers={'gc': 0.31, 'args': 0.02, 'process': 12.04}
                         memory={'ram': {'used': 25.89, 'total': 63.92}, 'gpu': {'used': 8.4, 'total': 23.99},
                         'retries': 0, 'oom': 0}
08:50:46-681825 INFO     XYZ grid complete: images=3 size=(3072, 1298) time=43.42 save=0.21
08:50:46-683810 DEBUG    Load module: type=t5 path="None" module="text_encoder_2"
08:50:46-684801 DEBUG    HF login: no token provided
08:50:46-685297 DEBUG    Process interrupted: 1/1
08:50:46-686786 INFO     Processed: images=0 its=0.00 time=0.00 timers={'gc': 0.31, 'args': 0.02, 'process': 12.04,
                         'post': 0.23} memory={'ram': {'used': 25.91, 'total': 63.92}, 'gpu': {'used': 8.4, 'total':
                         23.99}, 'retries': 0, 'oom': 0}

Backend

Diffusers

UI

Standard

Branch

Dev

Model

Other

Acknowledgements

vladmandic commented 1 week ago

A general rule when creating any type of issue is to simplify as much as possible. I really don't see why you have XYZ grid or Ctrl-X enabled here at all; reproduce using the absolute simplest params. Yes, the result is expected to be the same, but I need to read that log, and I really don't want to waste time going through things which are likely irrelevant and just confuse matters.

Also note exactly which Flux variant and which text encoder you're trying to load. Again, yes, I can find that in the log, but if you want help with an issue, help us by providing all relevant information up front.

SAC020 commented 1 week ago

OK, I agree and point taken; I will keep that in mind next time.

Do you want me to post a simplified log, or is the problem already clear?

vladmandic commented 1 week ago

Clear at the moment: the order of preference for loading was incorrect, so the ViT-L model was being loaded as a T5. I've pushed an update, but I cannot test it at the moment as my system is busy with training.
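In other words, SD.Next tried the custom encoder in the T5 slot (`text_encoder_2`) first, so the CLIP ViT-L checkpoint ended up serving as the T5 and produced float32 embeddings alongside the BF16 transformer. A hedged sketch of the kind of dispatch the fix implies (the names here are illustrative, not the actual SD.Next code):

```python
# Illustrative only: map a custom text encoder to the correct FLUX module
# slot based on its architecture, instead of defaulting to the T5 slot.
SLOT_BY_ARCH = {
    "clip-vit-l": "text_encoder",    # FluxPipeline's first encoder is CLIP ViT-L
    "t5-xxl": "text_encoder_2",      # the second encoder is T5-XXL
}

def resolve_encoder_slot(arch: str) -> str:
    """Return the pipeline attribute a custom text encoder should replace."""
    try:
        return SLOT_BY_ARCH[arch]
    except KeyError:
        raise ValueError(f"unsupported text-encoder architecture: {arch}") from None
```

With the broken ordering, everything was effectively routed to `text_encoder_2`, which is how `Long-ViT-L-14-GmP-ft` (a CLIP ViT-L fine-tune) ended up queried as a T5 in the log above (`Load module: type=t5 ... module="text_encoder_2"`).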

SAC020 commented 1 week ago

Thank you, it works now.