[Issue]: flux.1-dev-qint8 not working on windows

Issue Description

Hi. I have a 3090 with 24 GB of VRAM, so I didn't bother with the shared fix. I updated SD.Next from the web UI (Nov 2 update). Generation "works" on my other models (I have a lot of them), but FLUX.1-dev-qint8 fails with the ValueError shown in the log below.

Version Platform Description

sd.next Version: app=sd.next updated=2024-11-02 hash=65ddc611 branch=master
Windows 11 23H2
Browser session: user=None client=127.0.0.1 agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:131.0) Gecko/20100101 Firefox/131.0
Python 3.11

Relevant log output

Using VENV: D:\opt\automatic\venv
10:57:05-561963 INFO     Starting SD.Next
10:57:05-564960 INFO     Logger: file="D:\opt\automatic\sdnext.log" level=DEBUG size=65 mode=create
10:57:05-566962 INFO     Python: version=3.11.9 platform=Windows bin="D:\opt\automatic\venv\Scripts\python.exe"
                         venv="D:\opt\automatic\venv"
10:57:05-752487 INFO     Version: app=sd.next updated=2024-11-02 hash=65ddc611 branch=master
                         url=https://github.com/vladmandic/automatic/tree/master ui=main
10:57:12-513678 INFO     Platform: arch=AMD64 cpu=AMD64 Family 25 Model 33 Stepping 0, AuthenticAMD system=Windows
                         release=Windows-10-10.0.22631-SP0 python=3.11.9
10:57:12-514677 INFO     Args: ['--debug', '--use-xformers', '--use-cuda']
10:57:12-516678 DEBUG    Setting environment tuning
10:57:12-517680 DEBUG    Torch allocator: "garbage_collection_threshold:0.80,max_split_size_mb:512"
10:57:12-528678 DEBUG    Torch overrides: cuda=True rocm=False ipex=False directml=False openvino=False zluda=False
10:57:12-531678 INFO     CUDA: nVidia toolkit detected
10:57:15-716283 INFO     Verifying requirements
10:57:15-720283 INFO     Verifying packages
10:57:15-763801 DEBUG    Timestamp repository update time: Sat Nov  2 08:53:12 2024
10:57:15-764802 INFO     Startup: standard
10:57:15-765800 INFO     Verifying submodules
10:57:17-999215 DEBUG    Git submodule: extensions-builtin/sd-extension-chainner / main
10:57:18-074738 DEBUG    Git submodule: extensions-builtin/sd-extension-system-info / main
10:57:18-145739 DEBUG    Git submodule: extensions-builtin/sd-webui-agent-scheduler / main
10:57:18-259269 DEBUG    Git detached head detected: folder="extensions-builtin/sdnext-modernui" reattach=main
10:57:18-260774 DEBUG    Git submodule: extensions-builtin/sdnext-modernui / main
10:57:18-334782 DEBUG    Git submodule: extensions-builtin/stable-diffusion-webui-rembg / master
10:57:18-410301 DEBUG    Git submodule: modules/k-diffusion / master
10:57:18-518819 DEBUG    Git detached head detected: folder="wiki" reattach=master
10:57:18-520822 DEBUG    Git submodule: wiki / master
10:57:18-565340 DEBUG    Register paths
10:57:18-666858 DEBUG    Installed packages: 185
10:57:18-668863 DEBUG    Extensions all: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info',
                         'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg']
10:57:18-967416 DEBUG    Extension installer: D:\opt\automatic\extensions-builtin\sd-webui-agent-scheduler\install.py
10:57:20-937267 DEBUG    Extension installer:
                         D:\opt\automatic\extensions-builtin\stable-diffusion-webui-rembg\install.py
10:57:26-582849 DEBUG    Extensions all: []
10:57:26-584847 INFO     Extensions enabled: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info',
                         'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg']
10:57:26-585847 INFO     Verifying requirements
10:57:26-586847 DEBUG    Setup complete without errors: 1730563047
10:57:26-590847 DEBUG    Extension preload: {'extensions-builtin': 0.0, 'extensions': 0.0}
10:57:26-592847 INFO     Command line args: ['--debug', '--use-xformers', '--use-cuda'] use_cuda=True use_xformers=True
                         debug=True
10:57:26-593848 DEBUG    Env flags: []
10:57:26-594847 DEBUG    Starting module: <module 'webui' from 'D:\\opt\\automatic\\webui.py'>
10:57:30-846669 INFO     System packages: {'torch': '2.5.1+cu124', 'diffusers': '0.32.0.dev0', 'gradio': '3.43.2',
                         'transformers': '4.46.1', 'accelerate': '1.0.1'}
10:57:31-493294 DEBUG    Huggingface cache: folder="C:\Users\syllable\.cache\huggingface\hub"
10:57:31-617808 INFO     Device detect: memory=24.0 optimization=none
10:57:31-619811 DEBUG    Read: file="config.json" json=36 bytes=1585 time=0.000
10:57:31-621809 INFO     Engine: backend=Backend.DIFFUSERS compute=cuda device=cuda attention="xFormers" mode=no_grad
10:57:31-623810 DEBUG    Read: file="html\reference.json" json=59 bytes=31585 time=0.000
10:57:31-683325 INFO     Torch parameters: backend=cuda device=cuda config=Auto dtype=torch.bfloat16 vae=torch.bfloat16
                         unet=torch.bfloat16 context=no_grad nohalf=False nohalfvae=False upscast=False
                         deterministic=False test-fp16=True test-bf16=True optimization="xFormers"
10:57:32-037882 DEBUG    ONNX: version=1.20.0 provider=CUDAExecutionProvider, available=['AzureExecutionProvider',
                         'CPUExecutionProvider']
10:57:32-147404 INFO     Device: device=NVIDIA GeForce RTX 3090 n=1 arch=sm_90 capability=(8, 6) cuda=12.4 cudnn=90100
                         driver=552.22
10:57:32-272442 DEBUG    Importing LDM
10:57:32-286443 DEBUG    Entering start sequence
10:57:32-290442 DEBUG    Initializing
10:57:32-323443 INFO     Available VAEs: path="models\VAE" items=0
10:57:32-324442 INFO     Available UNets: path="models\UNET" items=0
10:57:32-326443 INFO     Available TEs: path="models\Text-encoder" items=0
10:57:32-328444 INFO     Disabled extensions: []
10:57:32-331443 DEBUG    Read: file="cache.json" json=2 bytes=3613 time=0.000
10:57:32-333443 DEBUG    Read: file="metadata.json" json=69 bytes=129820 time=0.000
10:57:32-342443 DEBUG    Scanning diffusers cache: folder="models\Diffusers" items=1 time=0.00
10:57:32-343445 INFO     Available Models: path="models\Stable-diffusion" items=22 time=0.01
10:57:32-424961 INFO     Available Yolo: path="models\yolo" items=6 downloaded=1
10:57:32-426963 DEBUG    Load extensions
10:57:32-504482 INFO     Available LoRAs: path="models\Lora" items=48 folders=3 time=0.01
10:57:33-077634 INFO     Extension: script='extensions-builtin\sd-webui-agent-scheduler\scripts\task_scheduler.py' Using
                         sqlite file: extensions-builtin\sd-webui-agent-scheduler\task_scheduler.sqlite3
10:57:33-092630 DEBUG    Extensions init time: 0.66 Lora=0.20 sd-webui-agent-scheduler=0.35
10:57:33-111633 DEBUG    Read: file="html/upscalers.json" json=4 bytes=2672 time=0.001
10:57:33-113631 DEBUG    Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719 time=0.000
10:57:33-115630 DEBUG    chaiNNer models: path="models\chaiNNer" defined=24 discovered=0 downloaded=0
10:57:33-121631 INFO     Available Upscalers: items=53 downloaded=0 user=0 time=0.03 types=['None', 'Lanczos',
                         'Nearest', 'ChaiNNer', 'AuraSR', 'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
10:57:33-148635 INFO     Available Styles: folder="models\styles" items=288 time=0.03
10:57:33-152643 DEBUG    Creating UI
10:57:33-154642 DEBUG    UI themes available: type=Modern themes=32
10:57:33-155641 ERROR    UI theme invalid: type=Modern theme="black-teal" available=['Aptro-AmberGlow',
                         'BrknSoul-Amstrad', 'CasanovaSan-CassyTheme', 'Default', 'Eoan-Alpha', 'Eoan-Cyan',
                         'Eoan-Green', 'Eoan-Light', 'Eoan-Minimal', 'Eoan-Moonlight', 'Eoan-Orange',
                         'Eoan-OrangeMinimal', 'Eoan-OrangeMoonlight', 'Eoan-Quad', 'Eoan-Subdued', 'Eoan-Tron',
                         'Eoan-Yellow', 'IlluZn-Domination', 'IlluZn-LavenderZest', 'IlluZn-Superuser',
                         'IlluZn-VintageBeige', 'IlluZn-WhisperingSlate', 'QS-Candy', 'QS-CupOfTea', 'QS-Midnight',
                         'QS-Sunset', 'QS-SweetClouds', 'QS-Teal', 'QS-TwilightSands', 'QS-WhiteRabbit', 'Vlad-Default',
                         'Vlad-Flat']
10:57:33-157641 INFO     UI theme: type=Modern name="Default"
10:57:33-171157 DEBUG    UI theme: css="extensions-builtin\sdnext-modernui\themes\Default.css" base="base.css"
                         user="None"
10:57:33-178158 DEBUG    UI initialize: txt2img
10:57:33-462200 DEBUG    Networks: page='model' items=80 subfolders=2 tab=txt2img folders=['models\\Stable-diffusion',
                         'models\\Diffusers', 'models\\Reference'] list=0.27 thumb=0.01 desc=0.03 info=1.33 workers=8
                         sort=Default
10:57:33-466717 DEBUG    Networks: page='lora' items=48 subfolders=0 tab=txt2img folders=['models\\Lora',
                         'models\\LyCORIS'] list=0.26 thumb=0.01 desc=0.04 info=0.00 workers=8 sort=Default
10:57:33-475717 DEBUG    Networks: page='style' items=288 subfolders=1 tab=txt2img folders=['models\\styles', 'html']
                         list=0.27 thumb=0.00 desc=0.00 info=0.00 workers=8 sort=Default
10:57:33-479718 DEBUG    Networks: page='embedding' items=0 subfolders=0 tab=txt2img folders=['models\\embeddings']
                         list=0.00 thumb=0.00 desc=0.00 info=0.00 workers=8 sort=Default
10:57:33-481717 DEBUG    Networks: page='vae' items=0 subfolders=0 tab=txt2img folders=['models\\VAE'] list=0.00
                         thumb=0.00 desc=0.00 info=0.00 workers=8 sort=Default
10:57:33-483718 DEBUG    Networks: page='history' items=0 subfolders=0 tab=txt2img folders=[] list=0.00 thumb=0.00
                         desc=0.00 info=0.00 workers=8 sort=Default
10:57:33-699764 DEBUG    UI initialize: img2img
10:57:33-819292 DEBUG    UI initialize: control models=models\control
10:57:34-134856 DEBUG    Reading failed: ui-config.json [Errno 2] No such file or directory: 'ui-config.json'
10:57:34-262391 DEBUG    UI themes available: type=Modern themes=32
10:57:34-405435 DEBUG    Reading failed: D:\opt\automatic\html\extensions.json [Errno 2] No such file or directory:
                         'D:\\opt\\automatic\\html\\extensions.json'
10:57:34-407434 INFO     Extension list is empty: refresh required
10:57:34-900046 DEBUG    Extension list: processed=6 installed=6 enabled=6 disabled=0 visible=6 hidden=0
10:57:35-204605 DEBUG    Save: file="ui-config.json" json=0 bytes=2 time=0.001
10:57:35-251610 DEBUG    Root paths: ['D:\\opt\\automatic']
10:57:35-329472 INFO     Local URL: http://127.0.0.1:7860/
10:57:35-330472 DEBUG    Gradio functions: registered=1957
10:57:35-332469 DEBUG    FastAPI middleware: ['Middleware', 'Middleware']
10:57:35-335469 DEBUG    Creating API
10:57:35-491515 INFO     [AgentScheduler] Task queue is empty
10:57:35-492515 INFO     [AgentScheduler] Registering APIs
10:57:35-728553 DEBUG    Scripts setup: ['IP Adapters:0.026', 'XYZ Grid:0.026', 'Face:0.013', 'AnimateDiff:0.008',
                         'CogVideoX:0.007', 'Ctrl-X:0.007', 'K-Diffusion:0.108', 'LUT Color grading:0.006', 'Prompt
                         enhance:0.005', 'Image-to-Video:0.007']
10:57:35-789064 DEBUG    Model metadata: file="metadata.json" no changes
10:57:35-832065 DEBUG    Model requested: fn=run:<lambda>
10:57:35-834065 INFO     Load model: select="Diffusers\Disty0/FLUX.1-dev-qint8 [fd65655d4d]"
10:57:35-836065 INFO     Autodetect model: detect="FLUX" class=FluxPipeline
                         file="models\Diffusers\models--Disty0--FLUX.1-dev-qint8\snapshots\fd65655d4d82276350aa5cd93b454
                         a112f4a616e" size=0MB
10:57:35-838065 DEBUG    Load model: type=FLUX model="Diffusers\Disty0/FLUX.1-dev-qint8" repo="Disty0/FLUX.1-dev-qint8"
                         unet="None" te="None" vae="Automatic" quant=qint8 offload=none dtype=torch.bfloat16
10:57:36-086625 INFO     HF login: token="C:\Users\syllable\.cache\huggingface\token"
10:57:36-087624 DEBUG    Quantization: type=quanto fn=load_flux:load_flux_quanto
10:57:38-967622 INFO     MOTD: N/A
10:57:39-512215 DEBUG    Load model: type=FLUX preloaded=['transformer', 'text_encoder_2']
Diffusers  9.39it/s ████████ 100% 7/7 00:00 00:00 Loading pipeline components...
10:57:40-478396 INFO     Load network: type=embeddings loaded=0 skipped=0 time=0.00
10:57:40-480395 DEBUG    Setting model: component=VAE slicing=True
10:57:40-481397 DEBUG    Setting model: attention="xFormers"
10:57:40-529399 DEBUG    Setting model: offload=none
10:57:41-591080 DEBUG    UI themes available: type=Modern themes=32
10:57:43-090824 INFO     Browser session: user=None client=127.0.0.1 agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64;
                         rv:131.0) Gecko/20100101 Firefox/131.0
10:57:49-953111 DEBUG    Model move: device=cuda class=FluxPipeline accelerate=False
                         fn=reload_model_weights:load_diffuser time=9.42
10:57:50-257661 DEBUG    GC: utilization={'gpu': 74, 'ram': 1, 'threshold': 80} gc={'collected': 567, 'saved': 0.0}
                         before={'gpu': 17.68, 'ram': 1.13} after={'gpu': 17.68, 'ram': 1.13, 'retries': 0, 'oom': 0}
                         device=cuda fn=reload_model_weights:load_diffuser time=0.3
10:57:50-265661 INFO     Load model: time=14.12 load=4.64 move=9.43 native=1024 memory={'ram': {'used': 1.14, 'total':
                         127.92}, 'gpu': {'used': 17.68, 'total': 24.0}, 'retries': 0, 'oom': 0}
10:57:50-270174 DEBUG    Script callback init time: system-info.py:app_started=0.05 task_scheduler.py:app_started=0.25
10:57:50-271173 DEBUG    Save: file="config.json" json=37 bytes=1561 time=0.002
10:57:50-272175 INFO     Startup time: 23.67 torch=1.94 gradio=1.27 diffusers=0.14 libraries=2.29 extensions=0.66
                         detailer=0.08 ui-networks=0.45 ui-txt2img=0.19 ui-img2img=0.08 ui-control=0.14 ui-settings=0.26
                         ui-extensions=0.53 ui-defaults=0.29 launch=0.13 api=0.09 app-started=0.31 checkpoint=14.54
10:58:00-278709 DEBUG    Server: alive=True jobs=1 requests=107 uptime=29 memory=1.14/127.92 backend=Backend.DIFFUSERS
                         state=idle

10:58:52-192292 DEBUG    Control process unit: i=1 process=None
10:58:52-200291 INFO     Base: class=FluxPipeline
10:58:52-202291 DEBUG    Sampler: sampler=default class=FlowMatchEulerDiscreteScheduler: {'num_train_timesteps': 1000,
                         'shift': 3.0, 'use_dynamic_shifting': True, 'base_shift': 0.5, 'max_shift': 1.15,
                         'base_image_seq_len': 256, 'max_image_seq_len': 4096}
10:58:52-741394 DEBUG    Torch generator: device=cuda seeds=[4202781822]
10:58:52-743395 DEBUG    Diffuser pipeline: FluxPipeline task=DiffusersTaskType.TEXT_2_IMAGE batch=1/1x1
                         set={'prompt_embeds': torch.Size([1, 36, 4096]), 'pooled_prompt_embeds': torch.Size([1, 768]),
                         'guidance_scale': 6, 'num_inference_steps': 20, 'output_type': 'latent', 'width': 1024,
                         'height': 1024, 'parser': 'Full parser'}
Progress ?it/s                                              0% 0/20 00:00 ? Base
10:58:52-831915 ERROR    Processing: args={'prompt_embeds': 'cuda:0:torch.bfloat16:torch.Size([1, 36, 4096])',
                         'pooled_prompt_embeds': 'cuda:0:torch.bfloat16:torch.Size([1, 768])', 'guidance_scale': 6,
                         'generator': [<torch._C.Generator object at 0x000002708B098F50>], 'callback_on_step_end':
                         <function diffusers_callback at 0x00000270B7BA5580>, 'callback_on_step_end_tensor_inputs':
                         ['latents'], 'num_inference_steps': 20, 'output_type': 'latent', 'width': 1024, 'height': 1024}
                         not enough values to unpack (expected 2, got 1)
10:58:52-834915 ERROR    Processing: ValueError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ D:\opt\automatic\modules\processing_diffusers.py:99 in process_base                                                  │
│                                                                                                                      │
│    98 │   │   else:                                                                                                  │
│ ❱  99 │   │   │   output = shared.sd_model(**base_args)                                                              │
│   100 │   │   if isinstance(output, dict):                                                                           │
│                                                                                                                      │
│ D:\opt\automatic\venv\Lib\site-packages\torch\utils\_contextlib.py:116 in decorate_context                           │
│                                                                                                                      │
│   115 │   │   with ctx_factory():                                                                                    │
│ ❱ 116 │   │   │   return func(*args, **kwargs)                                                                       │
│   117                                                                                                                │
│                                                                                                                      │
│ D:\opt\automatic\venv\Lib\site-packages\diffusers\pipelines\flux\pipeline_flux.py:732 in __call__                    │
│                                                                                                                      │
│   731 │   │   │   │                                                                                                  │
│ ❱ 732 │   │   │   │   noise_pred = self.transformer(                                                                 │
│   733 │   │   │   │   │   hidden_states=latents,                                                                     │
│                                                                                                                      │
│ D:\opt\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl                        │
│                                                                                                                      │
│   1735 │   │   else:                                                                                                 │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                           │
│   1737                                                                                                               │
│                                                                                                                      │
│ D:\opt\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl                                │
│                                                                                                                      │
│   1746 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                       │
│ ❱ 1747 │   │   │   return forward_call(*args, **kwargs)                                                              │
│   1748                                                                                                               │
│                                                                                                                      │
│ D:\opt\automatic\venv\Lib\site-packages\diffusers\models\transformers\transformer_flux.py:500 in forward             │
│                                                                                                                      │
│   499 │   │   │   else:                                                                                              │
│ ❱ 500 │   │   │   │   encoder_hidden_states, hidden_states = block(                                                  │
│   501 │   │   │   │   │   hidden_states=hidden_states,                                                               │
│                                                                                                                      │
│ D:\opt\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl                        │
│                                                                                                                      │
│   1735 │   │   else:                                                                                                 │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                           │
│   1737                                                                                                               │
│                                                                                                                      │
│ D:\opt\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl                                │
│                                                                                                                      │
│   1746 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                       │
│ ❱ 1747 │   │   │   return forward_call(*args, **kwargs)                                                              │
│   1748                                                                                                               │
│                                                                                                                      │
│ D:\opt\automatic\venv\Lib\site-packages\diffusers\models\transformers\transformer_flux.py:175 in forward             │
│                                                                                                                      │
│   174 │   │   # Attention.                                                                                           │
│ ❱ 175 │   │   attn_output, context_attn_output = self.attn(                                                          │
│   176 │   │   │   hidden_states=norm_hidden_states,                                                                  │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: not enough values to unpack (expected 2, got 1)
10:58:53-285503 INFO     Processed: images=0 its=0.00 time=1.09 timers={'encode': 0.52, 'args': 0.6, 'process': 0.48}
                         memory={'ram': {'used': 1.74, 'total': 127.92}, 'gpu': {'used': 18.5, 'total': 24.0},
                         'retries': 0, 'oom': 0}
10:58:53-287503 INFO     Control: pipeline units=0 process=0 time=1.10 init=0.00 proc=0.00 ctrl=1.09 outputs=0
11:00:00-304062 DEBUG    Server: alive=True jobs=1 requests=137 uptime=149 memory=1.74/127.92 backend=Backend.DIFFUSERS
                         state=idle
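
A note for anyone triaging the traceback above: the last frame (transformer_flux.py:175) unpacks the result of self.attn(...) into two names, attn_output and context_attn_output, and the error says only one value came back. Below is a minimal pure-Python sketch of that failure mode. The function names are hypothetical stand-ins, not diffusers or SD.Next code. The log does not show which attention processor was active at that point. One untested guess is that the attention="xFormers" setting installs a processor that returns a single result, but that is an assumption and not something the log confirms.

```python
def joint_attention(hidden_states):
    """Hypothetical stand-in: returns two results, as the caller expects."""
    return hidden_states, hidden_states


def single_output_attention(hidden_states):
    """Hypothetical stand-in: returns one batched result (length-1 sequence)."""
    return [hidden_states]


def run_block(attn):
    # Same shape as the failing line: two names on the left-hand side.
    attn_output, context_attn_output = attn("latents")
    return attn_output, context_attn_output


def try_block(attn):
    """Run the block and report either success or the unpacking error text."""
    try:
        run_block(attn)
        return "ok"
    except ValueError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_block(joint_attention))          # ok
    print(try_block(single_output_attention))  # not enough values to unpack (expected 2, got 1)
```

If this reading is right, a quick check would be to switch the attention setting to a different backend and retry the same prompt; I have not verified that here.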
Backend

Diffusers

UI

Standard

Branch

Master

Model

Other

Acknowledgements

[X] I have read the above and searched for existing issues
[X] I confirm that this is classified correctly and it's not an extension issue
vladmandic/automatic #3550