[Issue]: ModuleNotFoundError: No module named 'gguf' (gguf not auto-installing)

Issue Description

Downloaded this UNET and put it in the models\unet folder: https://civitai.com/models/843551?modelVersionId=943939
Loaded Flux-qint4
Attempted to load the UNET above from the quicksettings
Got a "No module named 'gguf'" error (per the Flux wiki, I was expecting gguf to be auto-installed on first use)
Restarted the server and retried the same steps repeatedly
Version Platform Description

11:02:38-418429 INFO Python: version=3.11.9 platform=Windows bin="C:\ai\automatic\venv\Scripts\python.exe" venv="C:\ai\automatic\venv"
11:02:38-663869 INFO Version: app=sd.next updated=2024-10-12 hash=0c54c235 branch=dev url=https://github.com/vladmandic/automatic/tree/dev ui=dev
11:02:39-529725 INFO Repository latest available bb6a40da131269554f026379ff97c5480b40e64e 2024-10-13T01:26:43Z
11:02:39-545333 INFO Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 5, GenuineIntel system=Windows release=Windows-10-10.0.22631-SP0 python=3.11.9
Relevant log output

PS C:\ai\automatic> .\webui.bat --debug
Using VENV: C:\ai\automatic\venv
11:02:38-414469 INFO     Starting SD.Next
11:02:38-417432 INFO     Logger: file="C:\ai\automatic\sdnext.log" level=DEBUG size=65 mode=create
11:02:38-418429 INFO     Python: version=3.11.9 platform=Windows bin="C:\ai\automatic\venv\Scripts\python.exe"
                         venv="C:\ai\automatic\venv"
11:02:38-663869 INFO     Version: app=sd.next updated=2024-10-12 hash=0c54c235 branch=dev
                         url=https://github.com/vladmandic/automatic/tree/dev ui=dev
11:02:39-529725 INFO     Repository latest available bb6a40da131269554f026379ff97c5480b40e64e 2024-10-13T01:26:43Z
11:02:39-545333 INFO     Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 5, GenuineIntel system=Windows
                         release=Windows-10-10.0.22631-SP0 python=3.11.9
11:02:39-547063 DEBUG    Setting environment tuning
11:02:39-548096 DEBUG    Torch allocator: "garbage_collection_threshold:0.80,max_split_size_mb:512"
11:02:39-560298 DEBUG    Torch overrides: cuda=False rocm=False ipex=False diml=False openvino=False zluda=False
11:02:39-572260 INFO     CUDA: nVidia toolkit detected
11:02:39-698298 WARNING  Modified files: ['models/Reference/playgroundai--playground-v2-1024px-aesthetic.jpg']
11:02:39-800026 INFO     Verifying requirements
11:02:39-804528 INFO     Verifying packages
11:02:39-848474 DEBUG    Timestamp repository update time: Sat Oct 12 22:43:34 2024
11:02:39-849472 INFO     Startup: standard
11:02:39-850469 INFO     Verifying submodules
11:02:43-238581 DEBUG    Git detached head detected: folder="extensions-builtin/sd-extension-chainner" reattach=main
11:02:43-239408 DEBUG    Git submodule: extensions-builtin/sd-extension-chainner / main
11:02:43-368039 DEBUG    Git detached head detected: folder="extensions-builtin/sd-extension-system-info" reattach=main
11:02:43-369064 DEBUG    Git submodule: extensions-builtin/sd-extension-system-info / main
11:02:43-500517 DEBUG    Git detached head detected: folder="extensions-builtin/sd-webui-agent-scheduler" reattach=main
11:02:43-501514 DEBUG    Git submodule: extensions-builtin/sd-webui-agent-scheduler / main
11:02:43-675048 DEBUG    Git detached head detected: folder="extensions-builtin/sdnext-modernui" reattach=dev
11:02:43-676073 DEBUG    Git submodule: extensions-builtin/sdnext-modernui / dev
11:02:43-832907 DEBUG    Git detached head detected: folder="extensions-builtin/stable-diffusion-webui-rembg"
                         reattach=master
11:02:43-834633 DEBUG    Git submodule: extensions-builtin/stable-diffusion-webui-rembg / master
11:02:43-964212 DEBUG    Git detached head detected: folder="modules/k-diffusion" reattach=master
11:02:43-965210 DEBUG    Git submodule: modules/k-diffusion / master
11:02:44-091883 DEBUG    Git detached head detected: folder="wiki" reattach=master
11:02:44-092880 DEBUG    Git submodule: wiki / master
11:02:44-187395 DEBUG    Register paths
11:02:44-288157 DEBUG    Installed packages: 221
11:02:44-289610 DEBUG    Extensions all: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info',
                         'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'sdnext-modernui',
                         'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg']
11:02:44-493021 DEBUG    Extension installer: C:\ai\automatic\extensions-builtin\sd-extension-system-info\install.py
11:02:44-692460 DEBUG    Extension installer: C:\ai\automatic\extensions-builtin\sd-webui-agent-scheduler\install.py
11:02:47-493119 DEBUG    Extension force: name="sd-webui-controlnet" commit=ecd33eb
11:02:47-563786 DEBUG    Extension installer: C:\ai\automatic\extensions-builtin\sd-webui-controlnet\install.py
11:03:01-397614 DEBUG    Extension installer:
                         C:\ai\automatic\extensions-builtin\stable-diffusion-webui-images-browser\install.py
11:03:04-081776 DEBUG    Extension installer: C:\ai\automatic\extensions-builtin\stable-diffusion-webui-rembg\install.py
11:03:13-731683 DEBUG    Extensions all: []
11:03:13-732683 INFO     Extensions enabled: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info',
                         'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'sdnext-modernui',
                         'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg']
11:03:13-733682 INFO     Verifying requirements
11:03:13-734679 DEBUG    Setup complete without errors: 1728806594
11:03:13-743601 DEBUG    Extension preload: {'extensions-builtin': 0.0, 'extensions': 0.0}
11:03:13-745596 DEBUG    Starting module: <module 'webui' from 'C:\\ai\\automatic\\webui.py'>
11:03:13-746593 INFO     Command line args: ['--debug'] debug=True
11:03:13-747591 DEBUG    Env flags: []
11:03:21-852703 INFO     System packages: {'torch': '2.4.1+cu124', 'diffusers': '0.31.0.dev0', 'gradio': '3.43.2',
                         'transformers': '4.45.2', 'accelerate': '1.0.0'}
11:03:22-458083 DEBUG    Huggingface cache: folder="C:\Users\sebas\.cache\huggingface\hub"
11:03:22-560361 INFO     Device detect: memory=24.0 optimization=none
11:03:22-582873 DEBUG    Read: file="config.json" json=34 bytes=1564 time=0.000
11:03:22-585868 DEBUG    Unknown settings: ['cross_attention_options', 'face_restoration_unload']
11:03:22-588860 INFO     Engine: backend=Backend.DIFFUSERS compute=None device=cuda attention="Scaled-Dot-Product"
                         mode=no_grad
11:03:22-591364 DEBUG    Read: file="html\reference.json" json=52 bytes=29118 time=0.000
11:03:22-658695 INFO     Torch parameters: backend=cuda device=cuda config=BF16 dtype=torch.bfloat16 vae=torch.bfloat16
                         unet=torch.bfloat16 context=no_grad nohalf=False nohalfvae=False upscast=False
                         deterministic=False test-fp16=True test-bf16=True optimization="Scaled-Dot-Product"
11:03:23-149948 DEBUG    ONNX: version=1.19.2 provider=CPUExecutionProvider, available=['TensorrtExecutionProvider',
                         'CUDAExecutionProvider', 'CPUExecutionProvider']
11:03:23-342933 INFO     Device: device=NVIDIA GeForce RTX 4090 n=1 arch=sm_90 capability=(8, 9) cuda=12.4 cudnn=90100
                         driver=561.09
11:03:23-435974 DEBUG    Importing LDM
11:03:23-456384 DEBUG    Entering start sequence
11:03:23-459378 DEBUG    Initializing
11:03:23-498889 INFO     Available VAEs: path="models\VAE" items=0
11:03:23-499886 INFO     Available UNets: path="models\UNET" items=3
11:03:23-501391 INFO     Available TEs: path="models\Text-encoder" items=4
11:03:23-503421 INFO     Disabled extensions: ['sd-webui-controlnet', 'sdnext-modernui']
11:03:23-505417 DEBUG    Read: file="cache.json" json=2 bytes=11134 time=0.000
11:03:23-517407 DEBUG    Read: file="metadata.json" json=713 bytes=2528683 time=0.010
11:03:23-528688 DEBUG    Scanning diffusers cache: folder="models\Diffusers" items=3 time=0.00
11:03:23-529685 INFO     Available Models: path="models\Stable-diffusion" items=15 time=0.03
11:03:23-611464 INFO     Available Yolo: path="models\yolo items=6 downloaded=3
11:03:23-611464 DEBUG    Load extensions
11:03:23-700973 INFO     Extension: script='extensions-builtin\Lora\scripts\lora_script.py'
                         11:03:23-697224 INFO     Available LoRAs: items=145 folders=5
11:03:24-174284 INFO     Extension: script='extensions-builtin\sd-webui-agent-scheduler\scripts\task_scheduler.py' Using
                         sqlite file: extensions-builtin\sd-webui-agent-scheduler\task_scheduler.sqlite3
11:03:24-399262 DEBUG    Extensions init time: 0.79 sd-webui-agent-scheduler=0.41
                         stable-diffusion-webui-images-browser=0.22
11:03:24-416752 DEBUG    Read: file="html/upscalers.json" json=4 bytes=2672 time=0.001
11:03:24-418373 DEBUG    Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719 time=0.001
11:03:24-420370 DEBUG    chaiNNer models: path="models\chaiNNer" defined=24 discovered=0 downloaded=8
11:03:24-421871 DEBUG    Upscaler type=ESRGAN folder="models\ESRGAN" model="1x-ITF-SkinDiffDetail-Lite-v1"
                         path="models\ESRGAN\1x-ITF-SkinDiffDetail-Lite-v1.pth"
11:03:24-422871 DEBUG    Upscaler type=ESRGAN folder="models\ESRGAN" model="4xNMKDSuperscale_4xNMKDSuperscale"
                         path="models\ESRGAN\4xNMKDSuperscale_4xNMKDSuperscale.pth"
11:03:24-423870 DEBUG    Upscaler type=ESRGAN folder="models\ESRGAN" model="4x_NMKD-Siax_200k"
                         path="models\ESRGAN\4x_NMKD-Siax_200k.pth"
11:03:24-426861 INFO     Available Upscalers: items=56 downloaded=11 user=3 time=0.03 types=['None', 'Lanczos',
                         'Nearest', 'ChaiNNer', 'AuraSR', 'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
11:03:24-446121 INFO     Available Styles: folder="models\styles" items=288 time=0.02
11:03:24-451267 DEBUG    Creating UI
11:03:24-452273 DEBUG    UI themes available: type=Standard themes=12
11:03:24-453272 INFO     UI theme: type=Standard name="black-teal"
11:03:24-460253 DEBUG    UI theme: css="C:\ai\automatic\javascript\black-teal.css" base="sdnext.css" user="None"
11:03:24-463620 DEBUG    UI initialize: txt2img
11:03:24-543871 DEBUG    Networks: page='model' items=66 subfolders=2 tab=txt2img folders=['models\\Stable-diffusion',
                         'models\\Diffusers', 'models\\Reference'] list=0.06 thumb=0.01 desc=0.00 info=0.00 workers=8
                         sort=Default
11:03:24-549855 DEBUG    Networks: page='lora' items=145 subfolders=0 tab=txt2img folders=['models\\Lora',
                         'models\\LyCORIS'] list=0.06 thumb=0.02 desc=0.07 info=0.03 workers=8 sort=Default
11:03:24-558356 DEBUG    Networks: page='style' items=288 subfolders=1 tab=txt2img folders=['models\\styles', 'html']
                         list=0.06 thumb=0.00 desc=0.00 info=0.00 workers=8 sort=Default
11:03:24-563639 DEBUG    Networks: page='embedding' items=13 subfolders=0 tab=txt2img folders=['models\\embeddings']
                         list=0.04 thumb=0.02 desc=0.01 info=0.00 workers=8 sort=Default
11:03:24-566629 DEBUG    Networks: page='vae' items=0 subfolders=0 tab=txt2img folders=['models\\VAE'] list=0.00
                         thumb=0.00 desc=0.00 info=0.00 workers=8 sort=Default
11:03:24-568624 DEBUG    Networks: page='history' items=0 subfolders=0 tab=txt2img folders=[] list=0.00 thumb=0.00
                         desc=0.00 info=0.00 workers=8 sort=Default
11:03:24-676867 DEBUG    UI initialize: img2img
11:03:24-948480 DEBUG    UI initialize: control models=models\control
11:03:25-272184 DEBUG    Read: file="ui-config.json" json=0 bytes=2 time=0.000
11:03:25-387713 DEBUG    UI themes available: type=Standard themes=12
11:03:25-989227 DEBUG    Reading failed: C:\ai\automatic\html\extensions.json [Errno 2] No such file or directory:
                         'C:\\ai\\automatic\\html\\extensions.json'
11:03:25-990224 INFO     Extension list is empty: refresh required
11:03:26-611302 DEBUG    Extension list: processed=8 installed=8 enabled=6 disabled=2 visible=8 hidden=0
11:03:26-943874 DEBUG    Root paths: ['C:\\ai\\automatic']
11:03:27-030102 INFO     Local URL: http://127.0.0.1:7860/
11:03:27-031099 DEBUG    Gradio functions: registered=2500
11:03:27-033415 DEBUG    FastAPI middleware: ['Middleware', 'Middleware']
11:03:27-036410 DEBUG    Creating API
11:03:27-210249 INFO     [AgentScheduler] Task queue is empty
11:03:27-211454 INFO     [AgentScheduler] Registering APIs
11:03:27-539546 DEBUG    Scripts setup: ['IP Adapters:0.027', 'XYZ Grid:0.028', 'Face:0.013', 'AnimateDiff:0.007',
                         'CogVideoX:0.159', 'Ctrl-X:0.006', 'LUT Color grading:0.007', 'Prompt enhance:0.005',
                         'Image-to-Video:0.007']
11:03:27-541260 DEBUG    Model metadata: file="metadata.json" no changes
11:03:27-542288 DEBUG    Model requested: fn=run:<lambda>
11:03:27-543284 INFO     Load model: select="Diffusers\Disty0/FLUX.1-dev-qint4 [82811df42b]"
11:03:27-545279 DEBUG    Load model:
                         target="models\Diffusers\models--Disty0--FLUX.1-dev-qint4\snapshots\82811df42b556a1153b971d8375
                         d5170c306a6eb" existing=False info=None
11:03:27-546285 DEBUG    Load model:
                         path="models\Diffusers\models--Disty0--FLUX.1-dev-qint4\snapshots\82811df42b556a1153b971d8375d5
                         170c306a6eb"
11:03:27-547247 INFO     Autodetect model: detect="FLUX" class=FluxPipeline
                         file="models\Diffusers\models--Disty0--FLUX.1-dev-qint4\snapshots\82811df42b556a1153b971d8375d5
                         170c306a6eb" size=0MB
11:03:27-551236 DEBUG    Load model: type=FLUX model="Diffusers\Disty0/FLUX.1-dev-qint4" repo="Disty0/FLUX.1-dev-qint4"
                         unet="None" t5="None" vae="None" quant=qint4 offload=none dtype=torch.bfloat16
11:03:27-552740 DEBUG    HF login: no token provided
11:03:27-552740 DEBUG    Quantization: module=quanto fn=load_flux:load_flux_quanto
11:04:20-473955 DEBUG    Load model: type=FLUX preloaded=['transformer', 'text_encoder_2']
Diffusers 13.10it/s ████████ 100% 7/7 00:00 00:00 Loading pipeline components...
11:04:21-627071 INFO     Load network: type=embeddings loaded=0 skipped=13 time=0.02
11:04:21-628068 DEBUG    Setting model: component=VAE slicing=True
11:04:21-629066 DEBUG    Setting model: attention="Scaled-Dot-Product"
11:04:21-646020 DEBUG    Setting model: offload=none
11:04:25-204660 DEBUG    GC: utilization={'gpu': 45, 'ram': 8, 'threshold': 80} gc={'collected': 1129, 'saved': 0.0}
                         before={'gpu': 10.83, 'ram': 5.13} after={'gpu': 10.83, 'ram': 5.13, 'retries': 0, 'oom': 0}
                         device=cuda fn=reload_model_weights:load_diffuser time=0.23
11:04:25-211522 INFO     Load model: time=57.43 load=54.06 move=3.33 native=1024 memory={'ram': {'used': 5.13, 'total':
                         63.92}, 'gpu': {'used': 10.83, 'total': 23.99}, 'retries': 0, 'oom': 0}
11:04:25-214513 DEBUG    Script callback init time: image_browser.py:ui_tabs=0.44 system-info.py:app_started=0.07
                         task_scheduler.py:app_started=0.34
11:04:25-216536 INFO     Startup time: 71.46 torch=5.72 gradio=1.57 diffusers=0.61 libraries=1.79 extensions=0.79
                         detailer=0.08 ui-networks=0.23 ui-txt2img=0.09 ui-img2img=0.23 ui-control=0.15 ui-settings=0.26
                         ui-extensions=1.10 ui-defaults=0.26 launch=0.14 api=0.09 app-started=0.41 checkpoint=57.67
11:04:25-218503 DEBUG    Save: file="config.json" json=34 bytes=1513 time=0.004
11:04:25-220497 DEBUG    Unused settings: ['cross_attention_options', 'face_restoration_unload']
11:04:33-391093 INFO     Load model: select="Diffusers\Disty0/FLUX.1-dev-qint4 [82811df42b]"
11:04:33-393088 DEBUG    Load model:
                         path="models\Diffusers\models--Disty0--FLUX.1-dev-qint4\snapshots\82811df42b556a1153b971d8375d5
                         170c306a6eb"
11:04:33-394085 INFO     Autodetect model: detect="FLUX" class=FluxPipeline
                         file="models\Diffusers\models--Disty0--FLUX.1-dev-qint4\snapshots\82811df42b556a1153b971d8375d5
                         170c306a6eb" size=0MB
11:04:33-397078 DEBUG    Load model: type=FLUX model="Diffusers\Disty0/FLUX.1-dev-qint4" repo="Disty0/FLUX.1-dev-qint4"
                         unet="fluxDEVDEDISTILLED_q4KM.gguf" t5="None" vae="None" quant=qint4 offload=none
                         dtype=torch.bfloat16
11:04:33-398075 DEBUG    HF login: no token provided
11:04:33-399073 INFO     Load module: type=UNet/Transformer
                         file="C:\ai\automatic\models\UNET\fluxDEVDEDISTILLED_q4KM.gguf" offload=none quant=gguf
                         dtype=torch.bfloat16
11:04:33-402064 ERROR    Load model: type=FLUX Failed to load UNet: No module named 'gguf'
11:04:33-403061 ERROR    FLUX UNet:: ModuleNotFoundError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\model_flux.py:188 in load_flux                                                               │
│                                                                                                                      │
│   187 │   │   │   debug(f'Load model: type=FLUX unet="{shared.opts.sd_unet}"')                                       │
│ ❱ 188 │   │   │   _transformer = load_transformer(sd_unet.unet_dict[shared.opts.sd_unet])                            │
│   189 │   │   │   if _transformer is not None:                                                                       │
│                                                                                                                      │
│ C:\ai\automatic\modules\model_flux.py:149 in load_transformer                                                        │
│                                                                                                                      │
│   148 │   if 'gguf' in file_path.lower():                                                                            │
│ ❱ 149 │   │   _transformer, _text_encoder_2 = load_flux_gguf(file_path)                                              │
│   150 │   │   if _transformer is not None:                                                                           │
│                                                                                                                      │
│ C:\ai\automatic\modules\model_flux.py:115 in load_flux_gguf                                                          │
│                                                                                                                      │
│   114 │   from diffusers.loaders.single_file_utils import convert_flux_transformer_checkpoint_to_diffusers           │
│ ❱ 115 │   from modules import ggml, sd_hijack_accelerate                                                             │
│   116 │   model_te.install_gguf()                                                                                    │
│                                                                                                                      │
│ C:\ai\automatic\modules\ggml\__init__.py:3 in <module>                                                               │
│                                                                                                                      │
│    2 import torch                                                                                                    │
│ ❱  3 import gguf                                                                                                     │
│    4 from .gguf_utils import TORCH_COMPATIBLE_QTYPES                                                                 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ModuleNotFoundError: No module named 'gguf'
11:05:27-038658 DEBUG    Load model: type=FLUX preloaded=['transformer', 'text_encoder_2']
Diffusers 12.69it/s ████████ 100% 7/7 00:00 00:00 Loading pipeline components...
11:05:27-929975 INFO     Load network: type=embeddings loaded=0 skipped=13 time=0.02
11:05:27-931975 DEBUG    Setting model: component=VAE slicing=True
11:05:27-933183 DEBUG    Setting model: attention="Scaled-Dot-Product"
11:05:27-950140 DEBUG    Setting model: offload=none
11:05:31-734752 DEBUG    GC: utilization={'gpu': 88, 'ram': 8, 'threshold': 80} gc={'collected': 237, 'saved': 0.01}
                         before={'gpu': 21.17, 'ram': 5.18} after={'gpu': 21.16, 'ram': 5.18, 'retries': 0, 'oom': 0}
                         device=cuda fn=load_diffuser:move_model time=0.25
11:05:31-993062 DEBUG    GC: utilization={'gpu': 88, 'ram': 8, 'threshold': 80} gc={'collected': 127, 'saved': 0.0}
                         before={'gpu': 21.17, 'ram': 5.18} after={'gpu': 21.17, 'ram': 5.18, 'retries': 0, 'oom': 0}
                         device=cuda fn=load_unet:load_diffuser time=0.26
11:05:32-000043 INFO     Load model: time=58.35 load=54.52 move=3.79 native=1024 memory={'ram': {'used': 5.18, 'total':
                         63.92}, 'gpu': {'used': 21.18, 'total': 23.99}, 'retries': 0, 'oom': 0}
11:05:32-227922 DEBUG    GC: utilization={'gpu': 88, 'ram': 8, 'threshold': 80} gc={'collected': 254, 'saved': 0.0}
                         before={'gpu': 21.17, 'ram': 5.18} after={'gpu': 21.17, 'ram': 5.18, 'retries': 0, 'oom': 0}
                         device=cuda fn=<lambda>:load_unet time=0.23
11:05:32-253632 DEBUG    Setting changed: sd_unet=fluxDEVDEDISTILLED_q4KM.gguf progress=True
11:05:32-255600 DEBUG    Save: file="config.json" json=34 bytes=1513 time=0.002
11:05:32-257621 DEBUG    Unused settings: ['cross_attention_options', 'face_restoration_unload']
11:06:00-257698 DEBUG    Server: alive=True jobs=1 requests=30 uptime=158 memory=5.16/63.92 backend=Backend.DIFFUSERS
                         state=idle
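
Reading the traceback above, one possible explanation (not verified against the rest of the source) is ordering: `load_flux_gguf` runs `from modules import ggml` on line 115, and `modules\ggml\__init__.py` imports `gguf` at module level, while `model_te.install_gguf()` only appears on line 116. If that is the case, the import fails before the installer is ever reached. The sketch below reproduces that ordering with stand-ins; `fake_gguf_dependency` and both loader functions are hypothetical names used only for illustration, and the "installer" simply registers an empty module instead of calling pip.

```python
import importlib
import sys
import types

FAKE_NAME = "fake_gguf_dependency"  # hypothetical stand-in for 'gguf'


def install_fake_dependency() -> None:
    """Stand-in for an installer; registers an empty module instead of using pip."""
    sys.modules.setdefault(FAKE_NAME, types.ModuleType(FAKE_NAME))


def load_import_first() -> str:
    """Mirror the order seen in the traceback: import, then install."""
    try:
        importlib.import_module(FAKE_NAME)
    except ModuleNotFoundError as err:
        # The installer below is never reached when the import fails.
        return f"failed: {err}"
    install_fake_dependency()
    return "loaded"


def load_install_first() -> str:
    """Install before importing, so the import can succeed."""
    install_fake_dependency()
    importlib.import_module(FAKE_NAME)
    return "loaded"
```

With the dependency missing, `load_import_first()` reports the failure and leaves it uninstalled, whereas `load_install_first()` succeeds. Whether swapping the two lines is the right fix in `model_flux.py` is an assumption that would need checking against the actual code.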
Backend

Diffusers
UI

Standard
Branch

Dev
Model

Other
Acknowledgements

[X] I have read the above and searched for existing issues
[X] I confirm that this is classified correctly and it's not an extension issue