city96 / ComfyUI-GGUF

GGUF Quantization support for native ComfyUI models
Apache License 2.0

failed to quantize: unknown model architecture: 'flux' #133

Open GamingDaveUk opened 1 day ago

GamingDaveUk commented 1 day ago

Trying to quantise some flux models to lower the VRAM requirements, and I get that error.

```
(venv) C:\AI\llama.cpp\build>bin\Debug\llama-quantize.exe "C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf" "C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-Q4_k_m.gguf" Q4_K_M
main: build = 3600 (2fb92678)
main: built with MSVC 19.41.34120.0 for x64
main: quantizing 'C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf' to 'C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-Q4_k_m.gguf' as Q4_K_M
llama_model_loader: loaded meta data with 3 key-value pairs and 780 tensors from C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = flux
llama_model_loader: - kv   1:               general.quantization_version u32              = 2
llama_model_loader: - kv   2:                          general.file_type u32              = 1
llama_model_loader: - type  f16:  780 tensors
llama_model_quantize: failed to quantize: unknown model architecture: 'flux'
main: failed to quantize model from 'C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf'
```

Is flux not supported for quantisation?

city96 commented 1 day ago

Did the patch apply successfully? That's the default error when you try to use the base llama.cpp llama-quantize binary without the patch applied iirc.
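
For reference, the patched build workflow looks roughly like the sketch below. The lcpp.patch path and the b3600 tag are assumptions on my part (b3600 matches the build = 3600 in your log), so adjust to wherever you keep the patch from this repo:

```bat
:: Check out the llama.cpp revision the patch targets, then apply the patch
:: that teaches llama-quantize about image-model architectures such as flux.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout tags/b3600
git apply ..\lcpp.patch

:: Build only the quantize tool; a Debug config matches the
:: bin\Debug\llama-quantize.exe path in the log above.
mkdir build
cd build
cmake ..
cmake --build . --config Debug --target llama-quantize
```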

GamingDaveUk commented 1 day ago

No, there was a CRC error on the patch; I assumed that meant the patch was already in the main code.

I have llama.cpp installed in its own instance, so it was a pain to follow the instructions; I may have messed up a step.

I will try again tomorrow when I'm more awake.

city96 commented 1 day ago

Okay yeah, that's probably the problem then. The actual upstream repo isn't meant for image models; the patch is the part that adds support for quantizing flux.

If you post the actual error where the patch apply fails, I might be able to help out.
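
A quick way to surface that error without modifying your tree is a dry run; the patch path here is an assumption, so point it at wherever lcpp.patch lives:

```bat
:: --check applies nothing; it only reports which hunks would fail and why.
git apply --check --verbose ..\lcpp.patch
```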

(It could also just be this, i.e. a line-ending mismatch from git converting them when cloning: https://github.com/city96/ComfyUI-GGUF/issues/90#issuecomment-2323011648 )
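
If line endings do turn out to be the culprit, one possible workaround (my assumption based on that linked comment, not a confirmed fix) is to re-clone without CRLF conversion and relax whitespace matching when applying:

```bat
:: Clone with autocrlf disabled so the patch hunks match byte-for-byte,
:: then retry; --ignore-whitespace additionally tolerates whitespace drift.
git clone -c core.autocrlf=false https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout tags/b3600
git apply --ignore-whitespace ..\lcpp.patch
```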