flux nf4 error - Githubissues

Hi!

Fresh Forge installation with git clone, everything is default... your guys 11GB dev nf4 bnb model ( that model file also linked to comfyui & works there just fine ) Diffusion with Low Bits: nf4, Swap Method: queue, swap: CPU (screenshot attached below ) GPU= RTX3060 with 12GB vram, GPU Weights (MB) 11263 (default ) no extensions installed forge IS able to generate images in SD15 & SDXL sections with SD15&SDXL models... I tried with 17GB FP8 DEV and shnell models from https://huggingface.co/Comfy-Org/flux1-dev and got the same error

txt2img prompt: a dog

full log:

venv "D:\forge\venv\Scripts\Python.exe" Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)] Version: f2.0.1v1.10.1-previous-250-g3589b57e Commit hash: 3589b57ec1f59fd7c570b08c85e0108cf3ff67d6 Launching Web UI with arguments: --pin-shared-memory --cuda-malloc --cuda-stream --xformers --ckpt-dir 'D:\models\Stable-diffusion' --lora-dir 'D:\models\Lora' --vae-dir 'D:\models\VAE' --embeddings-dir 'D:\models\embeddings' --controlnet-dir 'D:\models\controlnet' --device-id=0 --skip-load-model-at-start --autolaunch Using cudaMallocAsync backend. Total VRAM 12287 MB, total RAM 32719 MB pytorch version: 2.3.1+cu121 WARNING:xformers:A matching Triton is not available, some optimizations will not be enabled Traceback (most recent call last): File "D:\forge\venv\lib\site-packages\xformers__init__.py", line 57, in _is_triton_available import triton # noqa ModuleNotFoundError: No module named 'triton' xformers version: 0.0.27 Set vram state to: NORMAL_VRAM Always pin shared GPU memory Device: cuda:0 NVIDIA GeForce RTX 3060 : cudaMallocAsync VAE dtype preferences: [torch.bfloat16, torch.float32] -> torch.bfloat16 CUDA Using Stream: True Using xformers cross attention Using xformers attention for VAE ControlNet preprocessor location: D:\forge\models\ControlNetPreprocessor [-] ADetailer initialized. version: 24.8.0, num models: 10 2024-08-13 08:44:43,860 - ControlNet - INFO - ControlNet UI callback registered. Model selected: {'checkpoint_info': {'filename': 'D:\models\Stable-diffusion\FLUX_nf4_comfyUI_regular_node_DEV_BNB.safetensors', 'hash': '0184473b'}, 'vae_filename': None, 'unet_storage_dtype': 'nf4'} Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch(). Startup time: 21.8s (prepare environment: 3.0s, launcher: 2.5s, import torch: 5.6s, initialize shared: 0.5s, other imports: 0.8s, list SD models: 0.7s, load scripts: 2.9s, create ui: 3.2s, gradio launch: 2.4s). Environment vars changed: {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False} Model selected: {'checkpoint_info': {'filename': 'D:\models\Stable-diffusion\FLUX_nf4_comfyUI_regular_node_DEV_BNB.safetensors', 'hash': '0184473b'}, 'vae_filename': None, 'unet_storage_dtype': None} Model selected: {'checkpoint_info': {'filename': 'D:\models\Stable-diffusion\FLUX_nf4_comfyUI_regular_node_DEV_BNB.safetensors', 'hash': '0184473b'}, 'vae_filename': None, 'unet_storage_dtype': 'nf4'} Loading Model: {'checkpoint_info': {'filename': 'D:\models\Stable-diffusion\FLUX_nf4_comfyUI_regular_node_DEV_BNB.safetensors', 'hash': '0184473b'}, 'vae_filename': None, 'unet_storage_dtype': 'nf4'} StateDict Keys: {'transformer': 2350, 'vae': 244, 'text_encoder': 198, 'text_encoder_2': 220, 'ignore': 0} Using Detected T5 Data Type: torch.float8_e4m3fn Working with z of shape (1, 16, 32, 32) = 16384 dimensions. K-Model Created: {'storage_dtype': 'nf4', 'computation_dtype': torch.bfloat16} Model loaded in 1.3s (unload existing model: 0.3s, load state dict: 0.1s, forge model load: 0.9s). Skipping unconditional conditioning when CFG = 1. Negative Prompts are ignored. To load target model ModuleDict Begin to load 1 model [Memory Management] Current Free GPU Memory: 11232.66 MB [Memory Management] Required Model Memory: 5154.62 MB [Memory Management] Required Inference Memory: 1024.00 MB [Memory Management] Estimated Remaining GPU Memory: 5054.04 MB Moving model(s) has taken 8.61 seconds Traceback (most recent call last): File "D:\forge\modules_forge\main_thread.py", line 37, in loop task.work() File "D:\forge\modules_forge\main_thread.py", line 26, in work self.result = self.func(*self.args, self.kwargs) File "D:\forge\modules\txt2img.py", line 110, in txt2img_function processed = processing.process_images(p) File "D:\forge\modules\processing.py", line 799, in process_images res = process_images_inner(p) File "D:\forge\modules\processing.py", line 912, in process_images_inner p.setup_conds() File "D:\forge\modules\processing.py", line 1497, in setup_conds super().setup_conds() File "D:\forge\modules\processing.py", line 494, in setup_conds self.c = self.get_conds_with_caching(prompt_parser.get_multicond_learned_conditioning, prompts, total_steps, [self.cached_c], self.extra_network_data) File "D:\forge\modules\processing.py", line 463, in get_conds_with_caching cache[1] = function(shared.sd_model, required_prompts, steps, hires_steps, shared.opts.use_old_scheduling) File "D:\forge\modules\prompt_parser.py", line 262, in get_multicond_learned_conditioning learned_conditioning = get_learned_conditioning(model, prompt_flat_list, steps, hires_steps, use_old_scheduling) File "D:\forge\modules\prompt_parser.py", line 189, in get_learned_conditioning conds = model.get_learned_conditioning(texts) File "D:\forge\venv\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context return func(*args, *kwargs) File "D:\forge\backend\diffusion_engine\flux.py", line 79, in get_learned_conditioning cond_t5 = self.text_processing_engine_t5(prompt) File "D:\forge\backend\text_processing\t5_engine.py", line 123, in call z = self.process_tokens([tokens], [multipliers])[0] File "D:\forge\backend\text_processing\t5_engine.py", line 134, in process_tokens z = self.encode_with_transformers(tokens) File "D:\forge\backend\text_processing\t5_engine.py", line 60, in encode_with_transformers z = self.text_encoder( File "D:\forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl return self._call_impl(args, kwargs) File "D:\forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl return forward_call(*args, kwargs) File "D:\forge\backend\nn\t5.py", line 205, in forward return self.encoder(x, *args, *kwargs) File "D:\forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl return self._call_impl(args, kwargs) File "D:\forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl return forward_call(*args, kwargs) File "D:\forge\backend\nn\t5.py", line 186, in forward x, past_bias = l(x, mask, past_bias) File "D:\forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl return self._call_impl(*args, *kwargs) File "D:\forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl return forward_call(args, kwargs) File "D:\forge\backend\nn\t5.py", line 162, in forward x, past_bias = self.layer[0](x, mask, past_bias) File "D:\forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl return self._call_impl(*args, kwargs) File "D:\forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl return forward_call(*args, *kwargs) File "D:\forge\backend\nn\t5.py", line 149, in forward output, past_bias = self.SelfAttention(self.layer_norm(x), mask=mask, past_bias=past_bias) File "D:\forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl return self._call_impl(args, kwargs) File "D:\forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl return forward_call(*args, kwargs) File "D:\forge\backend\nn\t5.py", line 138, in forward out = attention_function(q, k * ((k.shape[-1] / self.num_heads) * 0.5), v, self.num_heads, mask) File "D:\forge\backend\attention.py", line 314, in attention_xformers mask_out[:, :, :mask.shape[-1]] = mask RuntimeError: The expanded size of the tensor (1) must match the existing size (64) at non-singleton dimension 0. Target sizes: [1, 256, 256]. Tensor sizes: [64, 256, 256] The expanded size of the tensor (1) must match the existing size (64) at non-singleton dimension 0. Target sizes: [1, 256, 256]. Tensor sizes: [64, 256, 256] Error completing request Arguments: ('task(bgk0ve4ytw2swgx)', <gradio.route_utils.Request object at 0x000000005426FE20>, 'a dog', '', [], 1, 1, 1, 3.5, 1152, 896, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', None, 0, 20, 'Euler', 'Simple', False, -1, False, -1, 0, 0, 0, False, False, {'ad_model': 'face_yolov8n.pt', 'ad_model_classes': '', 'ad_tab_enable': True, 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_model_classes': '', 'ad_tab_enable': True, 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_model_classes': '', 'ad_tab_enable': True, 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_model_classes': '', 'ad_tab_enable': True, 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=None, batch_mask_gallery=None, generated_image=None, mask_image=None, mask_image_fg=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, image_fg=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=None, batch_mask_gallery=None, generated_image=None, mask_image=None, mask_image_fg=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, image_fg=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=None, batch_mask_gallery=None, generated_image=None, mask_image=None, mask_image_fg=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, image_fg=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), False, 7, 1, 'Constant', 0, 'Constant', 0, 1, 'enable', 'MEAN', 'AD', 1, False, 1.01, 1.02, 0.99, 0.95, False, 0.5, 2, False, 3, False, 3, 2, 0, 0.35, True, 'bicubic', 'bicubic', False, 0, 'anisotropic', 0, 'reinhard', 100, 0, 'subtract', 0, 0, 'gaussian', 'add', 0, 100, 127, 0, 'hard_clamp', 5, 0, 'None', 'None', False, 'MultiDiffusion', 768, 768, 64, 4, False, False, False, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', '', 0, '', '', 0, '', '', True, False, False, False, False, False, False, 0, False) {} Traceback (most recent call last): File "D:\forge\modules\call_queue.py", line 74, in f res = list(func(args, kwargs)) TypeError: 'NoneType' object is not iterable

Untitled

lllyasviel / stable-diffusion-webui-forge

flux nf4 error #1049