comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
https://www.comfy.org/
GNU General Public License v3.0

Last update broke things and I keep getting OOM #4593

Open · BlinkerHigh opened this issue 2 weeks ago

BlinkerHigh commented 2 weeks ago

Expected Behavior

Finish generation

Actual Behavior

LoRA and ControlNet generations stop mid-run, sometimes with an error message and sometimes just hanging. The workflow hasn't changed, and this wasn't happening before the update.

Steps to Reproduce

Update ComfyUI and use a LoRA or ControlNet. I'm using the following args: --lowvram --preview-method auto --use-split-cross-attention
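
For reference, the full launch line with those flags (assuming the standard main.py entry point of a source install) would be:

    python main.py --lowvram --preview-method auto --use-split-cross-attention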

Debug Logs

model weight dtype torch.bfloat16, manual cast: None
model_type FLUX
ComfyUI\.venv\lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Requested to load FluxClipModel_
Loading 1 new model
loaded completely 0.0 9319.23095703125 True
Requested to load AutoencodingEngine
Loading 1 new model
loaded completely 0.0 159.87335777282715 True
Requested to load InstantXControlNetFluxFormat2
Requested to load Flux
Loading 2 new models
loaded completely 0.0 6298.041015625 True
loaded completely 0.0 12119.472778320312 True

  0% 0/30 [00:00<?, ?it/s]Requested to load AutoencodingEngine
Loading 1 new model
loaded completely 0.0 159.87335777282715 True
Requested to load InstantXControlNetFluxFormat2
Requested to load Flux
Loading 2 new models
loaded completely 0.0 6298.041015625 True
loaded completely 0.0 12119.472778320312 True
ComfyUI\.venv\lib\site-packages\diffusers\models\attention_processor.py:1848: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
  hidden_states = F.scaled_dot_product_attention(query, key, value, dropout_p=0.0, is_causal=False)
0% 0/30 [00:18<?, ?it/s]
!!! Exception during processing !!! Allocation on device 
Traceback (most recent call last):
  File "ComfyUI\execution.py", line 317, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "ComfyUI\execution.py", line 192, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "ComfyUI\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "ComfyUI\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "ComfyUI\comfy_extras\nodes_custom_sampler.py", line 612, in sample
    samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
  File "ComfyUI\comfy\samplers.py", line 716, in sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "ComfyUI\comfy\samplers.py", line 695, in inner_sample
    samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
  File "ComfyUI\comfy\samplers.py", line 600, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
  File "ComfyUI\.venv\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "ComfyUI\comfy\k_diffusion\sampling.py", line 144, in sample_euler
    denoised = model(x, sigma_hat * s_in, **extra_args)
  File "ComfyUI\comfy\samplers.py", line 299, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
  File "ComfyUI\comfy\samplers.py", line 682, in __call__
    return self.predict_noise(*args, **kwargs)
  File "ComfyUI\comfy\samplers.py", line 685, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
  File "ComfyUI\comfy\samplers.py", line 279, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)
  File "ComfyUI\custom_nodes\ComfyUI-TiledDiffusion\.patches.py", line 4, in calc_cond_batch
    return calc_cond_batch_original_tiled_diffusion_f15f8412(model, conds, x_in, timestep, model_options)
  File "ComfyUI\comfy\samplers.py", line 202, in calc_cond_batch
    c['control'] = control.get_control(input_x, timestep_, c, len(cond_or_uncond))
  File "ComfyUI\comfy\controlnet.py", line 239, in get_control
    return self.control_merge(control, control_prev, output_dtype)
  File "ComfyUI\comfy\controlnet.py", line 152, in control_merge
    x = x.to(output_dtype)
torch.OutOfMemoryError: Allocation on device 

Got an OOM, unloading all loaded models.
Prompt executed in 152.50 seconds
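
The traceback is informative: it dies in control_merge, where each ControlNet output is cast to the sampler's output dtype. A minimal sketch of why that cast can OOM (illustrative only, with assumed names; not ComfyUI's actual code):

    import torch

    def control_merge_sketch(control_outputs, output_dtype):
        # x.to(dtype) is not in-place: whenever the dtype actually
        # changes, it allocates a fresh tensor, so the old and new
        # copies briefly coexist on the GPU. With VRAM already nearly
        # exhausted by the Flux and ControlNet weights, that extra
        # allocation is what raises torch.OutOfMemoryError
        # ("Allocation on device") in the traceback above.
        return [x.to(output_dtype) for x in control_outputs]

Since the cast allocates a new tensor, a card already at the edge of its --lowvram budget can fail here even though every model individually loaded fine.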

Other

No response

JorgeR81 commented 2 weeks ago

Are you using the InstantX Flux ControlNet? Is that supported natively? https://github.com/comfyanonymous/ComfyUI/issues/4567

Or are you using ComfyUI-eesahesNodes for the ControlNet? The issue may be on their side: https://github.com/EeroHeikkinen/ComfyUI-eesahesNodes/issues

screan commented 2 weeks ago

I'm getting the same exact error; it worked fine before the update.

BlinkerHigh commented 2 weeks ago

> Are you using the InstantX Flux ControlNet? Is that supported natively? #4567
>
> Or are you using ComfyUI-eesahesNodes for the ControlNet? The issue may be on their side: https://github.com/EeroHeikkinen/ComfyUI-eesahesNodes/issues

Since I have the same issue with LoRAs, it's unlikely that the ControlNet node is the problem, especially considering it was all working fine before the update.

robertalanbevan commented 2 weeks ago

In my case, final outputs still eventually get produced with or without ControlNet, but now my ControlNet nodes always reload and reprocess their controlling image every time I queue a prompt, which takes a lot of time. I'm on an SD1.5 setup.

Correction: everything reloads every time, not just ControlNet, including IPAdapter's temporary models. I'm guessing (75%) that this has something to do with the update's new RAM-handling adjustments, so it might have nothing to do with this issue's topic.

Since this is probably different, I made a separate issue here
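
To make that guess concrete: if the update changed how the model cache decides what it can keep resident, a check like the sketch below (illustrative guess, not ComfyUI's actual logic) flipping to False on every queue would produce exactly this reload-everything behavior.

    def should_keep_loaded_sketch(model_size, free_memory, headroom):
        # Illustrative only: a cache that keeps a model resident while
        # enough free RAM/VRAM remains. If an update raised the
        # headroom or changed how free_memory is measured, every model
        # could be evicted after each prompt and reloaded on the next.
        return free_memory - model_size > headroom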

feanorknd commented 2 weeks ago

Same for me... after a "git pull" today I always get OOM, even though I'm loading a minimal-memory Q2 Flux model. Yesterday everything was fine.

BlinkerHigh commented 2 weeks ago

I'm using the Q8 GGUF model.

fuhrriel commented 2 weeks ago

Not sure if it's related, but I'm running the latest commit 9230f65 and getting "Warning: Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding." every time.
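
For what it's worth, that warning describes a built-in fallback: when a full-resolution VAE decode doesn't fit, the decode is retried tile by tile so peak memory is bounded by the tile size rather than the whole image. A rough sketch of the idea (illustrative, assuming a decode() callable; ComfyUI's real tiled decoder also overlaps tiles and blends the seams):

    import torch

    def tiled_decode_sketch(decode, latent, tile=64):
        # Decode the latent in independent spatial tiles and stitch
        # the results, so peak VRAM scales with the tile size instead
        # of the full image.
        b, c, h, w = latent.shape
        rows = []
        for y in range(0, h, tile):
            cols = []
            for x in range(0, w, tile):
                cols.append(decode(latent[:, :, y:y + tile, x:x + tile]))
            rows.append(torch.cat(cols, dim=3))  # stitch along width
        return torch.cat(rows, dim=2)            # stitch along height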

Seeker1970 commented 2 weeks ago

I'm thinking it might be more related to the LoRA. Since the last update, I can no longer merge LoRAs into Flux either. The console window just prints "lora key not loaded" a bunch of times, then loads the model, then goes idle, and the model never saves.
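
For context, "lora key not loaded" is printed when a key in the LoRA file can't be matched to a weight in the loaded model, which would be consistent with a key-mapping change in the update. A hedged sketch of the pattern (map_key is a hypothetical stand-in for the real name-translation logic, not ComfyUI's actual code):

    def apply_lora_sketch(model_weights, lora_state_dict, map_key):
        # map_key: hypothetical helper translating a LoRA state-dict
        # key (e.g. "lora_unet_...") into the loaded model's weight
        # name. Keys that don't map to an existing weight are skipped,
        # producing the "lora key not loaded" lines in the console.
        for key, value in lora_state_dict.items():
            target = map_key(key)
            if target is None or target not in model_weights:
                print("lora key not loaded", key)
                continue
            # ...apply the low-rank update to model_weights[target]...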

feanorknd commented 2 weeks ago

> Same for me... after a "git pull" today I always get OOM, even though I'm loading a minimal-memory Q2 Flux model. Yesterday everything was fine.

It was not related to LoRAs or to which QX model was loaded... I tested without LoRAs and with Q2, Q4, Q5, etc., and always got OOM. That was yesterday; I don't know if there are fixes today.