comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
https://www.comfy.org/
GNU General Public License v3.0
50.45k stars 5.3k forks

flux-dev: prompt execution takes a long time #4561

Open torans opened 3 weeks ago

torans commented 3 weeks ago

Your question

Prompt executed in 287.69 seconds

Logs

got prompt
Using pytorch attention in VAE
Using pytorch attention in VAE
model weight dtype torch.float8_e4m3fn, manual cast: torch.bfloat16
model_type FLUX
/root/miniconda3/envs/comfy/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
clip missing: ['text_projection.weight']
Requested to load FluxClipModel_
Loading 1 new model
loaded completely 0.0 4777.53759765625 True
Requested to load Flux
Loading 1 new model
loaded completely 0.0 11350.048889160156 True
100%|██████████████████████████████████████████████████████████████████████| 20/20 [03:06<00:00,  9.31s/it]
Requested to load AutoencodingEngine
Loading 1 new model
loaded completely 0.0 159.87335777282715 True
Prompt executed in 287.69 seconds
got prompt
100%|██████████████████████████████████████████████████████████████████████| 20/20 [03:09<00:00,  9.48s/it]
Prompt executed in 194.50 seconds
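For context, the 20-step sampling loop accounts for most, but not all, of the reported time; the remainder is model loading/offloading, text encoding, and VAE decode. A quick breakdown from the timings in the log above:

```python
# Rough breakdown of where the 287.69 s goes, using the timings in the log.
steps = 20
sec_per_step = 9.31          # from the tqdm progress line
total = 287.69               # "Prompt executed in 287.69 seconds"

sampling = steps * sec_per_step
overhead = total - sampling  # model loading/offloading, text encoding, VAE decode
print(f"sampling: {sampling:.1f} s")
print(f"overhead: {overhead:.1f} s")
```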

Other

No response

ltdrdata commented 2 weeks ago

What is the size of your VRAM? And check if shared memory is turned off.
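To answer the VRAM question, you can query the device directly from PyTorch. A minimal sketch (the helper name `vram_report` is invented for illustration); note that on Windows, the CUDA sysmem fallback ("shared memory") can silently spill VRAM into much slower system RAM, which makes sampling look many times slower than it should:

```python
import torch

def vram_report(device: int = 0) -> str:
    """Return a human-readable VRAM summary, or a notice if no GPU is visible."""
    if not torch.cuda.is_available():
        return "no CUDA device visible"
    # mem_get_info returns (free, total) in bytes for the given device.
    free, total = torch.cuda.mem_get_info(device)
    name = torch.cuda.get_device_properties(device).name
    return f"{name}: {free / 1024**3:.2f} GiB free of {total / 1024**3:.2f} GiB"

print(vram_report())
```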

Woukim commented 2 weeks ago

@ltdrdata Hi, do you happen to know why FluxClipModel_ is reloaded every time the prompt is changed?

First of all, is it normal for it to be loaded at all when the DualClipLoader node is used? I'm not an expert on how Comfy or SD works in general, so I'd like to make sure this isn't a bug and that the Flux clip model is supposed to load.

Secondly, does it load and unload from VRAM every time because there isn't enough memory? Does the console line "loaded completely 0.0 5180.35888671875 True" mean it occupies about 5.1 GB? If so, I can see why it's unloading: my video card only has 16 GB, and 5.1 GB + 13.6 GB ≈ 18.8 GB.
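The arithmetic does work out the way the commenter suspects. A quick check, taking the "loaded completely" figure for the text encoders from the log and the approximate Flux size the commenter cites:

```python
# Sizes as reported by ComfyUI's "loaded completely" log lines, in MB.
t5_clip_mb = 5180.35888671875   # FluxClipModel_ (T5 + CLIP text encoders)
flux_mb = 13600.0               # approximate Flux diffusion model size

total_gb = (t5_clip_mb + flux_mb) / 1000
print(f"combined: {total_gb:.1f} GB")  # ~18.8 GB, more than a 16 GB card can hold
```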

(screenshot attached)

ltdrdata commented 2 weeks ago

> @ltdrdata Hi, do you happen to know why FluxClipModel_ is reloaded every time the prompt is changed?
>
> First of all, is it normal for it to be loaded at all when the DualClipLoader node is used? I'm not an expert on how Comfy or SD works in general, so I'd like to make sure this isn't a bug and that the Flux clip model is supposed to load.
>
> Secondly, does it load and unload from VRAM every time because there isn't enough memory? Does the console line "loaded completely 0.0 5180.35888671875 True" mean it occupies about 5.1 GB? If so, I can see why it's unloading: my video card only has 16 GB, and 5.1 GB + 13.6 GB ≈ 18.8 GB.

That's correct. It's because your VRAM doesn't have enough space to load both the Diffusion Model and the T5 model simultaneously.

The T5 model (FluxClipModel_) is unloaded from VRAM to make space before loading the Diffusion model for sampling.