lingondricka2 opened 9 months ago
Are you using real mode? It's possible, though a bit cumbersome:
import comfy.model_management
model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
conditioning = CLIPTextEncode('text, watermark', clip)
print(comfy.model_management.current_loaded_models)
# [<comfy.model_management.LoadedModel object at 0x0000014C2287EF80>, <comfy.model_management.LoadedModel object at 0x0000014C2287EB00>]
comfy.model_management.unload_model_clones(model)
comfy.model_management.unload_model_clones(clip.patcher)
comfy.model_management.unload_model_clones(vae.patcher)
print(comfy.model_management.current_loaded_models)
# []
Or you can just unload all models:
comfy.model_management.unload_all_models()
Thank you, you are very helpful as usual. I switched to real mode and wrapped everything in with torch.inference_mode(), and it works great. Everything also seems faster with torch.inference_mode(), but maybe I'm wrong.
Also, I assume you don't want me to close issues tagged with 'documentation'?
Everything also seems faster with torch.inference_mode(), but maybe I'm wrong.
That's correct. Without inference mode, autograd-related features stay enabled and make inference slower. Inference mode is also enabled inside with Workflow().
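For reference, a minimal sketch of what wrapping real mode calls in inference mode might look like (reusing the loader and prompt from the example above):

import torch

# Disable autograd bookkeeping for everything executed inside the block
with torch.inference_mode():
    model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
    conditioning = CLIPTextEncode('text, watermark', clip)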
Also, I assume you don't want me to close issues tagged with 'documentation'?
Yeah, keeping this open can help people having similar issues. It's also tagged with enhancement because some utility functions may be provided in future versions, mainly to avoid possible RAM (not VRAM) leaks when using Jupyter Notebook.
I noticed the workflow was not embedded in the resulting images when using real mode. Is that a bug?
Oh, I forgot to mention that in the docs. In real mode, nodes are executed directly and have no idea about the workflow, so SaveImage will not get any metadata to save. It's possible to save the script source automatically, but if the inputs are dynamic then it may not be a reproducible copy.
One way to solve this is "mix mode", i.e. running real nodes and virtual nodes at the same time and passing the workflow generated by the virtual nodes to SaveImage(). But that's a bit complex to implement.
Another way to work around this is to add custom metadata, for example:
positive = 'beautiful scenery nature glass bottle landscape, , purple galaxy bottle,'
negative = 'text, watermark'
model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
latent = EmptyLatentImage(512, 512, 1)
latent = KSampler(model, 123, 20, 8, 'euler', 'normal', CLIPTextEncode(positive, clip), CLIPTextEncode(negative, clip), latent)
image = VAEDecode(latent, vae)
SaveImage(image, 'ComfyUI', extra_pnginfo={'myfield': {'positive': positive, 'negative': negative}})
from PIL import Image
image = Image.open(r'D:\ComfyUI\output\ComfyUI_00001_.png')
print(image.info['myfield'])
# {"positive": "beautiful scenery nature glass bottle landscape, , purple galaxy bottle,", "negative": "text, watermark"}
Not relevant to ComfyScript, but...
I made a custom node that unloads unused models, but it's not working. Is this because ComfyUI uses a tail-recursive execution method?
Recursive execution doesn't affect the execution order of nodes. About your problem, it's probably because the model is not loaded into VRAM when ModelUnLoader is called. comfy.model_management only manages models in VRAM; if a model is only in RAM then nothing will happen. I wonder why you want to do this, since ComfyUI should unload previous models when there isn't enough free VRAM.
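For what it's worth, such a node could first check whether the model is actually loaded into VRAM. This is only a hypothetical sketch relying on comfy.model_management internals (the .model attribute of LoadedModel is an assumption here):

import comfy.model_management

def is_in_vram(model_patcher):
    # current_loaded_models holds LoadedModel wrappers; each one is assumed
    # to keep the original ModelPatcher in its .model attribute.
    return any(lm.model is model_patcher
               for lm in comfy.model_management.current_loaded_models)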
With 4 checkpoints loaded the KSampler is incredibly slow (20 seconds per iteration) and sometimes crashes the computer, compared to 5 iterations per second with 3 checkpoints loaded.
I got good speed with 4 checkpoints using my script with unload_model_clones(model) in real mode, but no metadata in the image is a problem and I'm searching for the easiest solution :)
Can you provide the script, the workflow JSON and the code of your ModelUnLoader?
One way to solve this is "mix mode", i.e. running real nodes and virtual nodes at the same time and passing the workflow generated by the virtual nodes to SaveImage(). But that's a bit complex to implement.
After giving up on leaving real mode outputs unwrapped, it became feasible to implement this. Now workflows are automatically tracked and saved to images. Note, however, that changes to inputs made by user code instead of nodes will not be tracked.
To update, run git pull and then python -m pip install -e .[default], since there is one new dependency.
By the way, one thing I found while making this is that there is actually no fundamental difference between virtual mode and real mode. They may be merged back into a single mode in the future, though runtime.real and runtime.real.nodes may be kept to maintain compatibility.
I updated; there's an error now if import torch comes before:
from comfy_script.runtime.real import *
load()
from comfy_script.runtime.real.nodes import *
Probably related to http://github.com/pytorch/pytorch/issues/104801
It's not an issue for me, but I thought I'd mention it.
This should actually be a fix rather than a bug. The new version tries to simulate the real ComfyUI environment as far as possible, both to stay compatible with custom nodes and to reduce maintenance cost. One of those things is setting PyTorch environment variables, which was missed in the old version. Though I'm not sure exactly what they are used for, it's better to follow this behavior. Thanks for your report, I'll add it to the docs.
If you want, you can disable it with:
load(args=ComfyUIArgs('--disable-cuda-malloc'))
Or you can use the old simulation mechanism:
import comfy_script.runtime as runtime
runtime.start_comfyui(no_server=True, autonomy=True)
Now if torch is imported before load(), the runtime will automatically disable cuda malloc and print a warning.
I got good speed with 4 checkpoints using my script with unload_model_clones(model) in real mode
I take that back, I still have problems. When RAM usage reaches 100% my computer enters slideshow mode and sometimes crashes. I've been experimenting with ComfyUI command-line args (--highvram, --gpu-only, etc.) and calling functions in comfy.model_management, but the thing that seems to be working (I've generated 200 images in a loop with no problems) is setting the reference count to 0 by using del. Example:
import random
import comfy.model_management

model1, clip, vae = CheckpointLoaderSimple(checkpoint1)
model2, clip, vae = CheckpointLoaderSimple(checkpoint2)
model = ModelMergeBlocks(model1, model2, random.uniform(0, 1), random.uniform(0, 1), random.uniform(0, 1))
del model1
del model2
comfy.model_management.soft_empty_cache() # for good measure, dunno if it does anything
Stress tested by merging 6 checkpoints, 4 LoRAs, embeddings, etc. with 50 tabs open in Chrome and streaming a movie, lol.
Oh, sorry I didn't mention that before. Models are Python objects, and Python is a garbage-collected language; the only way to free memory used by objects is to remove all references to them. Normally this happens implicitly, for example when the variable scope is exited or the variable is assigned another object. del is needed when free memory is low and implicit freeing is not enough.
However, removing all references is necessary but not sufficient to free the memory. gc.collect() can be used to make sure the garbage collection actually happens. And sys.getrefcount() can be used to check whether the reference count is 1 before del, to make sure gc will actually collect it. (Another option is weakref.ref().)
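For example, a minimal sketch of that check (assuming model was returned by a loader node; sys.getrefcount() reports one extra reference for the argument it is called with, so a value of 2 means model is the only remaining reference):

import gc
import sys

if sys.getrefcount(model) == 2:  # 2 = the variable itself + the call's temporary reference
    del model
    gc.collect()  # make sure the collection actually happens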
soft_empty_cache() seems to be related only to VRAM. I'm not sure what impact it has on performance.
By the way, as I mentioned in "some utility functions may be provided in future versions, mainly to avoid possible RAM (not VRAM) leaks when using Jupyter Notebook": in Jupyter Notebook, just del may not remove all the references to your variables, because IPython may hold some hidden references to them. Be careful when the script is used in Jupyter Notebook.
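For instance, one known source of such hidden references is IPython's output caching (Out, _, __, ___). If that is what keeps a model alive, clearing the output history may help:

# In an IPython/Jupyter cell: clear the output history cache, which can
# hold extra references to previously displayed objects.
%reset -f out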
I came up with an idea to ease these problems: the runtime can replace all returned models with weak reference proxies and store the only strong reference in an internal container. This way, the runtime can provide methods like model.free() to reliably free the model both in RAM and VRAM. This also solves the problem with Jupyter Notebook. And Workflow can support features like with Workflow(free_models=True) to make sure no model is leaked to memory when the workflow is done.
The only problem is how the runtime knows which types are models and which aren't. ModelPatcher may be checked as a sign of models.
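A minimal sketch of that idea (ModelHandle and free() are hypothetical names, not an actual ComfyScript API; a small handle class is used instead of a bare weakref.proxy so that a free() method can be attached):

import weakref

class ModelHandle:
    # The only strong references live in this registry; user code gets handles.
    _registry = {}

    def __init__(self, model):
        self._key = id(model)
        ModelHandle._registry[self._key] = model
        self._proxy = weakref.proxy(model)

    def __getattr__(self, name):
        # Forward attribute access to the underlying model; raises
        # ReferenceError if the model has already been freed.
        return getattr(self._proxy, name)

    def free(self):
        # Drop the only strong reference so the model can be garbage collected.
        ModelHandle._registry.pop(self._key, None)

The runtime would then only need a way to decide which outputs to wrap, e.g. by checking isinstance(x, comfy.model_patcher.ModelPatcher) as suggested above.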
Are you using real mode? It's possible, though a bit cumbersome:
import comfy.model_management
model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
conditioning = CLIPTextEncode('text, watermark', clip)
print(comfy.model_management.current_loaded_models)
# [<comfy.model_management.LoadedModel object at 0x0000014C2287EF80>, <comfy.model_management.LoadedModel object at 0x0000014C2287EB00>]
comfy.model_management.unload_model_clones(model)
comfy.model_management.unload_model_clones(clip.patcher)
comfy.model_management.unload_model_clones(vae.patcher)
print(comfy.model_management.current_loaded_models)
# []
Or you can just unload all models:
comfy.model_management.unload_all_models()
Thanks for the work. Besides, I want to know how to unload the LoRA loader and VAE loader.
The unloading method depends on the returned type, not the loader. LoraLoader and VAELoader return the same types as returned by CheckpointLoaderSimple, so unload_model_clones(model) will also work.
For example:
import comfy.model_management

model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
print(model, clip, vae)
# <comfy.model_patcher.ModelPatcher object at 0x0000027EA7042AD0> <comfy.sd.CLIP object at 0x0000027EA7040CA0> <comfy.sd.VAE object at 0x0000027EA9A63730>
model, clip = LoraLoader(model, clip, Loras.sd14_sliders_age_sd14)
vae = VAELoader(VAEs.vae_ft_mse_840000_ema_pruned)
print(model, clip, vae)
# <comfy.model_patcher.ModelPatcher object at 0x0000027EA9E34CD0> <comfy.sd.CLIP object at 0x0000027EA9E36A40> <comfy.sd.VAE object at 0x0000027EA70431F0>
conditioning = CLIPTextEncode('text, watermark', clip)
latent = KSampler(model, positive=conditioning, negative=conditioning, latent_image=EmptyLatentImage())
image = VAEDecode(latent, vae)
print(comfy.model_management.current_loaded_models)
# [<comfy.model_management.LoadedModel object at 0x0000027EA9A638E0>, <comfy.model_management.LoadedModel object at 0x0000027EA7041330>, <comfy.model_management.LoadedModel object at 0x0000027EA70419F0>]
comfy.model_management.unload_model_clones(model)
comfy.model_management.unload_model_clones(clip.patcher)
comfy.model_management.unload_model_clones(vae.patcher)
comfy.model_management.soft_empty_cache()
print(comfy.model_management.current_loaded_models)
# []
Great! let me give you a star lol
Is there a way to unload a model from RAM from nodes? I am using different models in the same workflow and it crashes due to RAM!
Thank you!!
@aleeepp
Models are Python objects, and Python is a garbage-collected language; the only way to free memory used by objects is to remove all references to them. Normally this happens implicitly, for example when the variable scope is exited or the variable is assigned another object. del is needed when free memory is low and implicit freeing is not enough.
However, removing all references is necessary but not sufficient to free the memory. gc.collect() can be used to make sure the garbage collection actually happens. And sys.getrefcount() can be used to check whether the reference count is 1 before del, to make sure gc will actually collect it. (Another option is weakref.ref().)
For example:
import gc
model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
...
del model, clip, vae
gc.collect()
Is there a way to unload a specified model from memory?