Chaoses-Ib / ComfyScript

A Python frontend and library for ComfyUI
MIT License

How to unload model? #21

Open lingondricka2 opened 7 months ago

lingondricka2 commented 7 months ago

Is there a way to unload a specified model from memory?

Chaoses-Ib commented 7 months ago

Are you using real mode? It's possible, though a bit cumbersome:

import comfy.model_management

model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
conditioning = CLIPTextEncode('text, watermark', clip)

print(comfy.model_management.current_loaded_models)
# [<comfy.model_management.LoadedModel object at 0x0000014C2287EF80>, <comfy.model_management.LoadedModel object at 0x0000014C2287EB00>]

comfy.model_management.unload_model_clones(model)
comfy.model_management.unload_model_clones(clip.patcher)
comfy.model_management.unload_model_clones(vae.patcher)
print(comfy.model_management.current_loaded_models)
# []

Or you can just unload all models:

comfy.model_management.unload_all_models()
lingondricka2 commented 7 months ago

Thank you, you are very helpful as usual. I switched to real mode and wrapped the code in with torch.inference_mode():, and it works great. Everything also seems faster under torch.inference_mode(), but maybe I'm wrong.

Also I assume you don't want me to close issues tagged with 'documentation' ?

Chaoses-Ib commented 7 months ago

Everything also seems faster under torch.inference_mode(), but maybe I'm wrong.

That's correct. Without inference mode, autograd-related features will be enabled and make inference slower. Inference mode is also enabled inside with Workflow().
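
For example, a minimal sketch of wrapping real mode node calls in inference mode (the checkpoint and prompt are just placeholders):

import torch

with torch.inference_mode():
    model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
    conditioning = CLIPTextEncode('text, watermark', clip)
    # ... sampling, decoding and saving go here ...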

Also I assume you don't want me to close issues tagged with 'documentation' ?

Yeah, keeping this open can help people with similar issues. It's also tagged with enhancement because some utility functions may be provided in future versions, mainly to avoid possible RAM (not VRAM) leaks when using Jupyter Notebook.

lingondricka2 commented 7 months ago

I noticed the workflow was not embedded in the resulting images when using real mode. Is that a bug?

Chaoses-Ib commented 7 months ago

Oh, I forgot to mention that in the docs. In real mode, nodes are executed directly and have no idea about the workflow, so SaveImage will not get any metadata to save. It's possible to save the script source automatically, but if the inputs are dynamic then it may not be a reproducible copy.

One way to solve this is "mix mode", i.e. run real nodes and virtual nodes at the same time, and pass the workflow generated by virtual nodes to SaveImage(). But that's a bit complex to implement.

Chaoses-Ib commented 7 months ago

Another way to workaround this is to add custom metadata, for example:

positive = 'beautiful scenery nature glass bottle landscape, , purple galaxy bottle,'
negative = 'text, watermark'
model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
latent = EmptyLatentImage(512, 512, 1)
latent = KSampler(model, 123, 20, 8, 'euler', 'normal', CLIPTextEncode(positive, clip), CLIPTextEncode(negative, clip), latent)
image = VAEDecode(latent, vae)
SaveImage(image, 'ComfyUI', extra_pnginfo={'myfield': {'positive': positive, 'negative': negative}})
from PIL import Image
image = Image.open(r'D:\ComfyUI\output\ComfyUI_00001_.png')
print(image.info['myfield'])
# {"positive": "beautiful scenery nature glass bottle landscape, , purple galaxy bottle,", "negative": "text, watermark"}
lingondricka2 commented 7 months ago

Not relevant to ComfyScript but..

I made a custom node that unloads unused models, but it's not working. Is this because ComfyUI uses the tail-end recursion method?

(screenshot: Untitled)

Chaoses-Ib commented 7 months ago

Recursive execution doesn't affect the execution order of nodes. As for your problem, it's probably because the model is not loaded into VRAM when ModelUnLoader is called: comfy.model_management only manages models in VRAM, so if a model is only in RAM, nothing will happen. I wonder why you want to do this though, since ComfyUI should unload previous models if there isn't enough free VRAM.

lingondricka2 commented 7 months ago

With 4 checkpoints loaded, the KSampler is incredibly slow (20 seconds per iteration) and sometimes crashes the computer, compared to 5 iterations per second with 3 checkpoints loaded.

lingondricka2 commented 7 months ago

I got good speed with 4 checkpoints using my script with unload_model_clones(model) in real mode, but the missing metadata in the images is a problem and I'm searching for the easiest solution :)

Chaoses-Ib commented 7 months ago

Can you provide the script, the workflow json and the code of your ModelUnLoader?

Chaoses-Ib commented 7 months ago

One way to solve this is "mix mode", i.e. run real nodes and virtual nodes at the same time, and pass the workflow generated by virtual nodes to SaveImage(). But that's a bit complex to implement.

After giving up on not wrapping real mode outputs, implementing this became feasible. Now workflows will be automatically tracked and saved to images. But note that changes to inputs made by user code instead of nodes will not be tracked.

To update, use git pull and python -m pip install -e .[default], since there is one new dependency.

Chaoses-Ib commented 7 months ago

By the way, one thing I found while making this is that there is actually no fundamental difference between virtual mode and real mode. They may be merged back into one universe mode in the future, though runtime.real and runtime.real.nodes may be kept to maintain compatibility.

lingondricka2 commented 7 months ago

I updated. There's an error now if torch is imported before:

from comfy_script.runtime.real import *
load()
from comfy_script.runtime.real.nodes import *

Probably related to http://github.com/pytorch/pytorch/issues/104801

It's not an issue for me, but I thought I'd mention it.

Chaoses-Ib commented 7 months ago

This actually should be a fix rather than a bug. The new version tries to simulate the real ComfyUI environment as far as possible, both to stay compatible with custom nodes and to reduce the maintenance cost. One of those things is setting PyTorch environment variables, which was missed in the old version. Though I'm not sure exactly what they are used for, it's better to follow this behavior. Thanks for your report, I'll add it to the docs.

If you want, you can disable it with:

load(args=ComfyUIArgs('--disable-cuda-malloc'))

Or you can use the old simulation mechanism:

import comfy_script.runtime as runtime
runtime.start_comfyui(no_server=True, autonomy=True)
Chaoses-Ib commented 7 months ago

Now if torch is imported before load(), the runtime will automatically disable cuda malloc and print a warning.
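
So an import order like the following should now work without extra arguments (a minimal sketch):

import torch  # imported before load()

from comfy_script.runtime.real import *
load()  # cuda malloc is disabled automatically and a warning is printed
from comfy_script.runtime.real.nodes import *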

lingondricka2 commented 7 months ago

I got good speed with 4 checkpoints using my script with unload_model_clones(model) in real mode

I take that back, I still have problems. When RAM usage reaches 100% my computer enters slideshow mode and sometimes crashes. I've been experimenting with ComfyUI command args (--highvram, --gpu-only etc.) and calling functions in comfy.model_management, but the thing that seems to be working (I've generated 200 images in a loop with no problems) is setting the reference count to 0 by using del. Example:

import random
import comfy.model_management

model1, clip, vae = CheckpointLoaderSimple(checkpoint1)
model2, clip, vae = CheckpointLoaderSimple(checkpoint2)
model = ModelMergeBlocks(model1, model2, random.uniform(0, 1), random.uniform(0, 1), random.uniform(0, 1))
del model1
del model2
comfy.model_management.soft_empty_cache()  # for good measure, dunno if it does anything

Stress tested by merging 6 checkpoints, 4 LoRAs, embeddings etc. with 50 tabs open in Chrome and streaming a movie lol

Chaoses-Ib commented 7 months ago

Oh, sorry I didn't mention that before. Models are Python objects, and Python is a garbage-collected language: the only way to free memory used by objects is to remove all references to them. Normally this is done implicitly, for example when the variable scope is exited or the variable is assigned another object. del is needed when free memory is low and implicitly freeing objects is not enough.

However, removing all references is necessary but not sufficient to free the memory. gc.collect() can be used to make sure the garbage collection actually happens. And sys.getrefcount() can be used to check whether the reference count is 1 before del, to make sure gc will actually collect it. (Another option is weakref.ref().)
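
For example, a minimal sketch (the checkpoint is just a placeholder; note that sys.getrefcount() also counts its own temporary argument, so a value of 2 means only one real reference is left):

import gc
import sys

model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
# ... run the workflow ...

print(sys.getrefcount(model))  # 2 means `model` is the only remaining reference

del model, clip, vae
gc.collect()  # make sure the collection actually happens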

soft_empty_cache() seems to be related only to VRAM. I'm not sure what impact it has on performance.

By the way, as I mentioned above ("some utility functions may be provided in future versions, mainly to avoid possible RAM (not VRAM) leaks when using Jupyter Notebook"), in Jupyter Notebook just del may not remove all the references to your variables, because IPython may hold some hidden references to them. Be careful when the script is used in Jupyter Notebook.
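
For instance, in a notebook (a hedged sketch; %xdel and %reset -f are standard IPython magics, not ComfyScript APIs):

# cell 1
model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
model  # displayed output, so IPython keeps an extra reference in its Out cache and in _

# cell 2
del model  # the object may still be reachable through IPython's cache, so RAM is not freed

Using %xdel model instead of a plain del asks IPython to also clear its cached references, and %reset -f clears the whole user namespace.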

Chaoses-Ib commented 7 months ago

I came up with an idea to ease these problems: the runtime can replace all returned models with weak ref proxies and store the only strong refs in an internal container. This way, the runtime can provide methods like model.free() to reliably free the model both in RAM and VRAM. This also solves the problem with Jupyter Notebook. And Workflow can support features like with Workflow(free_models=True) to make sure no model leaks to memory when the workflow is done.

The only problem is how the runtime knows which types are models and which aren't. ModelPatcher may be checked as a marker of models.
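
A minimal sketch of the idea (ModelRegistry, track() and free_all() are hypothetical names, not an existing ComfyScript API):

import gc
import weakref

import comfy.model_management
import comfy.model_patcher

class ModelRegistry:
    # Hypothetical internal container: it holds the only strong references,
    # while user code only ever sees weak proxies.
    def __init__(self):
        self._models = []

    def track(self, obj):
        # Only ModelPatcher instances are treated as models (the marker mentioned above).
        if not isinstance(obj, comfy.model_patcher.ModelPatcher):
            return obj
        self._models.append(obj)
        return weakref.proxy(obj)

    def free_all(self):
        # Unload from VRAM first, then drop the strong references so RAM can be reclaimed.
        for model in self._models:
            comfy.model_management.unload_model_clones(model)
        self._models.clear()
        gc.collect()
        comfy.model_management.soft_empty_cache()

Something like with Workflow(free_models=True) could then simply call free_all() when the workflow exits.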

danyow-cheung commented 7 months ago

Are you using real mode? It's possible, though a bit cumbersome:

import comfy.model_management

model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
conditioning = CLIPTextEncode('text, watermark', clip)

print(comfy.model_management.current_loaded_models)
# [<comfy.model_management.LoadedModel object at 0x0000014C2287EF80>, <comfy.model_management.LoadedModel object at 0x0000014C2287EB00>]

comfy.model_management.unload_model_clones(model)
comfy.model_management.unload_model_clones(clip.patcher)
comfy.model_management.unload_model_clones(vae.patcher)
print(comfy.model_management.current_loaded_models)
# []

Or you can just unload all models:

comfy.model_management.unload_all_models()

Thanks for the work. Besides, I want to know the way to unload the LoraLoader and VAELoader.

Chaoses-Ib commented 7 months ago

The unloading method depends on the returned type, not the loader. LoraLoader and VAELoader return the same types as returned by CheckpointLoaderSimple, so unload_model_clones(model) will also work.

For example:

model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
print(model, clip, vae)
# <comfy.model_patcher.ModelPatcher object at 0x0000027EA7042AD0> <comfy.sd.CLIP object at 0x0000027EA7040CA0> <comfy.sd.VAE object at 0x0000027EA9A63730>

model, clip = LoraLoader(model, clip, Loras.sd14_sliders_age_sd14)
vae = VAELoader(VAEs.vae_ft_mse_840000_ema_pruned)
print(model, clip, vae)
# <comfy.model_patcher.ModelPatcher object at 0x0000027EA9E34CD0> <comfy.sd.CLIP object at 0x0000027EA9E36A40> <comfy.sd.VAE object at 0x0000027EA70431F0>

conditioning = CLIPTextEncode('text, watermark', clip)
latent = KSampler(model, positive=conditioning, negative=conditioning, latent_image=EmptyLatentImage())
image = VAEDecode(latent, vae)

print(comfy.model_management.current_loaded_models)
# [<comfy.model_management.LoadedModel object at 0x0000027EA9A638E0>, <comfy.model_management.LoadedModel object at 0x0000027EA7041330>, <comfy.model_management.LoadedModel object at 0x0000027EA70419F0>]

comfy.model_management.unload_model_clones(model)
comfy.model_management.unload_model_clones(clip.patcher)
comfy.model_management.unload_model_clones(vae.patcher)
comfy.model_management.soft_empty_cache()

print(comfy.model_management.current_loaded_models)
# []
danyow-cheung commented 7 months ago

Great! Let me give you a star lol

aleeepp commented 4 months ago

Is there a way to unload a model from RAM from nodes? I am using different models in the same workflow and it crashes due to RAM!

thank you!!

Chaoses-Ib commented 4 months ago

@aleeepp

Models are Python objects, and Python is a garbage-collected language: the only way to free memory used by objects is to remove all references to them. Normally this is done implicitly, for example when the variable scope is exited or the variable is assigned another object. del is needed when free memory is low and implicitly freeing objects is not enough.

However, removing all references is necessary but not sufficient to free the memory. gc.collect() can be used to make sure the garbage collection actually happens. And sys.getrefcount() can be used to check whether the reference count is 1 before del, to make sure gc will actually collect it. (Another option is weakref.ref().)

For example:

import gc

model, clip, vae = CheckpointLoaderSimple(Checkpoints.v1_5_pruned_emaonly)
...
del model, clip, vae
gc.collect()