kijai / ComfyUI-CogVideoXWrapper


Merging rank 256 LoRA weights takes forever #253

Open shomerYu opened 2 days ago

shomerYu commented 2 days ago

Hi!

I'm using your workflow with LoRA dimension (orbit left). The "Merging rank 256 LoRA weights" step takes forever (I'm on an A100). Is there a way to speed up the process?

kijai commented 2 days ago

The node should have a choice of load device. Merging is slow on CPU but should not take long at all on GPU (main_device); it takes about 1-2 seconds for me on a 4090.

It's not necessary to fuse it unless you also want to use torch.compile. In that case you'd set the strength to (1 / rank), which would be about 0.0039 for rank 256.
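For reference, the arithmetic behind that strength value, plus a generic picture of what fusing a LoRA into a weight matrix means, can be sketched in plain Python. This is only an illustration and not the wrapper's actual code: `lora_strength_for_rank`, `matmul`, and `fuse_lora` are hypothetical helper names, and the fuse shown is the textbook `W' = W + strength * (up @ down)` form on tiny list-of-lists matrices.

```python
def lora_strength_for_rank(rank: int) -> float:
    """Strength suggested in the reply above: 1 / rank (hypothetical helper)."""
    if rank <= 0:
        raise ValueError("rank must be a positive integer")
    return 1.0 / rank


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fuse_lora(weight, lora_up, lora_down, strength):
    """Generic LoRA fuse (assumed textbook form): W' = W + strength * (up @ down)."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# 1 / 256 is exactly 0.00390625, which rounds to the 0.0039 quoted above.
strength = lora_strength_for_rank(256)
print(round(strength, 4))
```

On real model weights the matrix product would run as a tensor operation on whichever device the node is set to load on, which is presumably why the choice between CPU and GPU makes such a difference to the merge time.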