lllyasviel / stable-diffusion-webui-forge

Patching LoRAs for KModel: 99%|████████████████████████████████████████████████████▎| 300/304 [01:40<00:01, 2.36it/s] #1256

Closed. Kr01iKs closed this issue 1 month ago.

Kr01iKs commented 3 months ago

And crashes.

moudahaddad14 commented 3 months ago

Same, but for me it crashes at 240/304.

lllyasviel commented 3 months ago

update and try again

killerciao commented 3 months ago

Updated, and it gives this error while using flux1-dev-bnb-nf4-v2.safetensors:

Patching LoRAs for KModel: 59%|█████▉    | 179/304 [00:02<00:02, 48.13it/s]
ERROR lora diffusion_model.double_blocks.17.txt_mlp.2.weight CUDA out of memory. Tried to allocate 144.00 MiB. GPU
Patching LoRA weights failed. Retrying by offload models.
Patching LoRAs for KModel: 100%|██████████| 304/304 [00:05<00:00, 51.52it/s]

If I instead use flux1-dev-fp8.safetensors, the LoRA loads fine.
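The OOM points at the weight-patching step itself: applying a LoRA to a quantized checkpoint means each affected weight is dequantized, gets the LoRA delta added, and is re-quantized, which requires a temporary higher-precision copy of the weight on the GPU. Below is a minimal sketch of that pattern, not Forge's actual code: dequantize_nf4 and quantize_nf4 are hypothetical stand-ins for the real quantization helpers, and the except branch mirrors the "Patching LoRA weights failed. Retrying by offloading models." fallback seen in the logs.

```python
import torch

def patch_lora_weight(weight_nf4, lora_up, lora_down, scale,
                      dequantize_nf4, quantize_nf4):
    """Apply W' = W + scale * (up @ down) to one NF4-quantized weight.

    dequantize_nf4 / quantize_nf4 are hypothetical stand-ins for the real
    quantization helpers; the point is the OOM-prone temporary copy.
    """
    try:
        # Needs a full fp16 copy of the weight on the GPU; this is what OOMs.
        w = dequantize_nf4(weight_nf4).to(torch.float16)
        w = w + scale * (lora_up.to(w) @ lora_down.to(w))
        return quantize_nf4(w)
    except torch.cuda.OutOfMemoryError:
        # Fallback corresponding to "Retrying by offloading models":
        # redo the same math on the CPU, then re-quantize.
        w = dequantize_nf4(weight_nf4).float().cpu()
        w = w + scale * (lora_up.float().cpu() @ lora_down.float().cpu())
        return quantize_nf4(w.to(torch.float16))
```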

lllyasviel commented 3 months ago

@killerciao does this eventually give you an image? If it does, then I will just use this method.

killerciao commented 3 months ago

> @killerciao does this eventually give you an image? If it does, then I will just use this method.

Yes, with flux1-dev-bnb-nf4-v2.safetensors the image is generated after the errors, however without the effects of the LoRA.

Dampfinchen commented 3 months ago

Is there a reason why this process is done at each generation? One time would be okay, but it does it each time after I click generate, making Loras pretty unusable for me.

iqddd commented 3 months ago

Still crashes.

Patching LoRAs for KModel:  44%|███████████████████████▎                             | 134/304 [00:03<00:03, 44.72it/s] ERROR lora diffusion_model.double_blocks.13.txt_mod.lin.weight Allocation on device
Patching LoRA weights failed. Retrying by offloading models.
Press any Key to Continue . . .

When this happens, other applications may crash or glitch. If 'Enabled for UNet (always maximize offload)' is checked, OOM occurs with RAM instead of VRAM. (I have 32 GB of RAM)

evelryu commented 3 months ago

I have been using Forge to generate Flux (NF4 version) images, but when I use LoRAs in the prompt my computer freezes. This is the error:

Patching LoRAs for KModel: 52% 918
Patching LoRA weights failed. Retrying by offloading models.
Patching LoRAs for KModel: 92% | 280/304 [00:23<00:02, 9.31it/s]
ERROR lora diffusion_model.single_blocks. weight Allocation on device
Patching LoRA weights failed. Retrying by offloading models.

The computer freezes after these lines. I can't even take a screenshot; here's a photo: https://imgur.com/a/A10HvRt

The LoRA is around 160 MB.

I have a 3060 12 GB and 32 GB of RAM.

Vinfamy-New commented 3 months ago

Same issue for me, can't use any LoRAs where it counts up to 304 (those that only count up to 76 work great though, fast patching too). The best I could get to was 270/304, then it stops counting, a "Connection timeout" popup appears in the browser, then "Press any key to continue . . .".

For the time being, does anyone know how to tell in advance whether a LoRA will count up to 76 and not 304, so that we know which ones are usable right now? File size is not an indicator: I had a 19 MB LoRA count up to 304, yet a 51 MB LoRA count up to 76 (see the sketch after this comment).

PS: Unfortunately, still the same problem after today's (19/8) update. I tried with flux1-dev-fp8 though and patching was successful, so it's only an issue with flux1-dev-bnb-nf4-v2.
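On telling the two cases apart in advance: the progress total presumably corresponds to the number of weights the LoRA targets, so counting the target modules in the .safetensors header gives a rough prediction without loading the model. A sketch assuming kohya-style lora_up.weight / lora_down.weight key names (other layouts use different suffixes, e.g. lora_A / lora_B):

```python
from safetensors import safe_open

def count_lora_targets(path: str) -> int:
    """Count unique target modules in a LoRA .safetensors file.

    Assumes kohya-style ".lora_up.weight" / ".lora_down.weight" key names;
    adjust the suffix for other layouts (e.g. ".lora_A.weight").
    """
    with safe_open(path, framework="pt", device="cpu") as f:
        keys = list(f.keys())
    return sum(1 for k in keys if k.endswith(".lora_up.weight"))

# A LoRA whose patching counts to ~304 should report far more target
# modules here than one that only counts to 76.
print(count_lora_targets("my_flux_lora.safetensors"))
```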

AcademiaSD commented 3 months ago

Same issue for me.

YNWALALALA commented 3 months ago

I also can't use the LoRAs that count to 304 (flux1-dev-bnb-nf4-v2).

JuiceWill commented 3 months ago

+1 on this issue: it crashes the Remote Desktop session and freezes the PC for me too.

Running on a 4090 with 24 GB of VRAM.

Samael-1976 commented 3 months ago

> I found a small, 18 MB Flux LoRA on Civitai. If I start with that LoRA first, patching fails at around step 278-288 out of 304 (same as always, no matter the size of that LoRA). But the image is still generated and the LoRA actually works; I use a face LoRA that the base checkpoint doesn't know, so I can be sure it worked. THEN I can generate a second image with another large, even 800 MB, LoRA. Suddenly even LoRA patching works fine all the way to the last step, 304, and everything works as it should. So it's not completely dead :)

I've tried your method, but with my 2060 and 12 GB of VRAM the patching gets to 250 of 304 and then it crashes.

iqddd commented 3 months ago

The latest updates still don't fix the process. On Windows 11 with 32 GB RAM and 16 GB VRAM it crashes due to OOM about midway through the process. Is there any way to reduce RAM and VRAM consumption? Or maybe there is a way to pre-convert the LoRA to the desired format on a system with more memory?
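On pre-converting on a machine with more memory: one common workaround (a hedged sketch, not something this thread confirms Forge supports) is to merge the LoRA into an fp16 checkpoint offline, so no patching is needed at generation time. This assumes kohya-style lora_up.weight / lora_down.weight / alpha keys whose target names map directly onto the checkpoint's weight names; real Flux LoRAs usually need a key-remapping step first.

```python
import torch
from safetensors.torch import load_file, save_file

def merge_lora_into_checkpoint(base_path, lora_path, out_path, scale=1.0):
    """Bake W' = W + scale * (alpha / rank) * (up @ down) into a checkpoint.

    Assumes "<target>.lora_up.weight" / "<target>.lora_down.weight" /
    "<target>.alpha" keys whose <target> matches a weight name in the base
    checkpoint; real Flux LoRAs usually need key remapping before this works.
    """
    base = load_file(base_path)   # dict[str, torch.Tensor]
    lora = load_file(lora_path)

    for key, up in lora.items():
        if not key.endswith(".lora_up.weight"):
            continue
        target = key[: -len(".lora_up.weight")]
        down = lora[target + ".lora_down.weight"]
        rank = down.shape[0]
        alpha = float(lora.get(target + ".alpha", torch.tensor(float(rank))))
        delta = (alpha / rank) * (up.float() @ down.float())
        merged = base[target + ".weight"].float() + scale * delta
        base[target + ".weight"] = merged.to(torch.float16).contiguous()

    save_file(base, out_path)
```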

YNWALALALA commented 3 months ago

An 18 MB LoRA still counts to 304. I can now use LoRAs of 36,530 KB, 47,489 KB, and 51,142 KB; these patch normally, because they only count to about 100. Larger LoRAs still have the problem of not being able to count to 304.

markdee3 commented 3 months ago

304 steps is where I get the issues, regardless of file size, even with an 18 MB LoRA.

Vinfamy-New commented 3 months ago

I found the solution: under "Diffusion in Low Bits" (center-top of your screen), change it to "Automatic (fp16 LoRA)". Patching LoRAs will get to 304/304 (100%) instantly! And yes, the LoRAs will then work, even with flux1-dev-bnb-nf4-v2 and flux1-schnell-bnb-nf4.

How is that not the default setting!?

Gravityhorse commented 3 months ago

> I found the solution: under "Diffusion in Low Bits" (center-top of your screen), change it to "Automatic (fp16 LoRA)". Patching LoRAs will get to 304/304 (100%) instantly! And yes, the LoRAs will then work, even with flux1-dev-bnb-nf4-v2 and flux1-schnell-bnb-nf4.
>
> How is that not the default setting!?

THIS IS THE SOLUTION