lllyasviel / stable-diffusion-webui-forge

GNU Affero General Public License v3.0

Patching LoRAs for KModel slow with Automatic (fp16 LoRA). #1569

Open ZeroCool22 opened 2 months ago

ZeroCool22 commented 2 months ago

[screenshot: Screenshot_2]

Patching LoRA by precomputing model weights.
Patching LoRAs for KModel:  94%|█████████████████████████████████████████████████▊   | 286/304 [01:16<00:02,  6.48it/s]ERROR lora diffusion_model.single_blocks.32.linear1.weight Allocation on device
Patching LoRA weights failed. Retrying by offloading models.
Patching LoRAs for KModel: 100%|█████████████████████████████████████████████████████| 304/304 [01:58<00:00,  2.56it/s]
LoRA patching has taken 118.75 seconds
Moving model(s) has taken 130.87 seconds
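For context on what the "Patching LoRAs for KModel" step is doing: merging a LoRA into a checkpoint means computing W' = W + α·(up @ down) for every targeted layer (304 of them in the log above), which is why it can exhaust VRAM on a large model. A minimal numpy sketch of that merge, with toy shapes and made-up names (illustrative only, not Forge's actual code):

```python
import numpy as np

def merge_lora(weight, up, down, alpha=1.0):
    # LoRA merge: W' = W + alpha * (up @ down),
    # where up is (out_features, rank) and down is (rank, in_features).
    # The full-size delta (up @ down) is what needs extra memory per layer.
    return weight + alpha * (up @ down)

# Toy rank-2 adapter on a 4x4 layer.
w = np.zeros((4, 4))
up = np.ones((4, 2))
down = np.ones((2, 4))
merged = merge_lora(w, up, down, alpha=0.5)  # every entry becomes 1.0
```

When the delta for a layer cannot be allocated on the GPU (the "Allocation on device" error above), Forge retries by offloading, i.e. doing the merge with the tensors moved off the GPU, which is far slower.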
kalle07 commented 2 months ago

I also have 16 GB; this checkpoint works fine for me: https://civitai.com/models/638187?modelVersionId=721627

Maybe try choosing "Diffusion in Low Bits" at the top: not Automatic, but 16-bit!

ZeroCool22 commented 2 months ago

> Maybe try choosing "Diffusion in Low Bits" at the top: not Automatic, but 16-bit!

It's already selected; check the image again.

kalle07 commented 2 months ago

I said NOT Automatic. Try all three options in 16-bit; maybe one of them works with the checkpoint above... but anyway, FLUX is still not that stable...

ZeroCool22 commented 2 months ago

[screenshot: Screenshot_3]

@lllyasviel

Why is the patching so damn slow?

Isn't the patching supposed to be almost instant if we use Automatic (fp16 LoRA)?

kalle07 commented 2 months ago

Two weeks ago it worked for me when I reduced "GPU Weight" in the top line to 7000 MB. GGUF and LoRA aren't very well implemented yet... BE PATIENT! Anyway, FLUX isn't that good at the moment; let's say it's only good in some small areas.

ZeroCool22 commented 2 months ago

> Two weeks ago it worked for me when I reduced "GPU Weight" in the top line to 7000 MB. GGUF and LoRA aren't very well implemented yet... BE PATIENT! Anyway, FLUX isn't that good at the moment; let's say it's only good in some small areas.

I understand, but I think FLUX is amazing. Inpainting gives the best results I have seen so far, and the best part is that you are not forced to use a minimum resolution of 1024x1024 (neither in inpainting nor in normal txt2img) like SDXL requires. It's amazing that you can get perfect results at any resolution. That makes things so much faster.

ZeroCool22 commented 2 months ago

> reduce "GPU Weight" in the top line to 7000 MB

But by doing that, you aren't taking advantage of all your VRAM (I don't know what GPU you have).

kalle07 commented 2 months ago

A 16 GB RTX. If you read the CMD lines, it tries to free/clean the VRAM... for whatever reason... and if that fails, the LoRA patching is slow... but that was two weeks ago... now, with the checkpoint I mentioned, everything is OK.

ZeroCool22 commented 2 months ago

> A 16 GB RTX. If you read the CMD lines, it tries to free/clean the VRAM... for whatever reason... and if that fails, the LoRA patching is slow... but that was two weeks ago... now, with the checkpoint I mentioned, everything is OK.

It shouldn't matter what checkpoint you use.

Anyway, it now seems to work fine again...