Extreme slowness when running lora in flux bnb nf4 v2

lllyasviel / stable-diffusion-webui-forge

GNU Affero General Public License v3.0

Extreme slowness when running lora in flux bnb nf4 v2 #1296

Open elyzionz opened 4 weeks ago

elyzionz commented 4 weeks ago

When I add a LoRA, the inference time increases dramatically, but sometimes it drops back to normal speed with the same LoRA. What could be causing this?

Yesterday it took more than 15 minutes to process a single image. After I generated several images, the inference time dropped to 55 seconds, which is the normal speed without a LoRA. But when I restarted Forge, it went back up to more than 15 minutes per image with the same LoRA.

lllyasviel commented 4 weeks ago

do not set GPU weight to max

https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1181

and, before trying Async or Shared swap, use default ones first

elyzionz commented 4 weeks ago

do not set GPU weight to max

1181

and, before trying Async or Shared swap, use default ones first

I reduced the GPU weight by 1.5 GB but I still have the same problem. The strange thing is that after many attempts the inference time drops to normal speed with the same LoRA.

Here are the console times and my configuration

I tried using 10 steps at 512x512 resolution.

My hardware is an RTX 3060 with 12 GB VRAM and 32 GB of RAM.

adde88 commented 4 weeks ago

I had a go with Flux today. I have a Ryzen 5 3600 6-core CPU, 32 GB RAM, and an RTX 2060 with 6 GB VRAM, and the model I chose was "FLUX.1-schnell-dev-merged-fp8-4step". The model is approx. 22 GB, and I was blown away by how fast it was: 4-6 seconds on average for a 584x1040 image with only 4 steps. And the output was above anything I've created before!

Screenshot just to show; mind you, the time here included loading the model for the first time. After that it goes insanely fast. But some commits added later today broke this process with a strange error, so I had to go back to commit d7151b4dcdafaf9e96e4f0c53b4bc94f6b0e92a7.

elyzionz commented 3 weeks ago

I solved my problem by enabling Never OOM integrated

rlewisfr commented 3 weeks ago

I solved my problem by enabling Never OOM integrated

Thank you! I was having the same issue and couldn't figure out why some of my LoRA runs were 90+ s/it.
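
For anyone comparing the numbers in this thread, here is a minimal sketch for converting between total generation time and seconds per iteration. It assumes the 15-minute and 55-second runs both used the 10 steps mentioned above, which the thread does not confirm, and it ignores model loading and VAE decode time. The function names are made up for illustration and are not part of Forge.

```python
def seconds_per_iteration(total_seconds: float, steps: int) -> float:
    """Average time per sampling step, ignoring load and decode overhead."""
    if steps <= 0:
        raise ValueError("steps must be positive")
    return total_seconds / steps


def slowdown(slow_seconds: float, normal_seconds: float) -> float:
    """How many times slower the slow run is than the normal run."""
    if normal_seconds <= 0:
        raise ValueError("normal_seconds must be positive")
    return slow_seconds / normal_seconds


if __name__ == "__main__":
    steps = 10               # assumption: both runs used 10 steps
    slow_run = 15 * 60       # "more than 15 minutes", taken as a lower bound
    normal_run = 55          # "55 seconds"

    print(f"slow run:   {seconds_per_iteration(slow_run, steps):.1f} s/it")
    print(f"normal run: {seconds_per_iteration(normal_run, steps):.1f} s/it")
    print(f"slowdown:   {slowdown(slow_run, normal_run):.1f}x")
```

Under that assumption the slow runs work out to roughly 90 s/it, in the same range as the 90+ s/it figure above, while the normal runs are around 5.5 s/it.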