lllyasviel / stable-diffusion-webui-forge


Odd VAE decoding behaviour when using fp16 flux + fp16 t5xxl #1768

Open · Hugs288 opened this issue 2 months ago

Hugs288 commented 2 months ago

When using fp16 flux + fp16 t5xxl, generation works just fine, but when it reaches the VAE decoding stage it allocates around 10 GB of my RAM, overflowing from RAM into the SSD and making the PC extremely unresponsive just for the VAE, and I have no idea why. The VAE is only about 200 MB, so it should have no problems.

Also, after the VAE has finally finished decoding, all models get unloaded from RAM and VRAM, so I have to load the models again to generate another image.

I don't believe this is a memory problem, since I have an RTX 4090 and 32 GB of RAM, and generation speed is perfectly fine.

With the exact same setup, except using nf4 flux instead of fp16, everything works perfectly fine.

Juqowel commented 2 months ago

What resolution are you trying to generate? Have you tried enabling Never OOM Integrated - Enabled for VAE (always tiled)?
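
For reference, here is a minimal sketch of why a tiled-VAE option can help with this kind of decode-time memory spike. Decoding the full latent at once allocates activation buffers for the entire image, which grows quickly with resolution; decoding tile by tile keeps peak memory roughly constant. This is not Forge's actual implementation: `decode_tiled`, `dummy_decode`, the tile sizes, and the 8x scale factor are illustrative assumptions, and real tiled decoders also blend overlapping tile edges, which is omitted here.

```python
import torch

SCALE = 8  # assumption: SD/Flux-style VAEs upscale latents 8x spatially

def decode_tiled(decode_fn, latents, tile=64, overlap=8):
    """Decode a latent [B, C, H, W] tile by tile to cap peak activation memory.

    Seam blending is omitted for brevity; real implementations cross-fade
    overlapping tiles to hide boundaries.
    """
    b, c, h, w = latents.shape
    out = torch.zeros(b, 3, h * SCALE, w * SCALE, dtype=latents.dtype)
    step = tile - overlap
    for y in range(0, h, step):
        for x in range(0, w, step):
            y1, x1 = min(y + tile, h), min(x + tile, w)
            tile_latent = latents[:, :, y:y1, x:x1]
            # Only this tile's decoder activations are alive at any one time.
            decoded = decode_fn(tile_latent)
            out[:, :, y * SCALE:y1 * SCALE, x * SCALE:x1 * SCALE] = decoded
    return out

# Hypothetical stand-in for a real VAE decoder: just maps latents to an
# upscaled 3-channel image so the sketch runs end to end.
def dummy_decode(z):
    rgb = z[:, :3]
    return torch.nn.functional.interpolate(rgb, scale_factor=SCALE, mode="nearest")

latents = torch.randn(1, 16, 128, 128)   # roughly a 1024x1024 image's latent
image = decode_tiled(dummy_decode, latents)
print(image.shape)                        # torch.Size([1, 3, 1024, 1024])
```

The trade-off is a small amount of extra compute from the overlapping regions in exchange for a much lower peak memory footprint, which is why the "always tiled" setting is worth trying when decode blows past available RAM/VRAM even though generation itself fits.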