Closed liusida closed 1 month ago
Hey! Wow, that does look like a lot! Personally I've built on top of Hugging Face's diffusers API, and they provide several options to lower VRAM consumption, such as `pipe.enable_model_cpu_offload()` and the like. I can run SDXL on an 8GB GPU thanks to those kinds of flags!
By the way, you can also offload some of the work to system RAM using DeepSpeed, but I found that library rather difficult to install and haven't investigated it much yet.
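For reference, here's roughly what that setup looks like with diffusers. This is a sketch, not a tested recipe: the model ID and prompt are placeholders, and the exact methods available depend on your diffusers version.

```python
# Sketch: lowering VRAM usage with Hugging Face diffusers.
# Assumes diffusers, torch, and accelerate are installed and a CUDA GPU is available.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,  # half precision roughly halves weight memory
)

# Keep whole sub-models (text encoders, UNet, VAE) in RAM and move each one
# to the GPU only while it is actually being used:
pipe.enable_model_cpu_offload()

# Even more aggressive (slower, lowest VRAM): offload at the submodule level.
# pipe.enable_sequential_cpu_offload()

# Decode the VAE in slices to avoid a large memory peak at the end:
pipe.enable_vae_slicing()

image = pipe("a watercolor fox", num_inference_steps=20).images[0]
```

With `enable_model_cpu_offload()` the peak VRAM is roughly the size of the largest single sub-model rather than the whole pipeline, which is what makes SDXL fit on an 8GB card.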
> I can run SDXL on an 8GB GPU thanks to those kinds of flags!
That's promising~!
It is! I was planning to create a diffusers notebook sometime this week to keep track of all my experiments on this subject; I'll share the repo with you if you're interested!
Sure, please share it with me!
P.S. I am going to Europe for a family trip and will be back to coding in about a month.
Okay, have fun!! I'll keep you posted when I've started the notebook; that gives me some time to work on it before you're back ^^'
When I looked at RunwayML's Stable Diffusion (the official implementation, right?), it takes 10GB of VRAM to run a plain SD1.5. Comfy can do this with only 1GB of VRAM by loading only the necessary models into VRAM and keeping the others in RAM.
Can diffusers manage memory in a similar way to reduce VRAM requirements? I am not very familiar with diffusers and am just curious.
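To make the load-only-what's-needed idea concrete, here's a toy sketch in plain Python (no real models, and the class and sizes are invented for illustration): every sub-model stays in "RAM", and only the one currently running is pulled into a size-limited "VRAM" cache, evicting the least recently used models when space runs out.

```python
# Toy illustration of on-demand VRAM management: sub-models live in "RAM"
# and are moved into a limited "VRAM" cache only while they run.
from collections import OrderedDict

class VramManager:
    def __init__(self, models_in_ram, vram_budget):
        self.ram = dict(models_in_ram)   # name -> size in GB (stand-in for weights)
        self.vram = OrderedDict()        # currently resident models, LRU order
        self.budget = vram_budget

    def load(self, name):
        size = self.ram[name]
        if name in self.vram:
            self.vram.move_to_end(name)  # already resident, mark as recently used
            return
        # evict least-recently-used models until the new one fits
        while self.vram and sum(self.vram.values()) + size > self.budget:
            self.vram.popitem(last=False)
        self.vram[name] = size

    def run(self, name):
        self.load(name)
        return f"ran {name} with {sum(self.vram.values()):.1f} GB resident"

# A denoising step needs the UNet but not the text encoder, so the
# encoder can be evicted once prompts are embedded:
mgr = VramManager({"text_encoder": 0.5, "unet": 3.4, "vae": 0.3}, vram_budget=4.0)
for step in ["text_encoder", "unet", "unet", "vae"]:
    print(mgr.run(step))
```

The real implementations (Comfy's model management, diffusers' CPU offload hooks) are much more involved, but the principle is the same: peak VRAM tracks the largest working set, not the sum of all model weights.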