SuperSonicHub1 closed this issue 7 months ago.
Yes, I've noticed this too on multi-GPU machines: system RAM usage spikes during model swaps. That's just how model loading is implemented in Auto1111.
An enhancement that's been on my list forever, but that I haven't had time to implement, is to optionally queue model swaps on multi-GPU rigs so that only one GPU is ever loading a new model at a time. That would avoid the worst case where, say, three GPUs all load new models simultaneously and system RAM usage spikes through the roof. I don't currently have access to a multi-GPU rig for testing, so implementation will have to wait a while.
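The queuing idea above can be sketched with a global lock that serializes loads across per-GPU worker threads. This is a hypothetical illustration, not Auto1111's actual code: the function and variable names (`load_model`, `_load_lock`, the counters) are invented for the example, and the disk read is faked with a sleep.

```python
import threading
import time

# Hypothetical sketch: a single global lock so at most one GPU worker
# is pulling model weights into system RAM at any moment.
_load_lock = threading.Lock()
active_loads = 0           # how many loads are in flight right now
max_concurrent_loads = 0   # peak observed concurrency (should stay at 1)

def load_model(gpu_id: int) -> None:
    """Stand-in for a per-GPU model swap; names here are illustrative."""
    global active_loads, max_concurrent_loads
    with _load_lock:  # only one GPU may load at a time; others queue here
        active_loads += 1
        max_concurrent_loads = max(max_concurrent_loads, active_loads)
        time.sleep(0.05)  # pretend to read weights from disk into RAM
        active_loads -= 1

# Simulate three GPUs all deciding to swap models at once.
threads = [threading.Thread(target=load_model, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place, the three simulated workers queue up instead of tripling peak RAM usage during the swap.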
Short term, the best option (short of adding more physical RAM) is to increase your swap file to cover the worst case. In the past, I've found that ~48 GB of physical RAM was enough that a 3-GPU system running SDXL models under Linux never touched swap, but your mileage may vary, and 64 GB is probably safer, especially if you're running other stuff on the box.
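A quick way to sanity-check whether your RAM-plus-swap total covers a worst-case target is to read `/proc/meminfo` on Linux. This is a rough sketch under stated assumptions: the 64 GiB threshold is just the ballpark figure mentioned above, not a hard requirement, and the snippet assumes a Linux `/proc` filesystem.

```python
# Rough sketch: compare RAM + swap against a worst-case target
# (the ~64 GiB ballpark suggested above for a 3-GPU SDXL setup).

def meminfo_kib(field: str, path: str = "/proc/meminfo") -> int:
    """Return a /proc/meminfo field (values are reported in KiB)."""
    with open(path) as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)

total_gib = (meminfo_kib("MemTotal") + meminfo_kib("SwapTotal")) / 2**20
print(f"RAM + swap: {total_gib:.1f} GiB")
if total_gib < 64:
    # Growing swap is the usual swapoff / fallocate / mkswap / swapon dance.
    print("Consider enlarging the swap file to absorb concurrent model loads.")
```

If the total comes up short, enlarging the swap file (or adding RAM) is the fix, as described above.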
Thanks for the advice. Closing this issue.
Have a 3x1070 Ti setup on Ubuntu 22.04 with 8 GB of RAM and am facing extreme memory usage and all the issues that come with it. Was forced to significantly increase my machine's swap. Is this a skill issue on my part ("just buy more RAM"), or could we be more efficiently sending assets from disk to the GPU through methods like streaming or aggressively GCing? Screenshots of RAM usage attached: