Bookwald opened this issue 3 months ago
That's a good idea. Using two GPUs for inference may alleviate the out-of-memory issue: you can run the T2I model and the LoRA on different GPUs, then merge their predictions after each timestep.
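A minimal sketch of the per-timestep merge idea, not code from the OMG repo: the base model's noise prediction comes from one GPU, the LoRA-adapted prediction from another, and the two are blended each denoising step. The tensor shapes and the `lora_scale` weight here are illustrative assumptions.

```python
import torch

def merge_predictions(eps_base, eps_lora, lora_scale=0.7):
    # Move the LoRA prediction onto the base prediction's device,
    # then linearly blend the two noise estimates for this timestep.
    return (1 - lora_scale) * eps_base + lora_scale * eps_lora.to(eps_base.device)

# Toy tensors standing in for the two models' noise predictions
# (in practice each would come from a forward pass on its own GPU).
eps_base = torch.zeros(1, 4, 8, 8)
eps_lora = torch.ones(1, 4, 8, 8)
merged = merge_predictions(eps_base, eps_lora)
```

In a real sampler loop this merge would run once per scheduler step, with the merged tensor fed back into the scheduler update.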
Hi @Bookwald, I have accomplished this in a 2×24 GB (L4 GPU) environment: I place the UNet and decoder models on GPU:1 and the other modules on GPU:0.
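The module split described above can be sketched generically with two stages on different devices, where the activations are moved across the device boundary between stages. The `stage_a`/`stage_b` modules below are hypothetical stand-ins (a small `nn.Linear` each) rather than the actual UNet/decoder, and the sketch falls back to CPU when two GPUs are not present so it stays runnable anywhere.

```python
import torch
import torch.nn as nn

# Pick two devices: cuda:0 / cuda:1 when two GPUs exist, else CPU for both.
two_gpus = torch.cuda.device_count() >= 2
dev0 = torch.device("cuda:0") if two_gpus else torch.device("cpu")
dev1 = torch.device("cuda:1") if two_gpus else torch.device("cpu")

stage_a = nn.Linear(16, 32).to(dev0)  # stand-in for "other modules" on GPU:0
stage_b = nn.Linear(32, 16).to(dev1)  # stand-in for UNet + decoder on GPU:1

x = torch.randn(4, 16, device=dev0)
h = stage_a(x)
# Transfer the intermediate activations to the second device before stage_b.
y = stage_b(h.to(dev1))
```

The cost of this layout is one activation transfer per forward pass across the PCIe/NVLink boundary, which is usually small compared with the memory saved.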
I can run LLMs using VRAM from 2 GPUs. Will this be possible with OMG, especially with Automatic1111 or Comfy?