Closed MylesCroft closed 2 months ago
Thanks for reporting. Was the queue a single batch of generations with the same settings? If so, I think this behaviour is expected, because each queue item will require a similar amount of VRAM. Either all of them or none of them should OOM.
The problem to investigate is whether a single queue item OOM-ing breaks other queue items that would not, on their own, OOM.
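The expected behaviour is that an OOM on one queue item is caught, the CUDA cache is cleared, and the remaining items still run. Here's a minimal sketch of that pattern, assuming a hypothetical `generate(item)` callable; it is an illustration of the expected handling, not InvokeAI's actual session processor:

```python
import torch


def run_queue(queue_items, generate):
    """Run each queued generation, isolating OOM failures per item.

    `queue_items` and `generate` are placeholders for illustration only.
    """
    results = []
    for item in queue_items:
        try:
            results.append(generate(item))
        except torch.cuda.OutOfMemoryError:
            # This item needed more VRAM than was free: record the failure,
            # release cached allocations so later items start clean, and
            # let the rest of the queue continue.
            torch.cuda.empty_cache()
            results.append(None)
    return results
```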
Here's how I'm attempting to reproduce the problem:
Run a helper script that allocates VRAM artificially, leaving me with ~6 GB free. Here's the script:
import sys

import torch


def allocate_vram(gb_to_allocate: float):
    bytes_to_allocate = gb_to_allocate * 1024 * 1024 * 1024
    assert torch.cuda.is_available(), "CUDA is not available"
    # FloatTensor (4 bytes per element)
    _tensor = torch.empty(int(bytes_to_allocate / 4), dtype=torch.float, device="cuda")
    print(f"Allocated {gb_to_allocate} GB of VRAM")
    input("Press Enter to release VRAM allocation and exit...")


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python allocate_vram.py <GB_to_allocate>")
        sys.exit(1)
    try:
        gb_to_allocate = float(sys.argv[1])
    except ValueError:
        print("The argument must be a number representing the amount of VRAM in GB")
        sys.exit(1)
    allocate_vram(gb_to_allocate)
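For reference, on an 8 GB card like the one in this report, running `python allocate_vram.py 2` pins roughly 2 GB and leaves about the 6 GB mentioned above.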
The app works as expected for me. The first queue item OOMs and the others generate without issue. The OOM is cleared successfully.
SDXL workflow - with a standard way of working:
Choose a model - Starlight XL Animated, Zavy Chromax, Copax Timeless or Wildcard OG XL, for example. Currently four LoRAs, though I have batched up to six LoRAs in the past with little issue. The LoRAs I am currently using are Midjourney 5.2, add-detail-xl, eldritch candids and vanta black contrast.
My usual workflow is to test batch a queue of five renders per model with a starting prompt, and then tweak from there to refine the image. I have attached the json of one of the renders that I was attempting.
I have tested this against previously successful renders. They can render OK, but there is a higher chance of failure, which had never happened before, and I had never seen the full queue implode until now.
These are LoRAs I have been using for several months now with few issues. I have kicked off another queue to see if I can induce a fault and get the JSON file from a failed render, but it will take a while, as the card is not the fastest on the planet.
The NVIDIA driver used is R550 U7 (552.74) on Windows 11 with all patches.
Thanks. I have a suspicion of the cause. I've made a dev build of the app that implements a fix. The python wheel distribution is attached here.
InvokeAI-4.2.7.dev1-py3-none-any.whl.zip
Can you please test this out and see if it fixes the issue? Here's how to install the dev build:
- Unzip the attached file to get the wheel (the `.whl` file).
- Run `pip install path/to/wheel/InvokeAI-4.2.7.dev1-py3-none-any.whl`.
- Then start up Invoke and see if the problem persists.

To revert to the stable version:

- `pip uninstall invokeai`
- `pip install invokeai`
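If you want to confirm which build is active, `pip show invokeai` reports the installed version; it should read 4.2.7.dev1 while the dev build is installed.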
OK, from the testing I have done, whatever trick you did seems to have fixed it. It is a lot more stable than before, with no crashes after nearly a hundred images pushed through the pipeline with up to six LoRAs.
Thanks for your assistance.
Thanks for testing, I'm reopening this until the fix is released.
Is there an existing issue for this problem?
Operating system
Windows
GPU vendor
Nvidia (CUDA)
GPU model
P4000
GPU VRAM
8 GB
Version number
v4.2.6a1
Browser
na
Python dependencies
accelerate: 0.30.1
compel: 2.0.2
cuda: 12.1
diffusers: 0.27.2
numpy: 1.26.4
opencv: 4.9.0.80
onnx: 1.15.0
pillow: 10.4.0
python: 3.11.6
torch: 2.2.2+cu121
torchvision: 0.17.2+cu121
transformers: 4.41.1
xformers: 0.0.25.post1
What happened
I had left a batch of around 40 images to process in the queue; when I returned, the whole queue had crashed and required redoing all settings and prompts. I have attached the crash log.
[crash_log.txt](https://github.com/user-attachments/files/16222044/crash_log.txt)
What you expected to happen
The queue to finish, or, on a failure, to continue to the next image.
How to reproduce the problem
No response
Additional context
No response
Discord username
No response