I'm getting an OOM error on a 4090, which I find odd since plenty of people run this on lower-VRAM GPUs. I was just trying a test run with about 15 images. It runs for 5-10 steps, then crashes. I have 32 GB of system RAM and 24 GB of VRAM.
## Logs
# ComfyUI Error Report
## Error Details
- **Node Type:** FluxTrainLoop
- **Exception Type:** torch.cuda.OutOfMemoryError
- **Exception Message:** Allocation on device
## Stack Trace
```
File "C:\StableDiffusion\Flux_Comfy\ComfyUI\execution.py", line 317, in execute
  output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "C:\StableDiffusion\Flux_Comfy\ComfyUI\execution.py", line 192, in get_output_data
  return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "C:\StableDiffusion\Flux_Comfy\ComfyUI\execution.py", line 169, in _map_node_over_list
  process_inputs(input_dict, i)
File "C:\StableDiffusion\Flux_Comfy\ComfyUI\execution.py", line 158, in process_inputs
  results.append(getattr(obj, func)(**inputs))
File "C:\StableDiffusion\Flux_Comfy\ComfyUI\custom_nodes\ComfyUI-FluxTrainer\nodes.py", line 746, in train
  steps_done = training_loop(
File "C:\StableDiffusion\Flux_Comfy\ComfyUI\custom_nodes\ComfyUI-FluxTrainer\train_network.py", line 1198, in training_loop
  accelerator.backward(loss)
File "C:\StableDiffusion\Flux_Comfy\venv\lib\site-packages\accelerate\accelerator.py", line 2159, in backward
  loss.backward(**kwargs)
File "C:\StableDiffusion\Flux_Comfy\venv\lib\site-packages\torch\_tensor.py", line 525, in backward
  torch.autograd.backward(
File "C:\StableDiffusion\Flux_Comfy\venv\lib\site-packages\torch\autograd\__init__.py", line 267, in backward
  _engine_run_backward(
File "C:\StableDiffusion\Flux_Comfy\venv\lib\site-packages\torch\autograd\graph.py", line 744, in _engine_run_backward
  return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
```
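Since the crash happens a few steps into `backward` rather than at model load, it may help to confirm how much VRAM is actually free on `cuda:0` right before training (e.g. another process or a cached model could be holding memory). A minimal diagnostic sketch, assuming PyTorch is importable from the same venv; `fmt_gb` is just an illustrative helper, not part of ComfyUI or FluxTrainer:

```python
def fmt_gb(n_bytes: int) -> str:
    """Format a byte count as GiB with two decimals."""
    return f"{n_bytes / 1024**3:.2f} GB"

try:
    import torch
    if torch.cuda.is_available():
        # free/total device memory in bytes, as reported by the CUDA driver
        free, total = torch.cuda.mem_get_info()
        print("free     :", fmt_gb(free))
        print("total    :", fmt_gb(total))
        # memory currently held by PyTorch tensors vs. its caching allocator
        print("allocated:", fmt_gb(torch.cuda.memory_allocated()))
        print("reserved :", fmt_gb(torch.cuda.memory_reserved()))
    else:
        print("CUDA not available")
except ImportError:
    print("PyTorch not installed")
```

A large gap between `reserved` and `allocated` just before the crash would point at fragmentation rather than a true capacity limit.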
## System Information
- **ComfyUI Version:** v0.2.2-1-gc27ebeb
- **Arguments:** main.py --listen
- **OS:** nt
- **Python Version:** 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
- **Embedded Python:** false
- **PyTorch Version:** 2.3.0+cu121
## Devices
- **Name:** cuda:0 NVIDIA GeForce RTX 4090 : cudaMallocAsync
## Attached Workflow
Please make sure that the workflow does not contain any sensitive information such as API keys or passwords.
## Additional Context
(Please add any additional context or steps to reproduce the error here)
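If the OOM turns out to be allocator fragmentation rather than a hard capacity limit, one commonly suggested mitigation is tuning PyTorch's caching allocator via an environment variable before launching ComfyUI. This is a sketch only, and it is an assumption that it applies here: the device line above shows the `cudaMallocAsync` backend, and `PYTORCH_CUDA_ALLOC_CONF` options like `expandable_segments` only affect PyTorch's native allocator.

```shell
:: Windows (cmd.exe): set in the same shell before launching ComfyUI.
:: Only affects PyTorch's native caching allocator, not cudaMallocAsync.
set PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
python main.py --listen
```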