subrtadel / DIA


CUDA error: out of memory #4

Closed: subrtadel closed this issue 1 year ago

subrtadel commented 1 year ago
Hi, it looks like PyTorch exceeded my 24 GB of video memory.

I don't know why this is happening, and I don't know how to fix it.

Selected timesteps: tensor([4, 0, 5, 2, 3, 6, 7, 1])
  0%|                                                                             | 0/10 [00:06<?, ?it/s, loss=0.24, indices=tensor([4, 0, 5, 2, 3, 6, 7, 1])]
Traceback (most recent call last):
  File "estimate_CLIP_features.py", line 65, in <module>
    output = invertor.perform_cond_inversion_individual_timesteps(file_path, None, optimize_tokens=True)
  File "/home/bazza/src/DIA/ddim_invertor.py", line 290, in perform_cond_inversion_individual_timesteps
    loss.backward()
  File "/home/bazza/src/miniforge3/envs/dia_env/lib/python3.8/site-packages/torch/_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/bazza/src/miniforge3/envs/dia_env/lib/python3.8/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/bazza/src/miniforge3/envs/dia_env/lib/python3.8/site-packages/torch/autograd/function.py", line 253, in apply
    return user_fn(self, *args)
  File "/home/bazza/src/DIA/stable-diffusion/ldm/modules/diffusionmodules/util.py", line 139, in backward
    input_grads = torch.autograd.grad(
  File "/home/bazza/src/miniforge3/envs/dia_env/lib/python3.8/site-packages/torch/autograd/__init__.py", line 275, in grad
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Originally posted by @b4zz4 in https://github.com/subrtadel/DIA/issues/1#issuecomment-1666750792
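
Before changing anything, it can help to confirm how much of the 24 GB is actually in use when the backward pass starts. Below is a minimal sketch using standard torch.cuda utilities (it assumes the model runs on cuda:0, and torch.cuda.mem_get_info needs a reasonably recent PyTorch):

import torch

device = torch.device("cuda:0")

# Free/total memory as reported by the driver, plus the caching allocator's own view.
free_b, total_b = torch.cuda.mem_get_info(device)
print(f"driver free : {free_b / 1024**3:.2f} GiB of {total_b / 1024**3:.2f} GiB")
print(f"allocated   : {torch.cuda.memory_allocated(device) / 1024**3:.2f} GiB")
print(f"reserved    : {torch.cuda.memory_reserved(device) / 1024**3:.2f} GiB")
print(f"peak alloc  : {torch.cuda.max_memory_allocated(device) / 1024**3:.2f} GiB")

If the allocator already reserves close to the full 24 GiB before loss.backward(), the batch is simply too large for the card.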

subrtadel commented 1 year ago

The optimization is quite resource-intensive. You can try reducing the batch size in the parameter_estimation.yaml config file.
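
For reference, here is a minimal sketch of halving the batch size programmatically. The key name batch_size and the file location are assumptions, so adjust them to whatever parameter_estimation.yaml actually contains:

import yaml  # PyYAML

cfg_path = "parameter_estimation.yaml"  # adjust the path to your checkout

with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

# 'batch_size' is a hypothetical key name; use the actual one from the file.
old = cfg.get("batch_size", 8)
cfg["batch_size"] = max(1, old // 2)
print(f"batch_size: {old} -> {cfg['batch_size']}")

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)

Halving repeatedly until the run fits, then nudging the value back up, is usually the quickest way to find the largest batch that fits in 24 GiB.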