Hi, it looks like PyTorch exceeded my 24 GB of video memory.
I don't know why this is happening, and I don't know how to fix it.
Selected timesteps: tensor([4, 0, 5, 2, 3, 6, 7, 1])
0%| | 0/10 [00:06<?, ?it/s, loss=0.24, indices=tensor([4, 0, 5, 2, 3, 6, 7, 1])]
Traceback (most recent call last):
File "estimate_CLIP_features.py", line 65, in <module>
output = invertor.perform_cond_inversion_individual_timesteps(file_path, None, optimize_tokens=True)
File "/home/bazza/src/DIA/ddim_invertor.py", line 290, in perform_cond_inversion_individual_timesteps
loss.backward()
File "/home/bazza/src/miniforge3/envs/dia_env/lib/python3.8/site-packages/torch/_tensor.py", line 363, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/bazza/src/miniforge3/envs/dia_env/lib/python3.8/site-packages/torch/autograd/__init__.py", line 173, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/home/bazza/src/miniforge3/envs/dia_env/lib/python3.8/site-packages/torch/autograd/function.py", line 253, in apply
return user_fn(self, *args)
File "/home/bazza/src/DIA/stable-diffusion/ldm/modules/diffusionmodules/util.py", line 139, in backward
input_grads = torch.autograd.grad(
File "/home/bazza/src/miniforge3/envs/dia_env/lib/python3.8/site-packages/torch/autograd/__init__.py", line 275, in grad
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
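For reference, here is a minimal, generic PyTorch diagnostic sketch (not DIA code) that could be dropped in before the failing `loss.backward()` call to see where the 24 GB actually goes; `report_gpu_memory` is just a made-up helper name, and the environment variables are standard PyTorch settings:

```python
# A minimal diagnostic sketch, not DIA code: standard torch.cuda utilities
# for inspecting how much of the 24 GiB card is actually in use.
import os

# Must be set before the CUDA context is created (i.e. before any CUDA call).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # accurate stack traces, as the error message suggests
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"  # may reduce fragmentation (torch >= 1.10)

import torch


def report_gpu_memory(tag=""):
    """Print allocator statistics so the OOM point can be narrowed down."""
    allocated = torch.cuda.memory_allocated() / 2**30
    reserved = torch.cuda.memory_reserved() / 2**30
    print(f"[{tag}] allocated={allocated:.2f} GiB | reserved={reserved:.2f} GiB")
    # torch.cuda.memory_summary() gives a much more detailed breakdown.


# Hypothetical usage around the failing step in
# perform_cond_inversion_individual_timesteps (names taken from the traceback):
#
#   report_gpu_memory("before backward")
#   loss.backward()
#   report_gpu_memory("after backward")
```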
Originally posted by @b4zz4 in https://github.com/subrtadel/DIA/issues/1#issuecomment-1666750792