[Linux] CUDA error upon trying to generate

ThorinWolf commented 2 years ago

OS: Arch Linux rolling GPU: GTX 1660 SUPER Driver: nvidia-520.56.06 CUDA: cuda-tools installed

Whenever I go to generate, it crashes on a CUDA error, as shown below. I have all dependencies listed plus a fair few others since it also pointed out I didn't have them. I'm using a local clone of v1.4 of the diffusion model renamed to v1.5 because the program wants to only accept a v1.5 folder despite it not being out for the public (as far as I can tell) and the HTTPX request fails when I go for the access key.

See terminal output below (the backslashes in the last line are to be ignored, interfered with code block):

$ python unstablefusion.py 
Generating with strength 0.75, steps 30, guidance_scale 7.5, seed 889492
  0%|                                                                                                                                                                 | 0/31 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/meteo/UnstableFusion/unstablefusion.py", line 912, in handle_generate_button
    image = self.get_handler().generate(prompt,
  File "/home/meteo/UnstableFusion/diffusionserver.py", line 114, in generate
    im = self.text2img(
  File "/home/meteo/.local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/meteo/.local/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 326, in __call__
    noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embeddings).sample
  File "/home/meteo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/meteo/.local/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 296, in forward
    sample, res_samples = downsample_block(
  File "/home/meteo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/meteo/.local/lib/python3.10/site-packages/diffusers/models/unet_blocks.py", line 563, in forward
    hidden_states = attn(hidden_states, context=encoder_hidden_states)
  File "/home/meteo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/meteo/.local/lib/python3.10/site-packages/diffusers/models/attention.py", line 162, in forward
    hidden_states = block(hidden_states, context=context)
  File "/home/meteo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/meteo/.local/lib/python3.10/site-packages/diffusers/models/attention.py", line 213, in forward
    hidden_states = self.ff(self.norm3(hidden_states)) + hidden_states
  File "/home/meteo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/meteo/.local/lib/python3.10/site-packages/diffusers/models/attention.py", line 344, in forward
    return self.net(hidden_states)
  File "/home/meteo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/meteo/.local/lib/python3.10/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/meteo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/meteo/.local/lib/python3.10/site-packages/diffusers/models/attention.py", line 362, in forward
    hidden_states, gate = self.proj(hidden_states).chunk(2, dim=-1)
  File "/home/meteo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/meteo/.local/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling \`cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)\`

ahrm commented 2 years ago

Instead of renaming your local files, try changing this line: https://github.com/ahrm/UnstableFusion/blob/91d40dd7a085c8937ef7b7d1478c54ca50a7850d/diffusionserver.py#L34

to this: "CompVis/stable-diffusion-v1-4", and see if it helps.

ThorinWolf commented 2 years ago

Done, but no effect. Exact same error. And Torch reserving memory but not clearing it properly with each crash means I get out of memory errors if I try to do it for a second time without restarting.

ahrm commented 2 years ago

I assume it is a memory error? Maybe try enabling attention slicing and see if it helps: https://github.com/ahrm/UnstableFusion/pull/34/commits/65f95a6baa1d90061ed9cd16cf58e3994c52634b

ThorinWolf commented 2 years ago

No change, though it was faster to get to starting the generation. Monitoring CPU, memory, GPU & GPU memory via both htop & nvtop. No topping out on either of them, so I presume is not a memory issue.

ahrm commented 2 years ago

What is your cuda version?

ThorinWolf commented 2 years ago

NVCC:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

ahrm commented 2 years ago

Pytorch doesn't support cuda 11.8, the maximum supported version is 11.6.

ThorinWolf commented 2 years ago

Downgraded to:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Fri_Dec_17_18:16:03_PST_2021
Cuda compilation tools, release 11.6, V11.6.55
Build cuda_11.6.r11.6/compiler.30794723_0

Same error occurs. Can try and downgrade further if required, but it is 11.6.

ahrm / UnstableFusion

[Linux] CUDA error upon trying to generate #38