AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI

[Bug]: RuntimeError: CUDA error: an illegal memory access was encountered CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions. #12582

Open · sys0pt opened this issue 1 year ago

sys0pt commented 1 year ago

Is there an existing issue for this?

What happened?

SD is not generating images. I tried different combinations and installers, and also reinstalled Windows, the drivers, and PyTorch the same way I did on other machines, where SD works fine. For some reason SD has problems with the GPU on this one.

Steps to reproduce the problem

Setup: RTX 2070 Super (8 GB)

What should have happened?

Generate image

Version or Commit where the problem happens

v1.5.1

What Python version are you running on ?

Python 3.10.x

What platforms do you use to access the UI ?

Windows

What device are you running WebUI on?

Nvidia GPUs (RTX 20 above)

Cross attention optimization

InvokeAI

What browsers do you use to access the UI ?

Microsoft Edge

Command Line Arguments

--xformers --device-id 0 --no-half --opt-split-attention --opt-channelslast --opt-sdp-no-mem-attention --administrator --always-batch-cond-uncond --precision full
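
Note that `--no-half` together with `--precision full` keeps the model in fp32, which roughly doubles weight memory versus fp16 and is a tight fit on an 8 GB card. A rough back-of-the-envelope sketch, using the 859.52 M UNet parameter count reported in the log below and ignoring activations, the VAE, and the text encoder:

```python
# Back-of-the-envelope estimate only: UNet weight memory in fp16 vs fp32.
# 859.52 M parameters is the figure printed in the console log below;
# activations, VAE, and the text encoder are not counted.
params = 859.52e6

for label, bytes_per_param in (("fp16", 2), ("fp32 (--no-half / --precision full)", 4)):
    gib = params * bytes_per_param / 2**30
    print(f"{label:37s} ~{gib:.2f} GiB for UNet weights alone")
```

That works out to roughly 1.6 GiB vs 3.2 GiB before any activation memory is allocated.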

List of extensions

none

Console logs

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Loading A111 WebUI Launcher
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 i   Settings file found, loading
 →   Updating Settings File  ✓
 i   Launcher Version 1.7.0
 i   Found a custom WebUI Config
 i   No Launcher launch options
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 →   Checking requirements :
 i   Python 3.10.6150.1013 found in registry:  C:\Users\ai\AppData\Local\Programs\Python\Python310\
 i   Clearing PATH of any mention of Python
 →   Adding python 3.10 to path  ✓
 i   Git found and already in PATH:  C:\Program Files\Git\cmd\git.exe
 i   Automatic1111 SD WebUI found:  C:\Users\ai\Desktop\SD\stable-diffusion-webui
 i   One or more checkpoint models were found
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Loading Complete, opening launcher
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 i   Arguments are now: --device-id 0 --no-half --opt-split-attention --opt-channelslast --opt-sdp-no-mem-attention --administrator --always-batch-cond-uncond
 i   Enable Xformers updated to True
 →   Updating Settings File  ✓
 i   Arguments are now: --xformers --device-id 0 --no-half --opt-split-attention --opt-channelslast --opt-sdp-no-mem-attention --administrator --always-batch-cond-uncond
 i   Additional Args updated
 →   Updating Settings File  ✓
 i   Arguments are now: --xformers --device-id 0 --no-half --opt-split-attention --opt-channelslast --opt-sdp-no-mem-attention --administrator --always-batch-cond-uncond --precision full
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ↺   Updating Webui
 ✓   Done
 i   There is no extension in the extensions folder
 i   Arguments are now: --xformers --device-id 0 --no-half --opt-split-attention --opt-channelslast --opt-sdp-no-mem-attention --administrator --always-batch-cond-uncond --precision full
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  WEBUI LAUNCHING VIA EMS LAUNCHER, EXIT THIS WINDOW TO STOP THE WEBUI
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 !   Any error happening after 'commit hash : XXXX' is not related to the launcher. Please report them on Automatic1111's github instead :
 ☁   https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/new/choose
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
venv "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: v1.5.1
Commit hash: 68f336bd994bed5442ad95bad6b6ad5564a5409a
Launching Web UI with arguments: --autolaunch --xformers --device-id 0 --no-half --opt-split-attention --opt-channelslast --opt-sdp-no-mem-attention --administrator --always-batch-cond-uncond --precision full
Loading weights [27a4ac756c] from C:\Users\ai\Desktop\SD\stable-diffusion-webui\models\Stable-diffusion\SD15NewVAEpruned.ckpt
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 30.4s (launcher: 11.2s, import torch: 7.0s, import gradio: 2.9s, setup paths: 3.5s, other imports: 2.9s, setup codeformer: 0.1s, load scripts: 1.4s, create ui: 0.7s, gradio launch: 0.5s).
Creating model from config: C:\Users\ai\Desktop\SD\stable-diffusion-webui\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying attention optimization: Doggettx... done.
Model loaded in 34.5s (load weights from disk: 8.2s, create model: 17.8s, apply weights to model: 1.0s, apply channels_last: 0.9s, move model to device: 6.1s, calculate empty prompt: 0.3s).
  0%|                                                                                        | 0/20 [00:01<?, ?it/s]
Exception in thread MemMon:
Traceback (most recent call last):
  File "C:\Users\ai\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\memmon.py", line 53, in run
    free, total = self.cuda_mem_get_info()
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\memmon.py", line 34, in cuda_mem_get_info
    return torch.cuda.mem_get_info(index)
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\cuda\memory.py", line 618, in mem_get_info
    return torch.cuda.cudart().cudaMemGetInfo(device)
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

*** Error completing request
*** Arguments: ('task(623ht1rllkhdrvd)', 'forest, trees, sunny day', 'low quality, ugly, rainy', [], 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, 0, '', '', [], <gradio.routes.Request object at 0x00000224457C1D80>, 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0) {}

    Traceback (most recent call last):
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\call_queue.py", line 58, in f
        res = list(func(*args, **kwargs))
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\call_queue.py", line 37, in f
        res = func(*args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\txt2img.py", line 62, in txt2img
        processed = processing.process_images(p)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\processing.py", line 677, in process_images
        res = process_images_inner(p)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\processing.py", line 794, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\processing.py", line 1054, in sample
        samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 464, in sample
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 303, in launch_sampling
        return func()
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 464, in <lambda>
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 145, in sample_euler_ancestral
        denoised = model(x, sigmas[i] * s_in, **extra_args)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 183, in forward
        x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
        eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
        return self.inner_model.apply_model(*args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
        setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
        return self.__orig_func(*args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 858, in apply_model
        x_recon = self.model(x_noisy, t, **cond)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 1335, in forward
        out = self.diffusion_model(x, t, context=cc)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1538, in _call_impl
        result = forward_call(*args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\sd_unet.py", line 91, in UNetModel_forward
        return ldm.modules.diffusionmodules.openaimodel.copy_of_UNetModel_forward_for_webui(self, x, timesteps, context, *args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\openaimodel.py", line 797, in forward
        h = module(h, emb, context)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\openaimodel.py", line 84, in forward
        x = layer(x, context)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\attention.py", line 329, in forward
        x = self.proj_in(x)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 376, in network_Conv2d_forward
        return torch.nn.Conv2d_forward_before_network(self, input)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
        return F.conv2d(input, weight, bias, self.stride,
    RuntimeError: CUDA error: an illegal memory access was encountered
    CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

---
Traceback (most recent call last):
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 422, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1323, in process_api
    result = await self.call_function(
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1051, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\call_queue.py", line 93, in f
    mem_stats = {k: -(v//-(1024*1024)) for k, v in shared.mem_mon.stop().items()}
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\memmon.py", line 92, in stop
    return self.read()
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\memmon.py", line 77, in read
    free, total = self.cuda_mem_get_info()
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\modules\memmon.py", line 34, in cuda_mem_get_info
    return torch.cuda.mem_get_info(index)
  File "C:\Users\ai\Desktop\SD\stable-diffusion-webui\venv\lib\site-packages\torch\cuda\memory.py", line 618, in mem_get_info
    return torch.cuda.cudart().cudaMemGetInfo(device)
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Additional information

No response
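
A minimal standalone check, run with the same venv Python outside the webui, can help distinguish a driver/GPU-level fault from a webui-specific problem. The sketch below sets CUDA_LAUNCH_BLOCKING=1 (as the error text suggests) before importing torch, then exercises the same calls that fail in the traceback above (torch.cuda.mem_get_info and a small F.conv2d); the tensor shapes are illustrative, not taken from the model.

```python
# Standalone diagnostic sketch; run with the webui venv's python.exe.
# CUDA_LAUNCH_BLOCKING must be set before torch initializes CUDA so that
# any illegal access surfaces at the call that actually caused it.
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch
import torch.nn.functional as F

assert torch.cuda.is_available(), "CUDA not visible to this torch build"
print("device:", torch.cuda.get_device_name(0))

free, total = torch.cuda.mem_get_info(0)            # same call memmon.py makes
print(f"free/total VRAM: {free / 2**30:.2f} / {total / 2**30:.2f} GiB")

x = torch.randn(1, 320, 64, 64, device="cuda")      # illustrative 1x1 conv shapes
w = torch.randn(320, 320, 1, 1, device="cuda")
y = F.conv2d(x, w)                                   # same op that raised the error
torch.cuda.synchronize()                             # force any async CUDA error to surface here
print("conv2d ok:", tuple(y.shape))
```

If this script also crashes with an illegal memory access, the fault lies below the webui (driver, CUDA runtime, or the card itself) rather than in the launch flags or extensions.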

GameKyuubi commented 1 year ago

"me too" with 1080Ti

YUANMU227 commented 1 year ago

Sometimes this error happens because of OOM (running out of VRAM). You may be able to solve it with a smaller batch size or other ways to reduce VRAM usage.

RuntimeError: CUDA error: an illegal memory access was encountered. CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
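
One way to check the OOM hypothesis is to look at VRAM headroom just before generating. A rough sketch, assuming it is run with the venv's Python (values are illustrative):

```python
# Rough VRAM snapshot to check whether the card is already nearly full
# before sampling starts; numbers here are placeholders, not from the report.
import torch

free, total = torch.cuda.mem_get_info(0)
allocated = torch.cuda.memory_allocated(0)
reserved = torch.cuda.memory_reserved(0)

print(f"total     : {total / 2**30:.2f} GiB")
print(f"free      : {free / 2**30:.2f} GiB")
print(f"allocated : {allocated / 2**30:.2f} GiB (tensors held by torch)")
print(f"reserved  : {reserved / 2**30:.2f} GiB (cached by torch's allocator)")

# If 'free' is already close to zero at 512x512 / batch size 1, an
# OOM-adjacent failure is plausible; otherwise a driver or hardware
# fault is the more likely explanation.
```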

JustaLampPost commented 10 months ago

Same error on an RTX 4070 laptop and an RTX 2070 desktop PC. In addition, the Windows Event Viewer shows multiple nvlddmkm errors with event ID 13:

\Device\Video8 Graphics SM Warp Exception on (GPC 2, TPC 5, SM 1): Illegal Instruction Encoding

fdefake commented 6 months ago

Did you find a solution?

alex-des-santos commented 6 months ago

Yep. Same error here: RTX 4060, 8 GB VRAM.