lshqqytiger / stable-diffusion-webui-amdgpu

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: RuntimeError: The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action #237

Open duclong2502 opened 1 year ago

duclong2502 commented 1 year ago

Is there an existing issue for this?

What happened?

When I tried to launch the webui with --opt-sub-quad-attention --lowvram --disable-nan-check, I got: RuntimeError: The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action.

Steps to reproduce the problem

Run the webui.

What should have happened?

(screenshot attached)

Version or Commit where the problem happens

1.5.1

What Python version are you running on?

None

What platforms do you use to access the UI?

Windows

What device are you running WebUI on?

AMD GPUs

Cross attention optimization

Automatic

What browsers do you use to access the UI?

No response

Command Line Arguments

No

List of extensions

No

Console logs

Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: 1.5.1
Commit hash: 58b9fb5f1db36ea8335b7553b5391e4ec1e53393
Launching Web UI with arguments: --opt-sub-quad-attention --lowvram --disable-nan-check
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
Warning: caught exception 'Something went wrong.', memory monitor disabled
Loading weights [ee10b26e6c] from D:\Stable Diffusion\stable-diffusion-webui-directml\models\Stable-diffusion\coloringBook_coloringBook.ckpt
Exception in thread Thread-17 (first_time_calculation):
Traceback (most recent call last):
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-directml\modules\devices.py", line 189, in first_time_calculation
    x = torch.zeros((1, 1, 3, 3)).to(device, dtype)
RuntimeError: The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 14.2s (launcher: 1.0s, import torch: 5.6s, import gradio: 1.4s, setup paths: 1.2s, other imports: 2.3s, load scripts: 1.5s, create ui: 0.8s, gradio launch: 0.3s).
Creating model from config: D:\Stable Diffusion\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\configs\stable-diffusion\v2-inference-v.yaml
LatentDiffusion: Running in v-prediction mode
DiffusionWrapper has 865.91 M params.
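
For anyone trying to isolate this: the traceback shows the failure happens in `first_time_calculation` in `modules/devices.py` when a small tensor is moved to the DirectML device. A minimal sketch of that same operation, assuming torch-directml is installed and the default adapter is index 0 (both assumptions), looks like this:

```python
# Minimal repro sketch (assumptions: torch-directml installed, adapter index 0,
# half precision as the webui would use by default). This is not the webui's
# exact code, just the same tensor move that fails in first_time_calculation.
import torch
import torch_directml

device = torch_directml.device(0)   # DirectML adapter 0 (assumption)
dtype = torch.float16               # assumed dtype; --no-half would make this float32

# This is the call that raises "The GPU device instance has been suspended"
# in the traceback above when the driver removes the device.
x = torch.zeros((1, 1, 3, 3)).to(device, dtype)
print(x.device, x.dtype)
```

If this snippet alone reproduces the error, the device removal is happening outside the webui (driver or power-management issue); if it runs fine, the removal is more likely triggered later, for example by VRAM pressure during model load or preview decoding.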

Additional information

No response

duclong2502 commented 1 year ago

Can anyone help me?

Grey3016 commented 1 year ago

I had this error come up after my device went to sleep while running SD. Try changing your power plan to another plan. It might also be your chipset drivers (as they install power plans); try reinstalling them. Or your card drivers could be damaged: DDU them and reinstall.

exa211 commented 1 year ago

Hi, same here with an RX 6700 XT. Maybe it has something to do with exceeding the card's max VRAM?

exa211 commented 1 year ago

> Hi, same here with an RX 6700 XT. Maybe it has something to do with exceeding the card's max VRAM?

So after a bit of playing with the settings, I found out that if I disable the full-resolution preview of the generated image (under Live previews), it will (maybe) work properly again without suspending the GPU. I don't know if I'm just lucky with the UI not suspending the GPU or if it actually fixes the problem.
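
For anyone who prefers changing this outside the UI: the same setting lives in the webui's config.json. A small sketch, assuming the standard AUTOMATIC1111 key names (`show_progress_type`, `live_previews_enable`) and the install path from the log above; adjust both to match your setup:

```python
# Sketch: switch the live-preview method away from "Full" by editing config.json
# before launching the webui. The key names and install path are assumptions
# based on the standard AUTOMATIC1111 settings file.
import json
from pathlib import Path

config_path = Path(r"D:\Stable Diffusion\stable-diffusion-webui-directml\config.json")

config = json.loads(config_path.read_text(encoding="utf-8"))
config["show_progress_type"] = "Approx cheap"   # avoid the VRAM-heavy full VAE preview
# config["live_previews_enable"] = False        # or turn live previews off entirely

config_path.write_text(json.dumps(config, indent=4), encoding="utf-8")
print("Updated live preview settings; restart the webui for them to take effect.")
```

Switching away from the "Full" preview method avoids decoding the latent with the full VAE on every preview step, which fits the theory above that the device is being removed under VRAM pressure.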