lshqqytiger / stable-diffusion-webui-amdgpu

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: RuntimeError: The GPU device instance has been suspended when using autocast precision #70

Open rnwang04 opened 1 year ago

rnwang04 commented 1 year ago

Is there an existing issue for this?

What happened?

When I tried to launch the webui with --lowvram --precision autocast, I got an error during VAE decoding in txt2img: RuntimeError: The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action.

I wonder whether fp16 is currently available: https://github.com/lshqqytiger/stable-diffusion-webui-directml/blob/master/modules/devices.py#L200-L240 Is this part of the code useful? Can it help improve the quality of fp16 inference?

Steps to reproduce the problem

I just changed webui-user.bat as:

@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--api --lowvram --precision autocast

call webui.bat

What should have happened?

Inference should complete successfully.

Commit where the problem happens

Latest version

What platforms do you use to access the UI?

No response

What browsers do you use to access the UI?

No response

Command Line Arguments

@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--api --lowvram --precision autocast

call webui.bat

List of extensions

None

Console logs

Applying cross attention optimization (InvokeAI).
Textual inversion embeddings loaded(0):
Model loaded in 53.9s (load weights from disk: 0.1s, create model: 0.6s, apply weights to model: 12.7s, apply half(): 2.9s, move model to device: 37.4s).
Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://9f72eb6a-6870-425a.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
Startup time: 136.1s (import gradio: 4.3s, import ldm: 1.7s, other imports: 3.3s, setup codeformer: 0.3s, load scripts: 1.6s, load SD checkpoint: 54.3s, create ui: 0.4s, gradio launch: 70.1s, scripts app_started_callback: 0.1s).
index debug:::
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [17:28<00:00, 52.42s/it]
Error completing request███████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [16:58<00:00, 38.12s/it]
Arguments: ('task(kepz361926z9mgh)', 'cat and dog', '', [], 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\processing.py", line 637, in process_images_inner
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\processing.py", line 637, in <listcomp>
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\processing.py", line 423, in decode_first_stage
    x = model.decode_first_stage(x)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 826, in decode_first_stage
    return self.first_stage_model.decode(z)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\lowvram.py", line 52, in first_stage_model_decode_wrap
    return first_stage_model_decode(z)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 90, in decode
    dec = self.decoder(z)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 649, in forward
    h = self.conv_out(h)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\extensions-builtin\Lora\lora.py", line 182, in lora_Conv2d_forward
    return lora_forward(self, input, torch.nn.Conv2d_forward_before_lora(self, input))
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\devices.py", line 245, in forward
    return super().forward(x.float()).type(x.dtype)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action.

Additional information

No response

lshqqytiger commented 1 year ago

Strictly speaking, FP16 is available, but when I tested it there were some errors (a type mismatch between float16 and float32; and when I tried to force everything to float16, it generated strange images). So I added .float() so that you can generate images without --no-half (this is why I recommend using --no-half). I also tested txt2img with the same command line arguments, and the RuntimeError you described doesn't occur for me.
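For illustration, the .float() workaround visible in the traceback (modules/devices.py line 245, `return super().forward(x.float()).type(x.dtype)`) can be sketched as a Conv2d subclass that upcasts its input to float32 for the actual computation and then casts the result back to the caller's dtype. This is a minimal standalone sketch, not the repository's actual class; the name `Conv2dUpcast` is hypothetical.

```python
import torch

# Hypothetical sketch of the upcast workaround seen in the traceback:
# run the convolution in float32 (avoiding fp16 kernels that may be
# unsupported or unstable on the backend), then cast the output back
# to the input's dtype so a half-precision pipeline is unaffected.
class Conv2dUpcast(torch.nn.Conv2d):
    def forward(self, x):
        # Weights stay float32 by default; x.float() makes the input
        # match, and .type(x.dtype) restores e.g. float16 on the way out.
        return super().forward(x.float()).type(x.dtype)

conv = Conv2dUpcast(3, 8, kernel_size=3, padding=1)
x = torch.randn(1, 3, 16, 16, dtype=torch.float16)
y = conv(x)  # computed in float32 internally, returned as float16
```

The trade-off is extra casts and float32 compute for those layers, in exchange for avoiding the type-mismatch errors (and degraded images) that showed up when everything was forced to float16.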