lshqqytiger / stable-diffusion-webui-amdgpu

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: RuntimeError: The GPU device instance has been suspended when using autocast precision #70

Open rnwang04 opened 1 year ago

rnwang04 commented 1 year ago

Is there an existing issue for this?

What happened?

When I tried to launch the webui with --lowvram --precision autocast, I got an error during VAE decoding in txt2img: RuntimeError: The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action.

I wonder whether fp16 is currently available: https://github.com/lshqqytiger/stable-diffusion-webui-directml/blob/master/modules/devices.py#L200-L240 Is this part of the code useful? Can it help improve the quality of fp16 inference?

Steps to reproduce the problem

I just changed webui-user.bat as:

@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--api --lowvram --precision autocast

call webui.bat

What should have happened?

Inference should complete successfully.

Commit where the problem happens

Latest version

What platforms do you use to access the UI?

No response

What browsers do you use to access the UI?

No response

Command Line Arguments

@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--api --lowvram --precision autocast

call webui.bat

List of extensions

None

Console logs

Applying cross attention optimization (InvokeAI).
Textual inversion embeddings loaded(0):
Model loaded in 53.9s (load weights from disk: 0.1s, create model: 0.6s, apply weights to model: 12.7s, apply half(): 2.9s, move model to device: 37.4s).
Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://9f72eb6a-6870-425a.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
Startup time: 136.1s (import gradio: 4.3s, import ldm: 1.7s, other imports: 3.3s, setup codeformer: 0.3s, load scripts: 1.6s, load SD checkpoint: 54.3s, create ui: 0.4s, gradio launch: 70.1s, scripts app_started_callback: 0.1s).
index debug:::
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [17:28<00:00, 52.42s/it]
Error completing request███████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [16:58<00:00, 38.12s/it]
Arguments: ('task(kepz361926z9mgh)', 'cat and dog', '', [], 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\processing.py", line 637, in process_images_inner
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\processing.py", line 637, in <listcomp>
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\processing.py", line 423, in decode_first_stage
    x = model.decode_first_stage(x)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 826, in decode_first_stage
    return self.first_stage_model.decode(z)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\lowvram.py", line 52, in first_stage_model_decode_wrap
    return first_stage_model_decode(z)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 90, in decode
    dec = self.decoder(z)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 649, in forward
    h = self.conv_out(h)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\extensions-builtin\Lora\lora.py", line 182, in lora_Conv2d_forward
    return lora_forward(self, input, torch.nn.Conv2d_forward_before_lora(self, input))
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\modules\devices.py", line 245, in forward
    return super().forward(x.float()).type(x.dtype)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "C:\Users\ruonanw1\OneDrive - Intel Corporation\Desktop\code\sd-webui-directml\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action.

Additional information

No response

lshqqytiger commented 1 year ago

Strictly speaking, FP16 is available, but when I tested it there were some errors (a type mismatch between float16 and float32; and when I tried to force everything to float16, it generated strange images). So I added .float() so that you can generate images without --no-half (this is why I recommend using --no-half). I also tested txt2img with the same command line arguments, and the RuntimeError you described doesn't occur for me.
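For illustration, the .float() workaround visible in the traceback (modules/devices.py line 245, `return super().forward(x.float()).type(x.dtype)`) can be sketched as a Conv2d subclass that upcasts its input to float32 for the actual computation and then casts the result back to the caller's dtype. This is a minimal standalone sketch, not the repository's actual class; the name `Conv2dUpcast` is hypothetical.

```python
import torch

# Hypothetical sketch of the upcast workaround seen in the traceback:
# run the convolution in float32 (avoiding fp16 kernels that may be
# unsupported or unstable on the backend), then cast the output back
# to the input's dtype so a half-precision pipeline is unaffected.
class Conv2dUpcast(torch.nn.Conv2d):
    def forward(self, x):
        # Weights stay float32 by default; x.float() makes the input
        # match, and .type(x.dtype) restores e.g. float16 on the way out.
        return super().forward(x.float()).type(x.dtype)

conv = Conv2dUpcast(3, 8, kernel_size=3, padding=1)
x = torch.randn(1, 3, 16, 16, dtype=torch.float16)
y = conv(x)  # computed in float32 internally, returned as float16
```

The trade-off is extra casts and float32 compute for those layers, in exchange for avoiding the type-mismatch errors (and degraded images) that showed up when everything was forced to float16.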