lshqqytiger / stable-diffusion-webui-amdgpu

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: invalid argument to memory_allocated #483

Closed: Pinoles17 closed this issue 4 months ago

Pinoles17 commented 4 months ago

Checklist

What happened?

Can't generate images; every generation request fails with a CUDA error.

Steps to reproduce the problem

I made a clean install, and I'm launching with set COMMANDLINE_ARGS=--use-zluda --medvram

What should have happened?

An image should be generated.

What browsers do you use to access the UI?

Brave

Sysinfo

sysinfo-2024-06-30-05-18.json

Console logs

venv "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
WARNING: ZLUDA works best with SD.Next. Please consider migrating to SD.Next.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: v1.9.3-amd-28-g371f53ed
Commit hash: 371f53ed7c926f9048ef95f45bc816cfbf37b564
Using ZLUDA in E:\SD2.0\SD\webui\stable-diffusion-webui-directml\.zluda
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
  rank_zero_deprecation(
Launching Web UI with arguments: --use-zluda --medvram
E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\diffusers\models\vq_model.py:20: FutureWarning: `VQEncoderOutput` is deprecated and will be removed in version 0.31. Importing `VQEncoderOutput` from `diffusers.models.vq_model` is deprecated and this will be removed in a future version. Please use `from diffusers.models.autoencoders.vq_model import VQEncoderOutput`, instead.
  deprecate("VQEncoderOutput", "0.31", deprecation_message)
E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\diffusers\models\vq_model.py:25: FutureWarning: `VQModel` is deprecated and will be removed in version 0.31. Importing `VQModel` from `diffusers.models.vq_model` is deprecated and this will be removed in a future version. Please use `from diffusers.models.autoencoders.vq_model import VQModel`, instead.
  deprecate("VQModel", "0.31", deprecation_message)
ONNX: version=1.18.1 provider=CUDAExecutionProvider, available=['AzureExecutionProvider', 'CPUExecutionProvider']
ZLUDA device failed to pass basic operation test: index=None, device_name=AMD Radeon RX 6600 XT [ZLUDA]
CUDA error: operation not supported
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Loading weights [6ce0161689] from E:\SD2.0\SD\webui\stable-diffusion-webui-directml\models\Stable-diffusion\v1-5-pruned-emaonly.safetensors
Creating model from config: E:\SD2.0\SD\webui\stable-diffusion-webui-directml\configs\v1-inference.yaml
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 61.8s (prepare environment: 93.5s, initialize shared: 7.6s, other imports: 0.1s, list SD models: 0.2s, load scripts: 1.2s, initialize extra networks: 1.0s, scripts before_ui_callback: 0.1s, create ui: 2.5s, gradio launch: 1.4s).
E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Restarting UI...
Applying attention optimization: InvokeAI... done.
Model loaded in 211.5s (load weights from disk: 4.6s, create model: 1.2s, apply weights to model: 194.0s, apply half(): 1.8s, apply dtype to VAE: 0.1s, load textual inversion embeddings: 0.1s, calculate empty prompt: 9.4s).
Closing server running on port: 7860
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 1.1s (load scripts: 0.5s, create ui: 0.4s, gradio launch: 0.1s).
Reusing loaded model v1-5-pruned-emaonly.safetensors [6ce0161689] to load counterfeitV30_v30.safetensors
Calculating sha256 for E:\SD2.0\SD\webui\stable-diffusion-webui-directml\models\Stable-diffusion\counterfeitV30_v30.safetensors: 08f92e8480674edced076481075d78407b7bb5c3735ee090bccdff83a188a11f
Loading weights [08f92e8480] from E:\SD2.0\SD\webui\stable-diffusion-webui-directml\models\Stable-diffusion\counterfeitV30_v30.safetensors
Applying attention optimization: InvokeAI... done.
Weights loaded in 49.1s (calculate hash: 47.2s, apply weights to model: 1.7s).
Exception in thread MemMon:
Traceback (most recent call last):
  File "C:\Users\enriq\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\memmon.py", line 43, in run
    torch.cuda.reset_peak_memory_stats()
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\memory.py", line 309, in reset_peak_memory_stats
    return torch._C._cuda_resetPeakMemoryStats(device)
RuntimeError: invalid argument to reset_peak_memory_stats
*** Error completing request
*** Arguments: ('task(6yrqvt1zltimai2)', <gradio.routes.Request object at 0x000001BCFEF6FD30>, 'cat girl', '', [], 1, 1, 7, 512, 512, True, 0.7, 1.5, 'R-ESRGAN 4x+ Anime6B', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', [], 0, 30, 'Euler a', 'Automatic', False, '', 0.8, -1, False, -1, 0, 0, 0, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
    Traceback (most recent call last):
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\txt2img.py", line 109, in txt2img
        processed = processing.process_images(p)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 847, in process_images
        res = process_images_inner(p)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 1075, in process_images_inner        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 1393, in sample
        self.sampler = sd_samplers.create_sampler(self.sampler_name, self.sd_model)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers.py", line 41, in create_sampler
        sampler = config.constructor(model)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 31, in <lambda>
        sd_samplers_common.SamplerData(label, lambda model, funcname=funcname: KDiffusionSampler(funcname, model), aliases, options)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 72, in __init__
        self.model_wrap = self.model_wrap_cfg.inner_model
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 57, in inner_model
        self.model_wrap = denoiser(shared.sd_model, quantize=shared.opts.enable_quantization)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\repositories\k-diffusion\k_diffusion\external.py", line 135, in __init__
        super().__init__(model, model.alphas_cumprod, quantize=quantize)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\repositories\k-diffusion\k_diffusion\external.py", line 92, in __init__
        super().__init__(((1 - alphas_cumprod) / alphas_cumprod) ** 0.5, quantize)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 40, in wrapped
        return f(*args, **kwargs)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 966, in __rsub__
        return _C._VariableFunctions.rsub(self, other)
    RuntimeError: CUDA error: operation not supported
    CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

---
Traceback (most recent call last):
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 95, in f
    mem_stats = {k: -(v//-(1024*1024)) for k, v in shared.mem_mon.stop().items()}
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\memmon.py", line 99, in stop
    return self.read()
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\memmon.py", line 81, in read
    torch_stats = torch.cuda.memory_stats(self.device)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\memory.py", line 258, in memory_stats
    stats = memory_stats_as_nested_dict(device=device)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\memory.py", line 270, in memory_stats_as_nested_dict
    return torch._C._cuda_memoryStats(device)
RuntimeError: invalid argument to memory_allocated
*** Error completing request
*** Arguments: ('task(fvjfhpzyakldt8b)', <gradio.routes.Request object at 0x000001BCFF1A7AF0>, 'cat girl', '', [], 1, 1, 7, 512, 512, True, 0.7, 1.5, 'R-ESRGAN 4x+ Anime6B', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', [], 0, 30, 'Euler a', 'Automatic', False, '', 0.8, -1, False, -1, 0, 0, 0, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
    Traceback (most recent call last):
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\txt2img.py", line 109, in txt2img
        processed = processing.process_images(p)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 847, in process_images
        res = process_images_inner(p)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 1075, in process_images_inner        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 1393, in sample
        self.sampler = sd_samplers.create_sampler(self.sampler_name, self.sd_model)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers.py", line 41, in create_sampler
        sampler = config.constructor(model)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 31, in <lambda>
        sd_samplers_common.SamplerData(label, lambda model, funcname=funcname: KDiffusionSampler(funcname, model), aliases, options)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 72, in __init__
        self.model_wrap = self.model_wrap_cfg.inner_model
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 57, in inner_model
        self.model_wrap = denoiser(shared.sd_model, quantize=shared.opts.enable_quantization)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\repositories\k-diffusion\k_diffusion\external.py", line 135, in __init__
        super().__init__(model, model.alphas_cumprod, quantize=quantize)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\repositories\k-diffusion\k_diffusion\external.py", line 92, in __init__
        super().__init__(((1 - alphas_cumprod) / alphas_cumprod) ** 0.5, quantize)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 40, in wrapped
        return f(*args, **kwargs)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 966, in __rsub__
        return _C._VariableFunctions.rsub(self, other)
    RuntimeError: CUDA error: invalid argument
    CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

---
Traceback (most recent call last):
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 95, in f
    mem_stats = {k: -(v//-(1024*1024)) for k, v in shared.mem_mon.stop().items()}
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\memmon.py", line 99, in stop
    return self.read()
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\memmon.py", line 81, in read
    torch_stats = torch.cuda.memory_stats(self.device)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\memory.py", line 258, in memory_stats
    stats = memory_stats_as_nested_dict(device=device)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\memory.py", line 270, in memory_stats_as_nested_dict
    return torch._C._cuda_memoryStats(device)
RuntimeError: invalid argument to memory_allocated
Applying attention optimization: Doggettx... done.
Loading weights [08f92e8480] from E:\SD2.0\SD\webui\stable-diffusion-webui-directml\models\Stable-diffusion\counterfeitV30_v30.safetensors
Applying attention optimization: Doggettx... done.
Weights loaded in 1.4s (apply weights to model: 1.2s).
*** Error completing request
*** Arguments: ('task(cfgtk0nziept7u8)', <gradio.routes.Request object at 0x000001BCFF128640>, 'cat girl', '', [], 1, 1, 7, 512, 512, True, 0.7, 1.5, 'R-ESRGAN 4x+ Anime6B', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', [], 0, 30, 'Euler a', 'Automatic', False, '', 0.8, -1, False, -1, 0, 0, 0, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
    Traceback (most recent call last):
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\txt2img.py", line 109, in txt2img
        processed = processing.process_images(p)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 847, in process_images
        res = process_images_inner(p)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 1075, in process_images_inner        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 1393, in sample
        self.sampler = sd_samplers.create_sampler(self.sampler_name, self.sd_model)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers.py", line 41, in create_sampler
        sampler = config.constructor(model)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 31, in <lambda>
        sd_samplers_common.SamplerData(label, lambda model, funcname=funcname: KDiffusionSampler(funcname, model), aliases, options)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 72, in __init__
        self.model_wrap = self.model_wrap_cfg.inner_model
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 57, in inner_model
        self.model_wrap = denoiser(shared.sd_model, quantize=shared.opts.enable_quantization)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\repositories\k-diffusion\k_diffusion\external.py", line 135, in __init__
        super().__init__(model, model.alphas_cumprod, quantize=quantize)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\repositories\k-diffusion\k_diffusion\external.py", line 92, in __init__
        super().__init__(((1 - alphas_cumprod) / alphas_cumprod) ** 0.5, quantize)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 40, in wrapped
        return f(*args, **kwargs)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 966, in __rsub__
        return _C._VariableFunctions.rsub(self, other)
    RuntimeError: CUDA error: invalid argument
    CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

---
Traceback (most recent call last):
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 95, in f
    mem_stats = {k: -(v//-(1024*1024)) for k, v in shared.mem_mon.stop().items()}
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\memmon.py", line 99, in stop
    return self.read()
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\memmon.py", line 81, in read
    torch_stats = torch.cuda.memory_stats(self.device)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\memory.py", line 258, in memory_stats
    stats = memory_stats_as_nested_dict(device=device)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\memory.py", line 270, in memory_stats_as_nested_dict
    return torch._C._cuda_memoryStats(device)
RuntimeError: invalid argument to memory_allocated
*** Error completing request
*** Arguments: ('task(yxjj14ktf7nd4hl)', <gradio.routes.Request object at 0x000001BCFF12B580>, 'cat girl', '', [], 1, 1, 7, 512, 512, True, 0.7, 1.5, 'R-ESRGAN 4x+ Anime6B', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', [], 0, 30, 'Euler a', 'Automatic', False, '', 0.8, -1, False, -1, 0, 0, 0, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
    Traceback (most recent call last):
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\txt2img.py", line 109, in txt2img
        processed = processing.process_images(p)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 847, in process_images
        res = process_images_inner(p)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 1075, in process_images_inner        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 1393, in sample
        self.sampler = sd_samplers.create_sampler(self.sampler_name, self.sd_model)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers.py", line 41, in create_sampler
        sampler = config.constructor(model)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 31, in <lambda>
        sd_samplers_common.SamplerData(label, lambda model, funcname=funcname: KDiffusionSampler(funcname, model), aliases, options)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 72, in __init__
        self.model_wrap = self.model_wrap_cfg.inner_model
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 57, in inner_model
        self.model_wrap = denoiser(shared.sd_model, quantize=shared.opts.enable_quantization)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\repositories\k-diffusion\k_diffusion\external.py", line 135, in __init__
        super().__init__(model, model.alphas_cumprod, quantize=quantize)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\repositories\k-diffusion\k_diffusion\external.py", line 92, in __init__
        super().__init__(((1 - alphas_cumprod) / alphas_cumprod) ** 0.5, quantize)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 40, in wrapped
        return f(*args, **kwargs)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 966, in __rsub__
        return _C._VariableFunctions.rsub(self, other)
    RuntimeError: CUDA error: invalid argument
    CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

---
Traceback (most recent call last):
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 95, in f
    mem_stats = {k: -(v//-(1024*1024)) for k, v in shared.mem_mon.stop().items()}
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\memmon.py", line 99, in stop
    return self.read()
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\memmon.py", line 81, in read
    torch_stats = torch.cuda.memory_stats(self.device)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\memory.py", line 258, in memory_stats
    stats = memory_stats_as_nested_dict(device=device)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\memory.py", line 270, in memory_stats_as_nested_dict
    return torch._C._cuda_memoryStats(device)
RuntimeError: invalid argument to memory_allocated
Restarting UI...
Closing server running on port: 7860
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 1.0s (load scripts: 0.5s, create ui: 0.4s, gradio launch: 0.1s).
*** Error completing request
*** Arguments: ('task(85bg9k6yvsqfio9)', <gradio.routes.Request object at 0x000001BCFF533610>, 'black cat', '', [], 1, 1, 7, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', [], 0, 20, 'DPM++ 2M', 'Automatic', False, '', 0.8, -1, False, -1, 0, 0, 0, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
    Traceback (most recent call last):
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\txt2img.py", line 109, in txt2img
        processed = processing.process_images(p)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 847, in process_images
        res = process_images_inner(p)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 1075, in process_images_inner        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\processing.py", line 1393, in sample
        self.sampler = sd_samplers.create_sampler(self.sampler_name, self.sd_model)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers.py", line 41, in create_sampler
        sampler = config.constructor(model)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 31, in <lambda>
        sd_samplers_common.SamplerData(label, lambda model, funcname=funcname: KDiffusionSampler(funcname, model), aliases, options)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 72, in __init__
        self.model_wrap = self.model_wrap_cfg.inner_model
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 57, in inner_model
        self.model_wrap = denoiser(shared.sd_model, quantize=shared.opts.enable_quantization)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\repositories\k-diffusion\k_diffusion\external.py", line 135, in __init__
        super().__init__(model, model.alphas_cumprod, quantize=quantize)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\repositories\k-diffusion\k_diffusion\external.py", line 92, in __init__
        super().__init__(((1 - alphas_cumprod) / alphas_cumprod) ** 0.5, quantize)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 40, in wrapped
        return f(*args, **kwargs)
      File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 966, in __rsub__
        return _C._VariableFunctions.rsub(self, other)
    RuntimeError: CUDA error: invalid argument
    CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

---
Traceback (most recent call last):
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\call_queue.py", line 95, in f
    mem_stats = {k: -(v//-(1024*1024)) for k, v in shared.mem_mon.stop().items()}
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\memmon.py", line 99, in stop
    return self.read()
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\modules\memmon.py", line 81, in read
    torch_stats = torch.cuda.memory_stats(self.device)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\memory.py", line 258, in memory_stats
    stats = memory_stats_as_nested_dict(device=device)
  File "E:\SD2.0\SD\webui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\cuda\memory.py", line 270, in memory_stats_as_nested_dict
    return torch._C._cuda_memoryStats(device)
RuntimeError: invalid argument to memory_allocated

Additional information

I have an RX 6600 XT (8 GB). No extensions installed.
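
For what it's worth, the failure reproduces outside the web UI. The following minimal check (a sketch, not part of the original report; run it inside the webui venv so torch picks up the ZLUDA DLLs) mirrors the "basic operation test" from the startup log:

    import torch

    # Basic ZLUDA sanity check, mirroring the webui's startup "basic operation test".
    print(torch.__version__, torch.cuda.is_available())
    print(torch.cuda.get_device_name(0))  # expected: "AMD Radeon RX 6600 XT [ZLUDA]"

    a = torch.randn(64, 64, device="cuda")
    b = torch.randn(64, 64, device="cuda")
    # On a broken ZLUDA/rocBLAS setup this matmul raises
    # "RuntimeError: CUDA error: operation not supported", matching the log above.
    print((a @ b).sum().item())

If the matmul fails here too, the problem is in the ZLUDA/HIP setup rather than in the web UI itself.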

CS1o commented 4 months ago

Hey, you're missing the step where you have to replace the ROCm library files with custom ones. Follow my AMD Automatic1111 with ZLUDA install guide here: https://github.com/CS1o/Stable-Diffusion-Info/wiki/Installation-Guides
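
For context, the RX 6600 XT is a gfx1032 GPU, an architecture the stock Windows HIP SDK's rocBLAS generally ships no kernels for; the guide's custom library files fill that gap. A quick way to check whether such files are in place (a sketch; the path assumes a default HIP SDK 5.7 install, and the exact filenames to add come from the guide):

    import glob
    import os

    # Default HIP SDK 5.7 location on Windows; adjust if installed elsewhere.
    library = r"C:\Program Files\AMD\ROCm\5.7\bin\rocblas\library"
    matches = glob.glob(os.path.join(library, "*gfx1032*"))
    print(f"{len(matches)} gfx1032 rocBLAS library files found")
    # Zero matches means rocBLAS has no kernels for this GPU architecture,
    # which is consistent with the "operation not supported" errors above.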