lshqqytiger / stable-diffusion-webui-amdgpu

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: "AttributeError: module 'torch' has no attribute 'dml'" with --use-directml passed, and torch-directml installed. 6800XT on Nobara 39 #364

Closed: bzarky closed this issue 4 months ago

bzarky commented 5 months ago

What happened?

Fresh install of stable-diffusion-webui-directml, using pyenv to set the Python environment to 3.10.13. requirements_versions.txt was edited to include torch-directml, torch-directml appears to install successfully, and webui.sh was launched with --use-directml.

The error is shown below. As far as I can tell, I am doing everything correctly according to the pinned issue with the same error, but the fixes there do not seem to apply here.

I was able to get it to work once yesterday, but I do not have any logs to prove it and I'm not sure how it worked. After closing and launching it again, the error came back and it would not launch. So let's just go forward with the information I have.

Steps to reproduce the problem

  1. make new test directory
  2. run: git clone https://github.com/lshqqytiger/stable-diffusion-webui-directml && cd stable-diffusion-webui-directml
  3. edit requirements_versions.txt to include torch-directml
  4. run sh webui.sh
  5. Installation proceeds and webui launches with error RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
  6. run sh webui.sh --no-half
  7. webui launches with error RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
  8. run 'sh webui.sh --use-directml'
  9. webui launches with error AttributeError: module 'torch' has no attribute 'dml'

What should have happened?

Webui should run and detect that torch-directml is installed
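
As a quick diagnostic for the expectation above, a minimal Python sketch like the following could be run inside the webui's virtual environment to see whether torch and torch_directml are importable and whether torch exposes a dml attribute. The helper names (module_available, module_has_attr, directml_report) are hypothetical and not part of the webui. The traceback in the console logs only shows that the webui reads torch.dml, not where that attribute is supposed to be set, so a missing attribute here is a symptom rather than a diagnosis.

```python
import importlib
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


def module_has_attr(name: str, attr: str) -> bool:
    """Return True if the module imports cleanly and has the given attribute."""
    if not module_available(name):
        return False
    try:
        module = importlib.import_module(name)
    except ImportError:
        # The module was found but could not be imported.
        return False
    return hasattr(module, attr)


def directml_report() -> dict:
    """Collect the three facts relevant to the AttributeError in this issue."""
    return {
        "torch importable": module_available("torch"),
        # The torch-directml pip package is imported as torch_directml.
        "torch_directml importable": module_available("torch_directml"),
        "torch.dml present": module_has_attr("torch", "dml"),
    }


if __name__ == "__main__":
    for key, value in directml_report().items():
        print(f"{key}: {value}")
```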

What browsers do you use to access the UI ?

Mozilla Firefox

Sysinfo

If I try to poll sysinfo via the webui (with --skip-torch-cuda-test to let it launch), it crashes with AttributeError: module 'torch' has no attribute 'dml' even though --use-directml was not passed at launch.

Running sh webui.sh --dump-sysinfo gives the same result.

Console logs

################################################################
Launching launch.py...
################################################################
Using TCMalloc: libtcmalloc_minimal.so.4
fatal: No names found, cannot describe anything.
Python 3.10.13 (main, Jan 23 2024, 18:09:57) [GCC 13.2.1 20231205 (Red Hat 13.2.1-6)]
Version: 1.7.0
Commit hash: d500e58a65d99bfaa9c7bb0da6c3eb5704fadf25
Launching Web UI with arguments: --use-directml
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
Traceback (most recent call last):
  File "/home/zarky/test-stablediffusion/stable-diffusion-webui-directml/launch.py", line 48, in <module>
    main()
  File "/home/zarky/test-stablediffusion/stable-diffusion-webui-directml/launch.py", line 44, in main
    start()
  File "/home/zarky/test-stablediffusion/stable-diffusion-webui-directml/modules/launch_utils.py", line 687, in start
    import webui
  File "/home/zarky/test-stablediffusion/stable-diffusion-webui-directml/webui.py", line 13, in <module>
    initialize.imports()
  File "/home/zarky/test-stablediffusion/stable-diffusion-webui-directml/modules/initialize.py", line 34, in imports
    shared_init.initialize()
  File "/home/zarky/test-stablediffusion/stable-diffusion-webui-directml/modules/shared_init.py", line 26, in initialize
    dml.do_hijack()
  File "/home/zarky/test-stablediffusion/stable-diffusion-webui-directml/modules/dml/__init__.py", line 74, in do_hijack
    if not torch.dml.has_float64_support(device):
AttributeError: module 'torch' has no attribute 'dml'

Additional information

I've been trying over and over to get this to start reliably without luck. I'm really not sure what else to do. I'm hoping someone else can spot something I'm missing or doing wrong. I'll provide any more info necessary.

bzarky commented 5 months ago

I got it working again. I had to install some HIP ROCm packages from Nobara ROCm. No idea how I figured that out.
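
To confirm that the installed PyTorch build is actually a ROCm one, something like the sketch below may help. It relies on the torch.version.hip and torch.version.cuda attributes, which PyTorch sets to a version string or None depending on the build. The function name torch_backend_name is made up for illustration, and the check says nothing about whether the system-level HIP/ROCm packages mentioned above are installed.

```python
def torch_backend_name(version_info) -> str:
    """Classify a PyTorch build from an object shaped like torch.version.

    Reads the `hip` and `cuda` attributes, treating a missing or empty
    value as "not built with that backend".
    """
    if getattr(version_info, "hip", None):
        return "rocm"
    if getattr(version_info, "cuda", None):
        return "cuda"
    return "cpu-only"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print("build type:", torch_backend_name(torch.version))
        # ROCm builds report GPU availability through the torch.cuda API.
        print("GPU usable:", torch.cuda.is_available())
```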

lshqqytiger commented 5 months ago

You should not add --use-directml, because DirectML does not support Linux environments other than WSL.

bzarky commented 5 months ago

I was not aware of that. All the info I found said I had to use it, but I guess I figured out the hard way that it was not necessary. Thanks.

lshqqytiger commented 5 months ago

You have to use it when you are on Windows. But you are not on Windows, and you are able to try ROCm, so --use-directml is useless and won't work.
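
The rule described in this comment (DirectML on Windows or WSL only, otherwise no --use-directml) could be sketched roughly as follows. This is an illustration, not code from the webui: the function names are hypothetical, and detecting WSL by looking for "microsoft" in the kernel release string is a common heuristic assumed here, not something stated in this thread.

```python
import platform
from typing import Optional


def directml_supported(system: str, release: str) -> bool:
    """Return True for native Windows or for Linux running under WSL."""
    if system == "Windows":
        return True
    # Assumption: WSL kernels carry "microsoft" in their release string.
    return system == "Linux" and "microsoft" in release.lower()


def suggested_flag(system: str, release: str) -> Optional[str]:
    """Return --use-directml where DirectML can work, otherwise None."""
    return "--use-directml" if directml_supported(system, release) else None


if __name__ == "__main__":
    flag = suggested_flag(platform.system(), platform.release())
    if flag is None:
        print("DirectML is not expected to work here; leave out --use-directml")
    else:
        print("Flag to pass:", flag)
```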

bzarky commented 5 months ago

Got it, thank you. This could be good information to keep around for people who, like me, are unfamiliar with it.

Likqez commented 1 month ago

What did you install? I'm facing the same issue.