erixwd opened 8 months ago
--skip-torch-cuda-test with --use-directml
Now the PC crashes on generation
venv "R:\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
fatal: No names found, cannot describe anything.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: 1.8.0-RC
Commit hash: ce3d044680100834e862bee6fefb413bfc835ece
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
R:\stable-diffusion-webui-directml\venv\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
rank_zero_deprecation(
Launching Web UI with arguments: --opt-sub-quad-attention --medvram --disable-nan-check --no-half --use-directml
ONNX: selected=DmlExecutionProvider, available=['DmlExecutionProvider', 'CPUExecutionProvider']
==============================================================================
You are running torch 2.0.0+cpu.
The program is tested to work with torch 2.1.2.
To reinstall the desired version, run with commandline flag --reinstall-torch.
Beware that this will cause a lot of large files to be downloaded, as well as
there are reports of issues with training tab on the latest version.
Use --skip-version-check commandline argument to disable this check.
==============================================================================
Loading weights [ef76aa2332] from R:\stable-diffusion-webui-directml\models\Stable-diffusion\realisticVisionV60B1_v51VAE.safetensors
Creating model from config: R:\stable-diffusion-webui-directml\configs\v1-inference.yaml
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Startup time: 29.2s (initial startup: 0.2s, prepare environment: 72.5s, initialize shared: 12.7s, other imports: 0.3s, setup gfpgan: 0.2s, load scripts: 7.6s, reload hypernetworks: 0.1s, initialize extra networks: 0.2s, scripts before_ui_callback: 0.2s, create ui: 4.0s, gradio launch: 1.7s).
Change your arguments to: --opt-sub-quad-attention --medvram --disable-nan-check --no-half --use-directml --reinstall-torch
After torch is reinstalled, close SD and remove --reinstall-torch from the startup arguments.
I already did. When running with --reinstall-torch, the error pops up again (in the same instance) as if nothing happened.
When running this webui-user:
@echo off
pip uninstall torch
pip install torch-directml
set PYTHON= C:\Users\Erik\AppData\Local\Programs\Python\Python310\python.exe
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS= --opt-sub-quad-attention --medvram --disable-nan-check --no-half --skip-torch-cuda-test --use-directml
call webui.bat
I get this message back:
WARNING: Skipping torch as it is not installed.
ERROR: Could not find a version that satisfies the requirement torch-directml (from versions: none)
ERROR: No matching distribution found for torch-directml
venv "R:\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
fatal: No names found, cannot describe anything.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: 1.8.0-RC
Commit hash: ce3d044680100834e862bee6fefb413bfc835ece
It basically forces me to use torch 2.0.0+cpu.
@erixwd
You do not need --skip-torch-cuda-test when the installation is done correctly and you are using the --use-directml argument.
Remove --skip-torch-cuda-test from COMMANDLINE_ARGS.
Add the --reinstall-torch argument to COMMANDLINE_ARGS.
Remove the "pip uninstall torch" and "pip install torch-directml" lines from the start of the file: --reinstall-torch calls pip itself and, depending on your other startup arguments, installs the correct torch version if the wrong one is present. It usually prints something like "using xx.xxx.xxx torch but requirements invalid, uninstalling torch" and then installs the correct torch version for you.
After torch is reinstalled and SD starts, remove the reinstall argument.
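For reference, the version banner earlier in the thread ("You are running torch 2.0.0+cpu. The program is tested to work with torch 2.1.2.") boils down to a version comparison that ignores the local build tag. A minimal sketch of such a check (the helper name `torch_matches` is made up for illustration; the expected version is taken from the log):

```python
from importlib.metadata import PackageNotFoundError, version


def torch_matches(expected="2.1.2"):
    """Return True if the installed torch matches `expected`,
    ignoring local build tags such as "+cpu" (illustrative helper)."""
    try:
        installed = version("torch")
    except PackageNotFoundError:
        return False  # torch not installed at all
    return installed.split("+")[0] == expected


print(torch_matches())
```

Run inside the activated venv, this tells you whether --reinstall-torch actually changed anything.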
Try this (example):
git pull
echo off
set PYTHON=C:\Users\Erik\AppData\Local\Programs\Python\Python310\python.exe
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--use-directml --disable-nan-check --opt-sdp-attention --opt-sub-quad-attention --opt-split-attention --no-half --no-half-vae --precision full --medvram --medvram-sdxl --reinstall-torch
call webui.bat
You are also using an old AMD card, so --precision full and --no-half + --no-half-vae are necessary if you are using upscalers. These arguments usually speed up your generation a little.
If you still get errors, try opening CMD from the stable-diffusion-webui-directml folder (right-click inside the folder, or type "cmd" directly into the address bar). After CMD opens, type:
.\venv\scripts\activate && pip install -r requirements.txt
If the requirements install (or there was nothing to install) and SD still does not open, copy/paste the error here again.
@Shefu88 Thank you so much. 🙏🏼
I just realized I forgot --skip-torch-cuda-test in there. As soon as I removed it, everything worked, and generation went from 20 min to 20 sec.
(Thanks for the tips about the torch installer.)
@Shefu88 It broke again.
It was doing well, about 30 sec per generation, while running: --opt-sub-quad-attention --medvram --disable-nan-check --no-half --use-directml
So to try, I put in the arguments you suggested; it has never worked since.
I tried running the old arguments, deleting the /venv folder, re-doing --reinstall-torch (which didn't solve the issue the first time), and running .\venv\scripts\activate && pip install -r requirements.txt in a CMD window in the /stable-diffusion-webui-directml folder... It just won't come back to life. I highly doubt a clean install will solve it, so I didn't bother.
- This session below was recorded before the system crashed at generation:
Webui-user.bat:
@echo off
git pull
set PYTHON= C:\Users\Erik\AppData\Local\Programs\Python\Python310\python.exe
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS= --opt-sub-quad-attention --medvram --disable-nan-check --no-half --use-directml --reinstall-torch (I removed it)
call webui.bat
Running instance:
Already up to date.
venv "R:\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
fatal: No names found, cannot describe anything.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: 1.8.0-RC
Commit hash: 7071a4a7e42e7d5588f7c3eee44d411953419f8d
Installing torch and torchvision
Requirement already satisfied: torch==2.0.0 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (2.0.0)
Requirement already satisfied: torchvision==0.15.1 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (0.15.1)
Requirement already satisfied: torch-directml in r:\stable-diffusion-webui-directml\venv\lib\site-packages (0.2.0.dev230426)
Requirement already satisfied: typing-extensions in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from torch==2.0.0) (4.10.0)
Requirement already satisfied: filelock in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from torch==2.0.0) (3.13.3)
Requirement already satisfied: sympy in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from torch==2.0.0) (1.12)
Requirement already satisfied: jinja2 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from torch==2.0.0) (3.1.3)
Requirement already satisfied: networkx in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from torch==2.0.0) (3.2.1)
Requirement already satisfied: numpy in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from torchvision==0.15.1) (1.26.2)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from torchvision==0.15.1) (9.5.0)
Requirement already satisfied: requests in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from torchvision==0.15.1) (2.31.0)
Requirement already satisfied: MarkupSafe>=2.0 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from jinja2->torch==2.0.0) (2.1.5)
Requirement already satisfied: urllib3<3,>=1.21.1 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from requests->torchvision==0.15.1) (2.2.1)
Requirement already satisfied: idna<4,>=2.5 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from requests->torchvision==0.15.1) (3.6)
Requirement already satisfied: certifi>=2017.4.17 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from requests->torchvision==0.15.1) (2024.2.2)
Requirement already satisfied: charset-normalizer<4,>=2 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from requests->torchvision==0.15.1) (3.3.2)
Requirement already satisfied: mpmath>=0.19 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from sympy->torch==2.0.0) (1.3.0)
[notice] A new release of pip available: 22.2.1 -> 24.0
[notice] To update, run: R:\stable-diffusion-webui-directml\venv\Scripts\python.exe -m pip install --upgrade pip
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
R:\stable-diffusion-webui-directml\venv\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
rank_zero_deprecation(
Collecting onnxruntime-gpu
Using cached onnxruntime_gpu-1.17.1-cp310-cp310-win_amd64.whl (148.6 MB)
Requirement already satisfied: coloredlogs in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from onnxruntime-gpu) (15.0.1)
Requirement already satisfied: protobuf in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from onnxruntime-gpu) (3.20.3)
Requirement already satisfied: numpy>=1.21.6 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from onnxruntime-gpu) (1.26.2)
Requirement already satisfied: packaging in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from onnxruntime-gpu) (24.0)
Requirement already satisfied: sympy in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from onnxruntime-gpu) (1.12)
Requirement already satisfied: flatbuffers in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from onnxruntime-gpu) (24.3.25)
Requirement already satisfied: humanfriendly>=9.1 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from coloredlogs->onnxruntime-gpu) (10.0)
Requirement already satisfied: mpmath>=0.19 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from sympy->onnxruntime-gpu) (1.3.0)
Requirement already satisfied: pyreadline3 in r:\stable-diffusion-webui-directml\venv\lib\site-packages (from humanfriendly>=9.1->coloredlogs->onnxruntime-gpu) (3.4.1)
Installing collected packages: onnxruntime-gpu
Error: The 'onnxruntime-gpu' distribution was not found and is required by the application
+---------------------------------+
--- PLEASE, RESTART the Server! ---
+---------------------------------+
Launching Web UI with arguments: --opt-sub-quad-attention --medvram --disable-nan-check --no-half --use-directml --reinstall-torch
ONNX: selected=DmlExecutionProvider, available=['DmlExecutionProvider', 'CPUExecutionProvider']
==============================================================================
You are running torch 2.0.0+cpu.
The program is tested to work with torch 2.1.2.
To reinstall the desired version, run with commandline flag --reinstall-torch.
Beware that this will cause a lot of large files to be downloaded, as well as
there are reports of issues with training tab on the latest version.
Use --skip-version-check commandline argument to disable this check.
==============================================================================
21:09:01 - ReActor - STATUS - Running v0.7.0-b7 on Device: CPU
Loading weights [ef76aa2332] from R:\stable-diffusion-webui-directml\models\Stable-diffusion\realisticVisionV60B1_v51VAE.safetensors
Creating model from config: R:\stable-diffusion-webui-directml\configs\v1-inference.yaml
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Startup time: 34.6s (prepare environment: 36.2s, initialize shared: 3.7s, other imports: 0.2s, load scripts: 7.3s, create ui: 2.0s, gradio launch: 0.9s).
I'm running out of ideas.
Hmmm,
So it worked at first but not after a restart?
This is just a wild guess, but try this:
Open CMD inside the stable-diffusion-webui-directml folder. Type: .\venv\scripts\activate. Inside the venv, type: python -m pip install --upgrade pip
After pip is upgraded to version 24, type:
webui.bat --use-directml --reinstall-torch
You can also try deleting the whole venv folder and starting webui-user.bat using only --use-directml and --reinstall-torch. Do not use other arguments. Try that.
If it boots, add the performance arguments back one by one so you can track which one is causing the startup errors.
Ok running:
.\venv\scripts\activate
cd venv
python -m pip install --upgrade pip
surely made some progress. 👍🏼
Now an old issue popped out again that I think @lshqqytiger should be made aware of.
What solved the problem was removing --no-half from the arguments. This is an old issue of mine that was solved in an update; it just started working again sometime last year. Now, after the trial and error to solve the CPU problem, it has unfortunately become an issue again.
I wouldn't know how to debug it, as the argument just crashes everything, so I can't pull a crash report from it.
It's not that big of a deal, but it kind of is, as --no-half is required to run inpainting. I can live without it, but for others it's certainly a missing feature.
Try using:
--precision full --no-half --medvram --no-half-vae --use-directml
Is there any difference?
Btw, how much RAM do you have in your PC? I was wondering whether this is even a GPU/SD-related problem.
Unfortunately not. The system really doesn't like having --no-half in there. This is an issue that haunted me for months, and I just adjusted to it over time.
I'm running 16 GB of DDR3 (it's quite an old machine but still works fine).
Is there any progress @erixwd with this issue?
If not, check these:
Make sure you do not have any Python other than 3.10.6 installed on your computer. If you have other versions: delete all versions and their root folders from your system. After that, reboot and check that no old Python location remains in PATH. Reinstall Python 3.10.6.
If you don't have another Python version on your PC: check that Python is added to PATH.
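To double-check which interpreter the venv actually ends up using (the logs above show 3.10.6), a tiny probe like this can help. It is just a sketch; the `is_supported` helper is made up here:

```python
import sys


def is_supported(version_info=sys.version_info):
    """True when the interpreter is in the 3.10 series this webui build expects."""
    return tuple(version_info[:2]) == (3, 10)


print(sys.version)
print("supported" if is_supported() else "not the expected 3.10 series")
```

Run it with .\venv\Scripts\python.exe rather than plain python, so you test the interpreter the webui will actually launch with.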
Check in Device Manager that your external GPU has no issues. It should say "This device is working properly. No issues found." Sometimes wrong or faulty drivers show up there with an error.
Double-check that you have the newest, correct AMD drivers for your system and GPU, even if Device Manager says there are no issues.
After these checks:
Try to start SD from the terminal using only the --use-directml argument.
Like this:
webui.bat --use-directml
If you can boot to the UI stage, post a screenshot or the information from the footer section (at the bottom of your webui, where the torch version etc. are shown).
Before trying to generate anything, use only the SD 1.5 base model. Do not use that "realisticVisionV60B1_v51VAE.safetensors" model. Use only the Euler sampler to test generation.
If you can't boot up to the webUI and the terminal says you do not have a proper GPU:
Start webui with this command
webui.bat --use-directml --reinstall-torch
Answers needed:
Can you boot up to the webUI using only --use-directml, or with --use-directml --reinstall-torch? If not, what is the error message?
If you can boot up to the UI, can you run a full generation of 20 steps with the SD base model without any crash?
When you hit a crash/error, what is the last message from the terminal (other than "press any key to terminate")?
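As an extra data point for the questions above, a small probe run inside the activated venv can confirm whether the DirectML backend is importable at all. This is a sketch: it assumes the torch-directml package (with its `device_name` function) is present when the install succeeded.

```python
def directml_status():
    """Report whether torch-directml imports cleanly and, if so, name device 0."""
    try:
        import torch_directml  # provided by the torch-directml package
    except ImportError:
        return "torch-directml not installed in this environment"
    return f"DirectML device 0: {torch_directml.device_name(0)}"


print(directml_status())
```

If this prints the "not installed" message from inside the venv, the DirectML wheel never made it in, which would explain generations running on the CPU.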
@erixwd I'm also using an AMD RX580 8GB. The only way I got it working was by deleting the venv folder, adding these args to webui-user.bat: --use-directml --opt-sub-quad-attention --disable-nan-check --opt-split-attention, then running webui-user.bat.
The RX580 works great using the Euler a sampler. I would recommend not upscaling too much; it crashes the drivers. Use 512x512 at the sampler and don't upscale past 930x930.
Do not update to v1.9.0 at this moment. I'm not sure whether it just doesn't like the RX580, but it breaks everything. Stay on v1.8.
Edit: I also highly recommend undervolting your card; temps get insane if you don't.
@erixwd I have an RX 580 as well. I tried to run it through ZLUDA, but that never worked; I was able to run it with DirectML. You need to reinstall Python 3.10.6 and add it to PATH (select the checkbox when installing). Install the AMD HIP SDK and restart the PC. Next, download Stable Diffusion: git clone https://github.com/lshqqytiger/stable-diffusion-webui-directml. Place it at the path C:\Users\your user. Before launching it (EXACTLY BEFORE), change the arguments in webui-user.bat to set COMMANDLINE_ARGS= --use-directml --opt-sub-quad-attention --no-half --disable-nan-check --autolaunch. Only then run webui-user.bat.
Hello, I have an RX 580 and I'm having this same problem. Do you know how I can get the version 1.8 you mention? I don't know much about these things; how did you manage to fix the problem?
Checklist
What happened?
It's been months since I started diving into this problem, and it's driving me crazy. Up until some version (it's been a while, I'm not sure which one) my GPU just kissed processing goodbye, and I cannot figure out how to resolve this issue.
I tried basically everything within my basic knowledge of compatibility issues: drivers (both PRO and Adrenalin), every version of Python and torch-directml, every version of onnx-directml, but it still doesn't give any sign of life.
It takes more than 20 minutes for a 512x786 image on my poor i5-4460, so I really would like to get to the other side of this.
Steps to reproduce the problem
What should have happened?
GPU should accept processing of the image.
What browsers do you use to access the UI ?
Google Chrome
Sysinfo
sysinfo-2024-03-30-22-29.json
Console logs
Additional information
I'm running on a Msi RX580 8gb Armor with Adrenalin 24.3.1 up to date | Windows 10 | Chrome Browser | No extensions