Yumae opened this issue 1 year ago
Can confirm, I'm having the same exact issue with my RX 6800 XT (16GB VRAM)
Same here, exactly the same issue. (RX 6700XT 12GB)
You can try the following commands:
sudo usermod -a -G video $USER
sudo usermod -a -G render $USER
Set the environment variable for SD:
PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128
GPU memory will be garbage-collected once it reaches 60% capacity, and the maximum size of memory splits is capped at 128 MB, which can help reduce memory fragmentation.
You may also need to add --medvram. These worked for my RX 6750XT.
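The allocator setting above is just an environment variable, so it has to be exported in the shell (or in webui-user.sh) before the WebUI starts. A minimal sketch, using the exact values quoted in this thread (they are starting points, not universal defaults):

```shell
# Allocator tuning from this thread: garbage-collect at 60% usage,
# cap memory splits at 128 MB to reduce fragmentation.
export PYTORCH_HIP_ALLOC_CONF="garbage_collection_threshold:0.6,max_split_size_mb:128"
# Optional WebUI flag from the same comment:
export COMMANDLINE_ARGS="--medvram"
# ./webui.sh   # launch as usual; here we just confirm the variable is set
echo "$PYTORCH_HIP_ALLOC_CONF"
```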
Running the WebUI using --no-half and --lowvram solved it for me.
Can confirm the same with RX 6800S 8GB
Upgrading to pytorch 2.0 and rocm 5.4.2 fixed this for me. Using --opt-sub-quad-attention also really helps, alongside --medvram and PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128. All of these together let me hi-res fix 512x768 at 1.85x (944x1420) on the RX 6600 (8GB VRAM).
I have the same problem on a 5700XT using rocm 5.4.2 and pytorch 2.0. Strangely, it works fine using pytorch 1.13.1.
Same issue with both --medvram and --lowvram.
With pytorch 2 I also tried --opt-sdp-attention, with no effect.
I also use --precision full and --no-half.
I finally tried export PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128; it did not help.
I have a very similar setup on an Ubuntu machine. I downgraded to pytorch 1.13.1 and everything appears to be fine except for a warning about a missing database file:
MIOpen(HIP): Warning [SQLiteBase] Missing system database file: gfx1030_20.kdb Performance may degrade. Please follow instructions to install: https://github.com/ROCmSoftwarePlatform/MIOpen#installing-miopen-kernels-package
@egolfbr That's a harmless warning from AMD; see https://github.com/ROCmSoftwarePlatform/MIOpen/blob/develop/doc/src/cache.md
TL;DR: AMD ROCm compiles and caches some GPU kernels in the background, but also ships pre-compiled kernels for some cards. The build bundled with pytorch 1.x does not seem to include one for your card, but the only effect should be that the first image you generate may be slow.
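If you want to see that background compilation happening, you can look at MIOpen's per-user kernel cache; the ~/.cache/miopen location below is an assumption based on MIOpen's Linux defaults (it can be relocated via MIOPEN_USER_DB_PATH):

```shell
# Inspect MIOpen's per-user kernel cache (assumed default Linux location).
cache_dir="${HOME}/.cache/miopen"
if [ -d "$cache_dir" ]; then
  echo "cache present: $(find "$cache_dir" -type f | wc -l) file(s)"
else
  echo "cache not built yet"
fi
```

An empty or missing cache simply means the first generation will pay the compile cost described above.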
Had the same error while using a LoRA model, but I was still on torch 1.12. Upgrading to 1.13.1 fixed it for me.
The following diff fixed an issue similar to the OP's:
index 49a426ff..03b57253 100644
--- a/webui-user.sh
+++ b/webui-user.sh
@@ -10,7 +10,7 @@
#clone_dir="stable-diffusion-webui"
# Commandline arguments for webui.py, for example: export COMMANDLINE_ARGS="--medvram --opt-split-attention"
-#export COMMANDLINE_ARGS=""
+#export COMMANDLINE_ARGS="--reinstall-torch"
# python3 executable
#python_cmd="python3"
@@ -27,6 +27,9 @@
# install command for torch
#export TORCH_COMMAND="pip install torch==1.12.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113"
+# https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/8139
+export PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.8,max_split_size_mb:128
+
# Requirements file to use for stable-diffusion-webui
#export REQS_FILE="requirements_versions.txt"
--- a/webui.sh
+++ b/webui.sh
@@ -119,7 +119,7 @@ esac
if echo "$gpu_info" | grep -q "AMD" && [[ -z "${TORCH_COMMAND}" ]]
then
# AMD users will still use torch 1.13 because 2.0 does not seem to work.
- export TORCH_COMMAND="pip install torch==1.13.1+rocm5.2 torchvision==0.14.1+rocm5.2 --index-url https://download.pytorch.org/whl/rocm5.2"
+ export TORCH_COMMAND="pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm5.4.2"
fi
for preq in "${GIT}" "${python_cmd}"
Arch, RX6800XT
Confirming the above has solved the 'Memory access fault by GPU node-1' problem on my machine.
However, while the above would work without a problem on a clean installation, I was forced to additionally use the --ignore-installed flag on the pip install command, as follows:
TORCH_COMMAND="pip install --ignore-installed torch torchvision --index-url https://download.pytorch.org/whl/rocm5.4.2"
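Since the snippet of webui.sh quoted above only sets TORCH_COMMAND when the variable is empty, one way to apply this without patching the script is to export the override beforehand; a sketch:

```shell
# webui.sh only fills in TORCH_COMMAND when it is unset/empty,
# so exporting it first makes the script's AMD default a no-op.
export TORCH_COMMAND="pip install --ignore-installed torch torchvision --index-url https://download.pytorch.org/whl/rocm5.4.2"
# ./webui.sh   # would now install the ROCm 5.4.2 wheels
echo "$TORCH_COMMAND"
```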
Manjaro, RX6900XT
Just wanted to add, for anyone finding this.
sudo usermod -a -G video $USER
sudo usermod -a -G render $USER
For some reason I got this error after adding my user to the video and render groups.
When removing the groups everything worked again.
sudo usermod -r -G video $USER
sudo usermod -r -G render $USER
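Whether adding or removing the groups is the right move seems to vary by system, so it helps to check what your current session actually has; a small sketch (note that usermod changes only apply to new login sessions, so a stale session can make either state look wrong):

```shell
# Report the current session's membership in the two groups discussed above.
for g in video render; do
  if id -nG | tr ' ' '\n' | grep -qx "$g"; then
    echo "$g: member"
  else
    echo "$g: not a member"
  fi
done
```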
I'm having the same problem with Fooocus running on Void Linux with a 6950 XT. I've tried pretty much every solution in this thread to no avail, but what seems to work as a workaround for me now is to use --always-no-vram and --always-offload-from-vram. I'm not sure if A1111 has similar flags available, but it may be worth a shot. It is a little slower compared to using VRAM, but it still easily beats running on CPU, and at least now I can leave it running to generate a bunch of images without it crashing every other image. If you have the extra system RAM available, it might be a good bandaid solution.
Is there an existing issue for this?
What happened?
When trying to generate pictures above a certain resolution, I get this error in the console window. I have been able to reproduce it consistently by generating a picture bigger than 768x1024/1024x768. I'm sure I could go higher than that with the amount of VRAM this card has, considering that the KDE resource monitor shows VRAM usage never reaching 7 GB. The screenshot shows the generation process reaching 100%, but when it tries to output the image it spits out that error instead.
Steps to reproduce the problem
Generate a picture with a resolution higher than 1024x768 like for example 1280x768.
What should have happened?
It should output the picture and it should let me generate at higher resolutions as well.
Commit where the problem happens
3715ece0
What platforms do you use to access the UI ?
Linux
What browsers do you use to access the UI ?
Mozilla Firefox
Command Line Arguments
List of extensions
wildcards openpose-editor stable-diffusion-webui-dataset-tag-editor stable-diffusion-webui-images-browser stable-diffusion-webui-pixelization
Console logs
Additional information
Distro: EndeavourOS (ArchLinux) DE: KDE on X11 CPU: Ryzen 1600 GPU: RX 6600 (8GB VRAM) RAM: 16GB
WebUI was installed with the default script. I didn't mess with ROCm versions or any of that, since the script took care of it automatically. I can generate pictures at or below 1024x768 with no problems, and I get the same error both with and without hires fix enabled.