Closed MightyPork closed 1 year ago
> By itself it says nVidia CUDA toolkit detected despite there being no nVidia and no CUDA packages (but I used to have an nVidia card).
You still have old NVIDIA utilities installed, for example nvidia-smi.
> I tried to force it using flags, sometimes there are random errors, xformers is removed and other times installed again,
If xformers is not selected as the desired cross-attention method, it will be uninstalled. I wrote the reason for that in the latest update notes.
> I added rembg and xformers to requirements.txt, thinking that will help
Don't. If you want to force a specific xformers version, there are correct ways of doing that; editing requirements.txt is not the way. And rembg is handled automatically.
> Everything looks happy, but the GPU is not detected.
It's not: you can see that torch with CUDA is installed instead of torch for ROCm. That's because on the initial install it detected NVIDIA, and you only added --use-rocm later. The installer will not force-change torch once it's installed; you need to use --reinstall.

So to summarize:

webui.sh --use-rocm --reinstall
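A quick way to verify which torch build actually ended up installed is to look at the wheel's local-version tag (`python -c "import torch; print(torch.__version__)"` prints it). A minimal sketch of that check, assuming the standard `+rocmX.Y` / `+cuXXX` tag convention used by PyTorch wheels:

```python
def torch_backend(version: str) -> str:
    """Classify a torch wheel version string by its local-version tag,
    e.g. '2.0.1+rocm5.4.2' -> 'rocm', '2.0.1+cu118' -> 'cuda'."""
    _, _, local = version.partition("+")
    if local.startswith("rocm"):
        return "rocm"
    if local.startswith("cu"):
        return "cuda"
    return "cpu/unknown"

print(torch_backend("2.0.1+rocm5.4.2"))  # rocm
print(torch_backend("2.0.1+cu118"))      # cuda
```

If this reports cuda on a ROCm-only machine, the installer picked the wrong index and a --reinstall is needed.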
Thanks for the assistance. I purged everything NVIDIA, then had to change the TORCH command to:
torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/rocm5.4.2
Now I'm at: OutOfMemoryError: HIP out of memory. Tried to allocate 20.00 MiB (GPU 0; 3.98 GiB total capacity; 3.83 GiB already allocated; 66.00 MiB free; 3.92 GiB reserved in total by PyTorch)
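The numbers in that OOM line are worth a second look: the device reports 66 MiB free, which is more than the 20 MiB requested, so the failure is likely fragmentation inside the space PyTorch's caching allocator has already reserved. A small back-of-the-envelope sketch using the figures from the error message (values in MiB; only the log's numbers, nothing measured):

```python
GIB = 1024  # work in MiB throughout
total     = 3.98 * GIB   # total capacity
allocated = 3.83 * GIB   # live tensors
reserved  = 3.92 * GIB   # held by PyTorch's caching allocator
free      = 66.0         # MiB left on the device outside PyTorch
request   = 20.0         # MiB the failing allocation asked for

# Space PyTorch has reserved but not filled with tensors: it exists,
# but may be split into fragments too small for a contiguous 20 MiB block.
slack = reserved - allocated
print(f"reserved-but-unused: {slack:.0f} MiB, device-free: {free:.0f} MiB")
```

Both pools individually exceed the 20 MiB request, which is why allocator tuning (or simply loading less of the model at once) is the usual fix rather than a bigger card per se.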
I found advice to set export PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.9,max_split_size_mb:512, but that didn't help at all :(
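For reference, that environment variable is a comma-separated list of option:value pairs (the same format as PYTORCH_CUDA_ALLOC_CONF). A tiny sketch of how such a string decomposes, just to make the two knobs visible:

```python
def parse_alloc_conf(conf: str) -> dict:
    """Parse a PYTORCH_HIP_ALLOC_CONF / PYTORCH_CUDA_ALLOC_CONF string,
    which is a comma-separated list of option:value pairs."""
    out = {}
    for item in conf.split(","):
        key, _, value = item.partition(":")
        out[key.strip()] = value.strip()
    return out

conf = "garbage_collection_threshold:0.9,max_split_size_mb:512"
print(parse_alloc_conf(conf))
# {'garbage_collection_threshold': '0.9', 'max_split_size_mb': '512'}
```

garbage_collection_threshold frees cached blocks once usage passes that fraction of capacity, and max_split_size_mb limits how large a cached block the allocator will split; neither creates VRAM that isn't there, which fits the "didn't help at all" outcome on a 4 GB card.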
It's unable to load any model, even the basic SD one. Easy Diffusion loads the exact same model fine, so it must be some wrong pytorch settings here(?).
This is a log from ED in case it holds any hints on what to do; "VRAM Optimizations" looks interesting, but I didn't find what it really means or does.
DiffusionWrapper has 859.52 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
01:55:42.079 INFO cuda:0 Created a temporary directory at /tmp/tmp3hkirdho instantiator.py:21
01:55:42.080 INFO cuda:0 Writing /tmp/tmp3hkirdho/_remote_module_non_scriptable.py instantiator.py:76
01:55:48.529 INFO cuda:0 VRAM Optimizations: {'KEEP_ENTIRE_MODEL_IN_CPU', 'SET_ATTENTION_STEP_TO_16'} optimizations.py:26
01:55:48.933 INFO cuda:0 Global seed set to 42 seed.py:65
Sampling: 0%| | 0/1 [00:00<?, ?it/s]
01:55:50.396 INFO cuda:0 seeds used = [42] sampler_main.py:64
Data shape for PLMS sampling is (1, 4, 8, 8)
Running PLMS Sampling with 1 timesteps
The implementation of these optimizations seems to reside in sdkit/models/model_loader/stable_diffusion/optimizations.py.
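Names like KEEP_ENTIRE_MODEL_IN_CPU suggest the usual low-VRAM trick: hold the weights in system RAM and move each sub-module to the GPU only for its forward pass. A minimal conceptual sketch of that idea in plain Python (this is not sdkit's actual code; FakeModule and the device strings are illustrative stand-ins for torch modules):

```python
class FakeModule:
    """Stand-in for a torch.nn.Module; only tracks which device it is on."""
    def __init__(self, name):
        self.name = name
        self.device = "cpu"

    def to(self, device):
        self.device = device
        return self

def forward_with_cpu_offload(modules, device="cuda"):
    """Run modules in sequence, holding at most one on the GPU at a time."""
    for m in modules:
        m.to(device)   # move weights in just before use
        # ... the module's forward pass would run here ...
        m.to("cpu")    # evict immediately to free VRAM for the next stage

parts = [FakeModule("text_encoder"), FakeModule("unet"), FakeModule("vae")]
forward_with_cpu_offload(parts)
print(all(m.device == "cpu" for m in parts))  # True
```

The cost is extra host-device transfers per step, which is why this mode is slower but survives on small cards.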
You have only 4 GB of VRAM, so you definitely need the command line flag --medvram,
or even --lowvram.
I'm closing this as resolved since the original issue is resolved, but feel free to post further questions.
Yes, thanks for the tip. With --medvram it didn't crash at startup, but initializing never moved past zero percent and kept eating all RAM until the system locked up.
--lowvram has similar performance to what I previously saw on this GPU:
a test render took about 2 minutes, but it works.
Issue Description
I'm trying to get this tool working, after using Easy Diffusion for a while without problems, using
export HSA_OVERRIDE_GFX_VERSION=10.3.0
By itself it says "nVidia CUDA toolkit detected" despite there being no nVidia and no CUDA packages (but I used to have an nVidia card). Everything ROCm is installed.
I tried to force it using flags; sometimes there are random errors, xformers is removed and other times installed again, but either way it never uses GPU acceleration.
I added rembg and xformers to requirements.txt, thinking that would help. rembg helped, there was a crash when it couldn't be found. xformers probably confused something.
now some things are uninstalled:
Everything looks happy, but the GPU is not detected.
rocminfo:
Any ideas on what else to try?
Version Platform Description
Arch Linux; this tool freshly cloned and installed today
5f2bdba818d7307d98e46c70ef0dc185680b736b