AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: --system-site-packages not working AMD Arch Linux #14994

Open Vektor8298 opened 4 months ago

Vektor8298 commented 4 months ago

What happened?

  1. After following the wiki's AMD install guide (installing pytorch-opt-rocm and python-torchvision-rocm with their dependencies, creating a Python 3.10.6 venv via pyenv with --system-site-packages, and installing the requirements), the webui refuses to run, complaining about missing CUDA support.

Steps to reproduce the problem

  1. Create a venv with --system-site-packages
  2. Install requirements
  3. Run webui.sh
  4. Observe bug

What should have happened?

WebUI should run with --system-site-packages using locally installed pytorch-opt-rocm and python-torchvision-rocm
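A quick way to check which torch build the interpreter actually resolves (a diagnostic sketch of my own, not part of the webui; as far as I know, torch.version.hip is only set on ROCm builds):

```python
# Diagnostic sketch: report where torch would be imported from and which
# backend it was built for. Run it both inside and outside the venv to see
# whether the system ROCm build or a pip-installed generic build wins.
import importlib.util

def torch_build_info():
    """Return (origin, backend) for the torch the interpreter would import,
    or (None, None) if torch is not importable at all."""
    spec = importlib.util.find_spec("torch")
    if spec is None:
        return None, None
    import torch
    if getattr(torch.version, "hip", None):
        backend = "rocm"      # ROCm/HIP build, what an AMD setup needs
    elif getattr(torch.version, "cuda", None):
        backend = "cuda"      # generic NVIDIA build from PyPI
    else:
        backend = "cpu"
    return spec.origin, backend

print(torch_build_info())
```

On this setup, the expectation is that the origin points into /usr/lib and the backend is "rocm"; a venv path plus "cuda" would mean pip shadowed the system package.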

What browsers do you use to access the UI ?

No response

Sysinfo

sysinfo-2024-02-22-01-24.json

Console logs

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye), Fedora 34+ and openSUSE Leap 15.4 or newer.
################################################################

################################################################
Running on vektor user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
Using TCMalloc: libtcmalloc_minimal.so.4
Python 3.10.6 (main, Feb  9 2024, 20:12:38) [GCC 13.2.1 20230801]
Version: v1.7.0
Commit hash: cf2772fab0af5573da775e7437e6acdca424f26e
Traceback (most recent call last):
  File "/mnt/ts512/stable-diffusion-webui/launch.py", line 48, in <module>
    main()
  File "/mnt/ts512/stable-diffusion-webui/launch.py", line 39, in main
    prepare_environment()
  File "/mnt/ts512/stable-diffusion-webui/modules/launch_utils.py", line 384, in prepare_environment
    raise RuntimeError(
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

Additional information

❯ pip freeze absl-py==2.1.0 accelerate==0.27.2 addict==2.4.0 aenum==3.1.15 aiofiles==23.2.1 aiohttp==3.9.3 aiosignal==1.3.1 altair==5.2.0 annotated-types==0.6.0 antlr4-python3-runtime==4.9.3 anyio==4.3.0 async-timeout==4.0.3 attrs==23.2.0 basicsr==1.4.2 blendmodes==2024.1 certifi==2024.2.2 charset-normalizer==3.3.2 clean-fid==0.1.35 click==8.1.7 contourpy==1.2.0 cycler==0.12.1 einops==0.7.0 exceptiongroup==1.2.0 facexlib==0.3.0 fastapi==0.109.2 ffmpy==0.3.2 filelock==3.13.1 filterpy==1.4.5 fonttools==4.49.0 frozenlist==1.4.1 fsspec==2024.2.0 ftfy==6.1.3 future==1.0.0 gfpgan==1.3.8 gitdb==4.0.11 GitPython==3.1.42 gradio==3.41.2 gradio_client==0.5.0 grpcio==1.62.0 h11==0.14.0 httpcore==1.0.4 httpx==0.27.0 huggingface-hub==0.20.3 idna==3.6 imageio==2.34.0 importlib-metadata==7.0.1 importlib-resources==6.1.1 inflection==0.5.1 Jinja2==3.1.3 jsonmerge==1.9.2 jsonschema==4.21.1 jsonschema-specifications==2023.12.1 kiwisolver==1.4.5 kornia==0.7.1 lark==1.1.9 lazy_loader==0.3 lightning-utilities==0.10.1 llvmlite==0.42.0 lmdb==1.4.1 Markdown==3.5.2 MarkupSafe==2.1.5 matplotlib==3.8.3 mpmath==1.3.0 multidict==6.0.5 networkx==3.2.1 numba==0.59.0 numpy==1.26.4 nvidia-cublas-cu12==12.1.3.1 nvidia-cuda-cupti-cu12==12.1.105 nvidia-cuda-nvrtc-cu12==12.1.105 nvidia-cuda-runtime-cu12==12.1.105 nvidia-cudnn-cu12==8.9.2.26 nvidia-cufft-cu12==11.0.2.54 nvidia-curand-cu12==10.3.2.106 nvidia-cusolver-cu12==11.4.5.107 nvidia-cusparse-cu12==12.1.0.106 nvidia-nccl-cu12==2.19.3 nvidia-nvjitlink-cu12==12.3.101 nvidia-nvtx-cu12==12.1.105 omegaconf==2.3.0 open-clip-torch==2.24.0 opencv-python==4.9.0.80 orjson==3.9.14 packaging==23.2 pandas==2.2.0 piexif==1.1.3 pillow==10.2.0 platformdirs==4.2.0 protobuf==4.25.3 psutil==5.9.8 pydantic==2.6.1 pydantic_core==2.16.2 pydub==0.25.1 pyparsing==3.1.1 python-dateutil==2.8.2 python-multipart==0.0.9 pytorch-lightning==2.2.0.post0 pytz==2024.1 PyYAML==6.0.1 realesrgan==0.3.0 referencing==0.33.0 regex==2023.12.25 requests==2.31.0 resize-right==0.0.2 
rpds-py==0.18.0 safetensors==0.4.2 scikit-image==0.22.0 scipy==1.12.0 semantic-version==2.10.0 sentencepiece==0.2.0 six==1.16.0 smmap==5.0.1 sniffio==1.3.0 starlette==0.36.3 sympy==1.12 tb-nightly==2.17.0a20240221 tensorboard-data-server==0.7.2 tifffile==2024.2.12 timm==0.9.16 tokenizers==0.13.3 tomesd==0.1.3 tomli==2.0.1 toolz==0.12.1 torch==2.2.0 torchdiffeq==0.2.3 torchmetrics==1.3.1 torchsde==0.2.6 torchvision==0.17.0 tqdm==4.66.2 trampoline==0.1.2 transformers==4.30.2 triton==2.2.0 typing_extensions==4.9.0 tzdata==2024.1 urllib3==2.2.1 uvicorn==0.27.1 wcwidth==0.2.13 websockets==11.0.3 Werkzeug==3.0.1 yapf==0.40.2 yarl==1.9.4 zipp==3.17.0

gangstead commented 4 months ago

Looks like you are using CUDA 12; I could only get it to work after downgrading to 11.8. If there's a place that documents the officially supported CUDA version, I'd like to hear it.

Vektor8298 commented 4 months ago

I will try, thanks!

Vektor8298 commented 4 months ago

No such luck. Do I need nvidia packages on AMD?

Vektor8298 commented 4 months ago

I got it to "work" by not using pyenv. But now I have this error:

loading stable diffusion model: RuntimeError
Traceback (most recent call last):
  File "/mnt/ts512/stable-diffusion-webui/launch.py", line 48, in <module>
    main()
  File "/mnt/ts512/stable-diffusion-webui/launch.py", line 44, in main
    start()
  File "/mnt/ts512/stable-diffusion-webui/modules/launch_utils.py", line 464, in start
    webui.webui()
  File "/mnt/ts512/stable-diffusion-webui/webui.py", line 52, in webui
    initialize.initialize()
  File "/mnt/ts512/stable-diffusion-webui/modules/initialize.py", line 75, in initialize
    initialize_rest(reload_script_modules=False)
  File "/mnt/ts512/stable-diffusion-webui/modules/initialize.py", line 111, in initialize_rest
    scripts.load_scripts()
  File "/mnt/ts512/stable-diffusion-webui/modules/scripts.py", line 469, in load_scripts
    script_module = script_loading.load_module(scriptfile.path)
  File "/mnt/ts512/stable-diffusion-webui/modules/script_loading.py", line 10, in load_module
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/mnt/ts512/stable-diffusion-webui/extensions/TokenMixer/scripts/main.py", line 10, in <module>
    from library.modules.token_mixer import TokenMixer
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/mnt/ts512/stable-diffusion-webui/extensions/TokenMixer/library/modules/token_mixer.py", line 13, in <module>
    from library.data import dataStorage
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/mnt/ts512/stable-diffusion-webui/extensions/TokenMixer/library/data.py", line 347, in <module>
    dataStorage = Data() #Create data
  File "/mnt/ts512/stable-diffusion-webui/extensions/TokenMixer/library/data.py", line 281, in __init__
    try: sd_hijack.model_hijack.embedding_db.load_textual_inversion_embeddings(force_reload=True)
  File "/mnt/ts512/stable-diffusion-webui/modules/textual_inversion/textual_inversion.py", line 222, in load_textual_inversion_embeddings
    self.expected_shape = self.get_expected_shape()
  File "/mnt/ts512/stable-diffusion-webui/modules/textual_inversion/textual_inversion.py", line 154, in get_expected_shape
    vec = shared.sd_model.cond_stage_model.encode_embedding_init_text(",", 1)
  File "/mnt/ts512/stable-diffusion-webui/modules/shared_items.py", line 128, in sd_model
    return modules.sd_models.model_data.get_sd_model()
  File "/mnt/ts512/stable-diffusion-webui/modules/sd_models.py", line 531, in get_sd_model
    load_model()
  File "/mnt/ts512/stable-diffusion-webui/modules/sd_models.py", line 681, in load_model
    sd_model.cond_stage_model_empty_prompt = get_empty_cond(sd_model)
  File "/mnt/ts512/stable-diffusion-webui/modules/sd_models.py", line 566, in get_empty_cond
    d = sd_model.get_learned_conditioning([""])
  File "/mnt/ts512/stable-diffusion-webui/modules/sd_models_xl.py", line 31, in get_learned_conditioning
    c = self.conditioner(sdxl_conds, force_zero_embeddings=['txt'] if force_zero_negative_prompt else [])
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/mnt/ts512/stable-diffusion-webui/repositories/generative-models/sgm/modules/encoders/modules.py", line 141, in forward
    emb_out = embedder(batch[embedder.input_key])
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/ts512/stable-diffusion-webui/modules/sd_hijack_clip.py", line 234, in forward
    z = self.process_tokens(tokens, multipliers)
  File "/mnt/ts512/stable-diffusion-webui/modules/sd_hijack_clip.py", line 273, in process_tokens
    z = self.encode_with_transformers(tokens)
  File "/mnt/ts512/stable-diffusion-webui/modules/sd_hijack_clip.py", line 349, in encode_with_transformers
    outputs = self.wrapped.transformer(input_ids=tokens, output_hidden_states=self.wrapped.layer == "hidden")
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/ts512/stable-diffusion-webui/venv/lib/python3.11/site-packages/transformers/models/clip/modeling_clip.py", line 822, in forward
    return self.text_model(
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/ts512/stable-diffusion-webui/venv/lib/python3.11/site-packages/transformers/models/clip/modeling_clip.py", line 730, in forward
    hidden_states = self.embeddings(input_ids=input_ids, position_ids=position_ids)
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/ts512/stable-diffusion-webui/venv/lib/python3.11/site-packages/transformers/models/clip/modeling_clip.py", line 227, in forward
    inputs_embeds = self.token_embedding(input_ids)
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/ts512/stable-diffusion-webui/modules/sd_hijack.py", line 348, in forward
    inputs_embeds = self.wrapped(input_ids)
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/lib/python3.11/site-packages/torch/nn/modules/sparse.py", line 163, in forward
    return F.embedding(
  File "/usr/lib/python3.11/site-packages/torch/nn/functional.py", line 2237, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: HIP error: shared object initialization failed
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

Stable diffusion model failed to load
*** Error loading script: main.py
Traceback (most recent call last):
  File "/mnt/ts512/stable-diffusion-webui/modules/scripts.py", line 469, in load_scripts
    script_module = script_loading.load_module(scriptfile.path)
  File "/mnt/ts512/stable-diffusion-webui/modules/script_loading.py", line 10, in load_module
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/mnt/ts512/stable-diffusion-webui/extensions/TokenMixer/scripts/main.py", line 10, in <module>
    from library.modules.token_mixer import TokenMixer
  File "/mnt/ts512/stable-diffusion-webui/extensions/TokenMixer/library/modules/token_mixer.py", line 13, in <module>
    from library.data import dataStorage
  File "/mnt/ts512/stable-diffusion-webui/extensions/TokenMixer/library/data.py", line 347, in <module>
    dataStorage = Data() #Create data
  File "/mnt/ts512/stable-diffusion-webui/extensions/TokenMixer/library/data.py", line 291, in __init__
    is_sdxl , is_sd2 , is_sd1 = self.tools.get_flags()
  File "/mnt/ts512/stable-diffusion-webui/extensions/TokenMixer/library/toolbox/tools.py", line 68, in get_flags
    assert self.model_is_loaded(), "Model is not loaded"
AssertionError: Model is not loaded

matoro commented 4 months ago

I also ran into this, and I think here's what's happening: torch is listed in requirements.txt, so when the requirements get installed, pip pulls in the generic build without ROCm support. I also tried removing torch from requirements.txt, but it got pulled in anyway as a dependency of something else. The issue is that --system-site-packages tells venv to allow using system packages, but it doesn't prioritize them: if pip installs the same package into the venv, that local copy takes priority.
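The precedence point can be sketched with a toy model of the search order (illustrative only; the directory names are examples, not the real sys.path):

```python
# Toy model of the lookup order described above: the venv's site-packages
# is searched before the system site-packages, so a pip-installed copy
# shadows the system (ROCm) copy even with --system-site-packages.
def resolve(package, search_path, installed):
    """Return the first directory on search_path providing package,
    mimicking how sys.path ordering picks a winner."""
    for directory in search_path:
        if package in installed.get(directory, set()):
            return directory
    return None

search_path = ["venv/lib/python3.10/site-packages",   # venv first
               "/usr/lib/python3.10/site-packages"]   # system second

# Before `pip install -r requirements.txt`: only the system ROCm torch exists.
installed = {"/usr/lib/python3.10/site-packages": {"torch"}}
print(resolve("torch", search_path, installed))
# /usr/lib/python3.10/site-packages

# After: pip has put a generic torch into the venv, which now wins.
installed["venv/lib/python3.10/site-packages"] = {"torch"}
print(resolve("torch", search_path, installed))
# venv/lib/python3.10/site-packages
```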

Minimal example:

$ python3 -c 'import torch; print(torch.cuda.is_available())'
True
$ python3 -m venv --system-site-packages venv
$ source venv/bin/activate
(venv) $ pip install --quiet -r requirements.txt

[notice] A new release of pip is available: 23.2.1 -> 24.0
[notice] To update, run: pip install --upgrade pip
(venv) $ python3 -c 'import torch; print(torch.cuda.is_available())'
False
matoro commented 4 months ago

I worked around this by editing requirements.txt to specify the same torch version as the one installed by my system package manager. If I left it as just torch, it would install the generic version of 2.2.1. I have 2.2.0 from the Arch repository, so I changed it to specify torch==2.2.0, and now it does not override my system version.
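The workaround amounts to replacing the bare torch line with a pinned one; a minimal sketch of that edit (the helper pin_requirement is hypothetical, not part of the webui, and the package list is abbreviated):

```python
# Sketch of the workaround: pin `torch` in requirements.txt to the version
# the system package manager installed, so pip sees the requirement as
# already satisfied by the system copy.
import re

def pin_requirement(lines, package, version):
    """Replace a bare `package` line with `package==version`; leave others alone."""
    pinned = []
    for line in lines:
        if re.fullmatch(rf"{re.escape(package)}\s*", line):
            pinned.append(f"{package}=={version}")
        else:
            pinned.append(line)
    return pinned

reqs = ["gradio==3.41.2", "torch", "torchvision"]
print(pin_requirement(reqs, "torch", "2.2.0"))
# ['gradio==3.41.2', 'torch==2.2.0', 'torchvision']
```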

Krisseck commented 2 months ago

I'm having a similar issue: running ./webui.sh causes torch.cuda.is_available() to be False, but running python launch.py it is True.
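One way to narrow that down (my assumption: webui.sh activates the venv before launching, while a bare python launch.py from an unactivated shell uses the system interpreter and its torch) is this stdlib check, which reports which interpreter is active:

```python
# Report whether the running interpreter is a venv or the system Python.
# Dropping this near the top of launch.py (or running it standalone under
# each invocation) shows which environment each command actually uses.
import sys

def in_venv():
    """True when running inside a virtual environment (venv sets sys.prefix
    to the venv directory while sys.base_prefix stays at the base install)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)

print("venv:", in_venv(), "prefix:", sys.prefix)
```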