Open cklogic opened 1 month ago
Uninstalled and reinstalled torch, but still getting errors.
Installing collected packages: triton, torch, torchvision
Attempting uninstall: triton
Found existing installation: triton 2.2.0
Uninstalling triton-2.2.0:
Successfully uninstalled triton-2.2.0
Attempting uninstall: torchvision
Found existing installation: torchvision 0.17.1
Uninstalling torchvision-0.17.1:
Successfully uninstalled torchvision-0.17.1
Successfully installed torch-2.1.2+cu121 torchvision-0.16.2+cu121 triton-2.1.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Traceback (most recent call last):
File "/app/stable-diffusion-webui/launch.py", line 48, in <module>
main()
File "/app/stable-diffusion-webui/launch.py", line 39, in main
prepare_environment()
File "/app/stable-diffusion-webui/modules/launch_utils.py", line 388, in prepare_environment
raise RuntimeError(
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
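The pip log above shows a CUDA build of torch (2.1.2+cu121) being installed, so it can help to confirm which torch the launching interpreter actually imports and whether it sees a GPU. A minimal sketch, to be run with the same Python that launch.py uses; the function name `describe_torch_cuda` is made up for illustration:

```python
import importlib.util


def describe_torch_cuda():
    """Return a dict describing whether this interpreter's torch can see a GPU."""
    info = {
        "torch_installed": False,
        "torch_version": None,
        "built_for_cuda": None,
        "cuda_available": False,
        "device_count": 0,
    }
    if importlib.util.find_spec("torch") is None:
        return info  # torch is not importable from this environment
    import torch

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    # None here means a CPU-only build of torch is installed
    info["built_for_cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["device_count"] = torch.cuda.device_count()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

If `built_for_cuda` is None, the installed wheel is CPU-only. If it shows a CUDA version but `cuda_available` is False, the problem is more likely on the driver or environment side.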
I got the same issue, with python 3.10.6.
stable-diffusion-webui assumes that the CUDA Toolkit is installed on your computer. Make sure you have a recent version of it installed, then run webui-user again.
@cgstag Thanks for your reply, but I have already installed CUDA and the CUDA toolkit:
nvidia-cuda-dev/noble,now 12.0.146~12.0.1-4build4 amd64 [installed]
nvidia-cuda-gdb/noble,now 12.0.140~12.0.1-4build4 amd64 [installed]
nvidia-cuda-toolkit/noble,now 12.0.140~12.0.1-4build4 amd64 [installed]
nvidia-cuda-toolkit-doc/noble,now 12.0.1-4build4 all [installed]
nvidia-cuda-toolkit-gcc/noble,now 12.0.1-4build4 amd64 [installed]
python-pycuda-doc/noble,now 2024.1~dfsg-1build2 all [installed]
python3-pycuda/noble,now 2024.1~dfsg-1build2 amd64 [installed]
I'm using Ubuntu 24.04. The output of nvidia-smi is as follows:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04 Driver Version: 535.171.04 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 3080 Off | 00000000:17:00.0 Off | N/A |
| 0% 37C P8 14W / 320W | 12MiB / 10240MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce RTX 3080 Off | 00000000:65:00.0 On | N/A |
| 0% 49C P8 12W / 320W | 378MiB / 10240MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 2980 G /usr/lib/xorg/Xorg 4MiB |
| 1 N/A N/A 2980 G /usr/lib/xorg/Xorg 159MiB |
| 1 N/A N/A 3293 G /usr/bin/gnome-shell 66MiB |
| 1 N/A N/A 4596 G ...98,262144 --variations-seed-version 102MiB |
+---------------------------------------------------------------------------------------+
sunyi@sunyi-station-ai:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0
I have no idea what's wrong here. It bothered me for a while.
The error message:
sunyi@sunyi-station-ai:~/AI/stable-diffusion-webui-bak$ ./webui.sh
################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye), Fedora 34+ and openSUSE Leap 15.4 or newer.
################################################################
################################################################
Running on sunyi user
################################################################
################################################################
Repo already cloned, using it as install directory
################################################################
################################################################
Create and activate python venv
################################################################
################################################################
Launching launch.py...
################################################################
glibc version is 2.39
Cannot locate TCMalloc. Do you have tcmalloc or google-perftool installed on your system? (improves CPU memory usage)
Python 3.10.6 (main, Jun 14 2024, 23:52:41) [GCC 13.2.0]
Version: v1.9.4
Commit hash: feee37d75f1b168768014e4634dcb156ee649c05
Traceback (most recent call last):
File "/home/sunyi/AI/stable-diffusion-webui-bak/launch.py", line 48, in <module>
main()
File "/home/sunyi/AI/stable-diffusion-webui-bak/launch.py", line 39, in main
prepare_environment()
File "/home/sunyi/AI/stable-diffusion-webui-bak/modules/launch_utils.py", line 386, in prepare_environment
raise RuntimeError(
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
I figured it out. The "CUDA Version: 12.2" reported by nvidia-smi doesn't mean that CUDA is installed; it only indicates the CUDA version that the driver supports. After I installed the CUDA toolkit, the issue was solved. Download from: https://developer.nvidia.com/cuda-toolkit-archive
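To make the distinction concrete: the "CUDA Version" field in the nvidia-smi header describes what the driver supports, while `nvcc --version` reports the toolkit that is actually installed. A small sketch that pulls both numbers out of the command output. The function names are hypothetical, and the regexes assume the output formats pasted earlier in this thread:

```python
import re
import shutil
import subprocess


def driver_cuda_version(smi_output):
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi's header, or None."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def toolkit_cuda_version(nvcc_output):
    """Extract 'release X.Y' from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def run_if_present(cmd):
    """Run cmd and return its stdout, or '' if the executable is not on PATH."""
    if shutil.which(cmd[0]) is None:
        return ""
    return subprocess.run(cmd, capture_output=True, text=True).stdout


if __name__ == "__main__":
    smi = run_if_present(["nvidia-smi"])
    nvcc = run_if_present(["nvcc", "--version"])
    print("CUDA version supported by driver:", driver_cuda_version(smi))
    print("CUDA toolkit installed (nvcc):   ", toolkit_cuda_version(nvcc))
```

A result of None on the second line would suggest that no toolkit is on PATH, whatever nvidia-smi reports.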
I'm trying to run A1111 version 1.9.4 (unfortunately, the A1111 installer is getting worse with every release) and I get the same error, "Torch is not able to use GPU", even though webui-forge runs on this same machine. So the error message is not specific enough. It would be helpful to know what is throwing the error, what exactly it is looking for, and where.
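Going by the traceback, the error is raised in prepare_environment() in modules/launch_utils.py. As far as I can tell, the check behind it amounts to running a one-liner that asserts `torch.cuda.is_available()` in the webui's Python, with the underlying output hidden. The sketch below re-runs an equivalent check in a child interpreter and prints its stderr, which usually shows the real cause (missing torch, failed import, or a plain assertion failure). The one-liner and the helper name `run_cuda_check` are assumptions, not copied from the webui source:

```python
import subprocess
import sys

# Assumed equivalent of the webui's check; not copied from its source.
CHECK = "import torch; assert torch.cuda.is_available()"


def run_cuda_check(python=sys.executable):
    """Run the CUDA check in a child interpreter; return (passed, stderr text)."""
    result = subprocess.run([python, "-c", CHECK], capture_output=True, text=True)
    return result.returncode == 0, result.stderr.strip()


if __name__ == "__main__":
    passed, stderr = run_cuda_check()
    print("check passed" if passed else "check failed")
    if stderr:
        print(stderr)
```

Run it with the interpreter from the webui's own venv (typically venv/bin/python inside the install directory, though that path is an assumption), since a system Python may have a different torch installed.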
Please use Python 3.10 and also make sure to downgrade to numpy==1.26.4. I think NumPy 2.x is causing the issue.
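A quick way to check whether the environment has picked up NumPy 2.x. This is only a sketch: `numpy_major` is a hypothetical helper, and it reads the installed package metadata rather than importing numpy, so it still works if the import itself is broken:

```python
from importlib import metadata


def numpy_major(version=None):
    """Return NumPy's major version as an int, or None if NumPy is not installed."""
    if version is None:
        try:
            version = metadata.version("numpy")
        except metadata.PackageNotFoundError:
            return None
    return int(version.split(".")[0])


if __name__ == "__main__":
    major = numpy_major()
    if major is None:
        print("numpy is not installed in this environment")
    elif major >= 2:
        print("numpy 2.x detected; consider: pip install numpy==1.26.4")
    else:
        print("numpy 1.x detected")
```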
What browsers do you use to access the UI ?
Google Chrome
Sysinfo
CentOS Linux release 7.9.2009 (Core) 5.4.275-1.el7.elrepo.x86_64