bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index
MIT License
6.06k stars, 610 forks

RuntimeError: CUDA Setup failed despite GPU being available. Please run the following command to get more information: #1204

Open Worromots opened 4 months ago

Worromots commented 4 months ago

System Info

False

===================================BUG REPORT===================================
/home/hadoop-platcv/.local/lib/python3.9/site-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

warn(msg)

The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/cuda/extras/CUPTI/lib64'), PosixPath('/usr/local/nvidia/cpu_lib'), PosixPath('/usr/local/cuda/lib'), PosixPath('/usr/local/lib/python3.9/site-packages/torchvision-0.14.1-py3.9-linux-x86_64.egg/torchvision'), PosixPath('/opt/rh/devtoolset-8/root/usr/lib/dyninst')}
/home/hadoop-platcv/.local/lib/python3.9/site-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: /opt/rh/devtoolset-8/root/usr/lib64:/opt/rh/devtoolset-8/root/usr/lib:/opt/rh/devtoolset-8/root/usr/lib64/dyninst:/opt/rh/devtoolset-8/root/usr/lib/dyninst:/opt/rh/devtoolset-8/root/usr/lib64:/opt/rh/devtoolset-8/root/usr/lib:/usr/local/cuda/compat/:/opt/tritonserver/lib:/opt/tritonserver/lib64:/usr/local/lib/python3.9/site-packages/torch/lib:/usr/local/lib/python3.9/site-packages/torchvision-0.14.1-py3.9-linux-x86_64.egg/torchvision:/usr/local/nvidia/cpu_lib:/usr/local/lib::/usr/local/TensorRT-release/lib:/usr/local/java/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/lib::/usr/local/cuda/lib:/usr/local/cuda/lib64:/usr/local/cuda/compat/:/opt/tritonserver/lib:/opt/tritonserver/lib64:/usr/local/lib/python3.9/site-packages/torch/lib:/usr/local/lib/python3.9/site-packages/torchvision-0.14.1-py3.9-linux-x86_64.egg/torchvision:/usr/local/nvidia/cpu_lib:/usr/local/lib::/usr/local/TensorRT-release/lib:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/java/jre/lib/amd64/server:/opt/meituan/hadoop/lib/native did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('443'), PosixPath('//10.220.17.1'), PosixPath('tcp')}
The following directories listed in your path were found to be non-existent: {PosixPath('//s3plus.sankuai.com/v1/mss_f98ae29a284a4de8952b082c29b58dfb/ml-inf-public/packages/Bazel'), PosixPath('https')}
The following directories listed in your path were found to be non-existent: {PosixPath('/opt/rh/devtoolset-8/root/usr/lib/perl5'), PosixPath('/opt/rh/devtoolset-8/root/usr/share/perl5/vendor_perl')}
The following directories listed in your path were found to be non-existent: {PosixPath('-Djava.security.krb5.conf=/opt/meituan/hadoop/etc/hadoop/krb5.conf')}
The following directories listed in your path were found to be non-existent: {PosixPath('http'), PosixPath('//config.hulk.vip.sankuai.com')}
The following directories listed in your path were found to be non-existent: {PosixPath('/opt/meituan/hadoop/contrib/capacity-scheduler/.jar')}
The following directories listed in your path were found to be non-existent: {PosixPath('/opt/rh/devtoolset-8/root/usr/lib64/pkgconfig')}
The following directories listed in your path were found to be non-existent: {PosixPath('443'), PosixPath('//10.220.17.1'), PosixPath('tcp')}
The following directories listed in your path were found to be non-existent: {PosixPath('() { eval `/usr/bin/modulecmd bash $\n}')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=121, Highest Compute Capability: 8.9.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Loading binary /home/hadoop-platcv/.local/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda121.so...
/usr/lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /home/hadoop-platcv/.local/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda121.so)
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=121 python setup.py install
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/runpy.py", line 188, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/local/lib/python3.9/runpy.py", line 147, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/local/lib/python3.9/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/home/hadoop-platcv/.local/lib/python3.9/site-packages/bitsandbytes/__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "/home/hadoop-platcv/.local/lib/python3.9/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
    from . import nn
  File "/home/hadoop-platcv/.local/lib/python3.9/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "/home/hadoop-platcv/.local/lib/python3.9/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "/home/hadoop-platcv/.local/lib/python3.9/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "/home/hadoop-platcv/.local/lib/python3.9/site-packages/bitsandbytes/cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError: CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

    Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
    to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
    and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

sys info:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

Reproduction

python3 -m bitsandbytes

Expected behavior

tanyo520 commented 4 months ago

+1

lishujun-v commented 4 months ago

+1

matthewdouglas commented 4 months ago

Hi all. For CentOS 7, you'll likely need to either compile from source or pin to bitsandbytes<0.43.0 for the time being. Since 0.43.0, the package is built with compatibility only for manylinux_2_24 and newer.
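For anyone unsure which side of that cutoff their machine is on: a `manylinux_2_24` tag means the binary expects glibc 2.24 or newer, while CentOS 7 ships glibc 2.17. Here is a minimal sketch of that check in Python. The helper names `parse_version` and `meets_manylinux` are made up for illustration and are not part of bitsandbytes; `platform.libc_ver()` is the standard-library call and may return empty strings on non-glibc systems.

```python
import platform


def parse_version(text):
    """Turn '2.17' into (2, 17); return None for empty or malformed input."""
    try:
        return tuple(int(part) for part in text.split("."))
    except ValueError:
        return None


def meets_manylinux(glibc_version, required=(2, 24)):
    """True if the given glibc version string is at least `required`.

    The default (2, 24) corresponds to manylinux_2_24 (hypothetical helper).
    """
    parsed = parse_version(glibc_version)
    return parsed is not None and parsed >= required


if __name__ == "__main__":
    # libc_ver() returns a (name, version) pair, e.g. ('glibc', '2.17').
    libc_name, libc_version = platform.libc_ver()
    if libc_name != "glibc":
        print("Could not detect glibc; this check does not apply here.")
    elif meets_manylinux(libc_version):
        print(f"glibc {libc_version}: manylinux_2_24 binaries should load.")
    else:
        print(f"glibc {libc_version}: older than 2.24; pin or build from source.")
```

If the check reports a glibc older than 2.24, the two options above (pinning below 0.43.0 or compiling from source) are the ones to try.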

It's important to note that CentOS 7 will reach EOL on 2024-06-30.
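The loader error in the original report says that `/usr/lib64/libstdc++.so.6` lacks `CXXABI_1.3.9`. As a rough way to see which CXXABI versions a given libstdc++ advertises, the sketch below scans the library's raw bytes for version strings, similar in spirit to `strings /usr/lib64/libstdc++.so.6 | grep CXXABI`. The function names are hypothetical, and a byte scan is only a heuristic, not a proper ELF symbol-version parse.

```python
import re

# Matches version strings such as CXXABI_1.3 or CXXABI_1.3.9 in raw bytes.
CXXABI_PATTERN = re.compile(rb"CXXABI_(\d+(?:\.\d+)+)")


def cxxabi_versions(blob):
    """Return the CXXABI versions named in `blob`, sorted numerically."""
    found = {m.group(1).decode("ascii") for m in CXXABI_PATTERN.finditer(blob)}
    return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))


def provides(blob, wanted):
    """True if the library bytes mention the CXXABI version `wanted`."""
    return wanted in cxxabi_versions(blob)


if __name__ == "__main__":
    # Path taken from the error message in this issue; adjust for your system.
    path = "/usr/lib64/libstdc++.so.6"
    try:
        with open(path, "rb") as handle:
            data = handle.read()
    except OSError as exc:
        print(f"Could not read {path}: {exc}")
    else:
        print("CXXABI versions found:", ", ".join(cxxabi_versions(data)))
        print("Has CXXABI_1.3.9:", provides(data, "1.3.9"))
```

If `1.3.9` is missing from the list, the prebuilt `libbitsandbytes_cuda121.so` cannot load against that libstdc++, which matches the failure shown in the traceback.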