bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index
MIT License

Suddenly getting this error after everything was running fine last night; running Windows 10, and this is affecting both Fooocus and vlad #907

Closed Windrider30 closed 6 months ago

Windrider30 commented 9 months ago

===================================BUG REPORT===================================
C:\Users\Dennis\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes-0.41.3-py3.10.egg\bitsandbytes\cuda_setup\main.py:166: UserWarning: Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

  warn(msg)

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
The following directories listed in your path were found to be non-existent: {WindowsPath('/usr/local/cuda/lib64')}
DEBUG: Possible options found for libcudart.so: set()
CUDA SETUP: PyTorch settings found: CUDA_VERSION=121, Highest Compute Capability: 6.1.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
C:\Users\Dennis\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes-0.41.3-py3.10.egg\bitsandbytes\cuda_setup\main.py:166: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU! If you run into issues with 8-bit matmul, you can try 4-bit quantization: https://huggingface.co/blog/4bit-transformers-bitsandbytes
  warn(msg)
CUDA SETUP: Required library version not found: libbitsandbytes_cuda121_nocublaslt.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...

================================================ERROR=====================================
CUDA SETUP: CUDA detection failed! Possible reasons:

  1. You need to manually override the PyTorch CUDA version. Please see: "https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
  2. CUDA driver not installed
  3. CUDA not installed
  4. You have multiple conflicting CUDA libraries
  5. Required library not pre-compiled for this bitsandbytes release!
CUDA SETUP: If you compiled from source, try again with make CUDA_VERSION=DETECTED_CUDA_VERSION for example, make CUDA_VERSION=113.
CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via conda list | grep cuda.

CUDA SETUP: Problem: The main issue seems to be that the main CUDA runtime library was not detected.
CUDA SETUP: Solution 1: To solve the issue the libcudart.so location needs to be added to the LD_LIBRARY_PATH variable
CUDA SETUP: Solution 1a): Find the cuda runtime library via: find / -name libcudart.so 2>/dev/null
CUDA SETUP: Solution 1b): Once the library is found add it to the LD_LIBRARY_PATH: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:FOUND_PATH_FROM_1a
CUDA SETUP: Solution 1c): For a permanent solution add the export from 1b into your .bashrc file, located at ~/.bashrc
CUDA SETUP: Solution 2: If no library was found in step 1a) you need to install CUDA.
CUDA SETUP: Solution 2a): Download CUDA install script: wget https://github.com/TimDettmers/bitsandbytes/blob/main/cuda_install.sh
CUDA SETUP: Solution 2b): Install desired CUDA version to desired location. The syntax is bash cuda_install.sh CUDA_VERSION PATH_TO_INSTALL_INTO.
CUDA SETUP: Solution 2b): For example, "bash cuda_install.sh 113 ~/local/" will download CUDA 11.3 and install into the folder ~/local
CUDA SETUP: Setup Failed!
Traceback (most recent call last):
  File "C:\Users\Dennis\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "C:\Users\Dennis\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "C:\Users\Dennis\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "C:\Users\Dennis\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes-0.41.3-py3.10.egg\bitsandbytes\__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "C:\Users\Dennis\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes-0.41.3-py3.10.egg\bitsandbytes\research\__init__.py", line 1, in <module>
    from . import nn
  File "C:\Users\Dennis\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes-0.41.3-py3.10.egg\bitsandbytes\research\nn\__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "C:\Users\Dennis\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes-0.41.3-py3.10.egg\bitsandbytes\research\nn\modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "C:\Users\Dennis\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes-0.41.3-py3.10.egg\bitsandbytes\optim\__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "C:\Users\Dennis\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes-0.41.3-py3.10.egg\bitsandbytes\cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError:
    CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

    Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
    to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
    and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
poedator commented 9 months ago

Got the same error with 0.41.3.post1 when trying to test 4-bit serialization. 0.41.3 works OK. I installed and reinstalled twice. @Titus-von-Koeller, please take a look.

E           CUDA Setup failed despite GPU being available. Please run the following command to get more information:
E   
E           python -m bitsandbytes
E   
E           Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
E           to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
E           and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
------------------------------------------------------------------------------------------------------------------------------------------------- Captured stdout -------------------------------------------------------------------------------------------------------------------------------------------------
False

===================================BUG REPORT===================================
================================================================================
The following directories listed in your path were found to be non-existent: {PosixPath('/nix/var/nix/profiles/default /home/optimus/.nix-profile')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=121, Highest Compute Capability: 8.0.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Loading binary /home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda121.so...
/usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda121.so)
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=121
python setup.py install
================================================================================================================================================ warnings summary =================================================================================================================================================
../conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:166
  /home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: Welcome to bitsandbytes. For bug reports, please run

  python -m bitsandbytes

    warn(msg)
Titus-von-Koeller commented 9 months ago

@Windrider30 Did you go through the solution steps? When did this issue start occurring? Did anything change, i.e. bnb/pytorch version, CUDA driver, CUDA installation?
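One way to answer the "did anything change" question is to record the relevant versions on the working and the broken machine and compare them. The snippet below is only a sketch: `installed_version` and `environment_report` are hypothetical helper names, and the package list is an assumption that can be extended.

```python
import platform
from importlib import metadata


def installed_version(package: str) -> str:
    """Return the installed version of `package`, or a marker if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=("bitsandbytes", "torch")) -> dict:
    """Collect the versions worth comparing before and after a breakage."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it on both machines and diffing the output should show whether the bnb or PyTorch version differs. It does not report the CUDA driver version, which has to be checked separately.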

@poedator's issue has to be unrelated, since 0.41.3.post1 was released only 4 hours ago. @poedator, could you please run python -m bitsandbytes and post the results here? On your system everything stayed the same then?

Two things that catch my eye in your case are that /usr/local/cuda/lib64/libcudart.so doesn't seem to be in the LD_LIBRARY_PATH and that

CUDA SETUP: Loading binary /home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda121.so...
/usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda121.so)

which to me means there's a version mismatch.

Please check ldd /home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda121.so

and strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX.
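The `strings ... | grep GLIBCXX` check can also be approximated in Python, which may be handy where `strings` is not installed. This is a rough sketch, not part of bitsandbytes: `glibcxx_versions` and `provides` are hypothetical helpers that scan the raw bytes of the library for `GLIBCXX_x.y.z` markers, and the library path is the one from the log above.

```python
import re
from pathlib import Path

# Matches markers such as GLIBCXX_3.4.26 but not GLIBCXX_DEBUG_MESSAGE_LENGTH.
_GLIBCXX = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)*)")


def glibcxx_versions(blob: bytes) -> list:
    """Return the sorted, de-duplicated GLIBCXX version tuples found in `blob`."""
    found = {
        tuple(int(part) for part in match.group(1).split(b"."))
        for match in _GLIBCXX.finditer(blob)
    }
    return sorted(found)


def provides(blob: bytes, required: str) -> bool:
    """True if `blob` advertises the `required` version, e.g. "3.4.26"."""
    wanted = tuple(int(part) for part in required.split("."))
    return wanted in glibcxx_versions(blob)


if __name__ == "__main__":
    lib = Path("/usr/lib/x86_64-linux-gnu/libstdc++.so.6")  # path taken from the log
    if lib.exists():
        data = lib.read_bytes()
        versions = glibcxx_versions(data)
        highest = ".".join(map(str, versions[-1])) if versions else "none"
        print("highest GLIBCXX:", highest)
        print("has 3.4.26:", provides(data, "3.4.26"))
```

If the required version is missing from the list, the system libstdc++ is older than the one the binary was linked against.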

poedator commented 9 months ago

@Titus-von-Koeller, I installed and tested 0.41.3.post1 right after you published it and sent me a message on the transformers GitHub.

Here is my output with 0.41.3.post1:

 pip install bitsandbytes==0.41.3.post1 -i https://pypi.org/simple --no-cache-dir 
Collecting bitsandbytes==0.41.3.post1
  Downloading bitsandbytes-0.41.3.post1-py3-none-any.whl.metadata (9.8 kB)
Downloading bitsandbytes-0.41.3.post1-py3-none-any.whl (92.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 92.6/92.6 MB 60.1 MB/s eta 0:00:00
Installing collected packages: bitsandbytes
  Attempting uninstall: bitsandbytes
    Found existing installation: bitsandbytes 0.41.3
    Uninstalling bitsandbytes-0.41.3:
      Successfully uninstalled bitsandbytes-0.41.3
Successfully installed bitsandbytes-0.41.3.post1
(py38) optimus@terranova:~/speculative$ python -m bitsandbytes
False

===================================BUG REPORT===================================
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

  warn(msg)
================================================================================
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: /home/optimus/conda/envs/py38 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/home/optimus/local/cuda-11.8/lib64/libcudart.so'), PosixPath('/home/optimus/local/cuda-11.8/lib64/libcudart.so.11.0')}.. We select the PyTorch default libcudart.so, which is {torch.version.cuda},but this might missmatch with the CUDA version that is needed for bitsandbytes.To override this behavior set the BNB_CUDA_VERSION=<version string, e.g. 122> environmental variableFor example, if you want to use the CUDA version 122BNB_CUDA_VERSION=122 python ...OR set the environmental variable in your .bashrc: export BNB_CUDA_VERSION=122In the case of a manual override, make sure you set the LD_LIBRARY_PATH, e.g.export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2
  warn(msg)
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: :/home/optimus/local/cuda-11.8/lib64:/home/optimus/local/cuda-11.8/lib64:/home/optimus/local/cuda-11.8/lib64 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('//localhost'), PosixPath('25078'), PosixPath('http')}
The following directories listed in your path were found to be non-existent: {PosixPath('/nix/var/nix/profiles/default /home/optimus/.nix-profile')}
The following directories listed in your path were found to be non-existent: {PosixPath('kF=\\E[1;2B'), PosixPath('k8=\\E[19~'), PosixPath('rs=\\Ec'), PosixPath('k9=\\E[20~'), PosixPath('xv'), PosixPath('bl=^G'), PosixPath('F2=\\E[24~'), PosixPath('LP'), PosixPath('ve=\\E[34h\\E[?25h'), PosixPath('F6=\\E[1;2S'), PosixPath('co#172'), PosixPath('md=\\E[1m'), PosixPath('k;=\\E[21~'), PosixPath('DL=\\E[%dM'), PosixPath('kR=\\E[1;2A'), PosixPath('AB=\\E[4%dm'), PosixPath('F4=\\E[1;2Q'), PosixPath('%e=\\E[5;2~'), PosixPath('kl=\\EOD'), PosixPath('k5=\\E[15~'), PosixPath('#4=\\E[1;2D'), PosixPath('bs'), PosixPath('as=\\E(0'), PosixPath('mh=\\E[2m'), PosixPath('cr=^M'), PosixPath('IC=\\E[%d@'), PosixPath('DC=\\E[%dP'), PosixPath('k7=\\E[18~'), PosixPath('@1=\\E[1~'), PosixPath('k2=\\EOQ'), PosixPath('ue=\\E[24m'), PosixPath('cl=\\E[H\\E[J'), PosixPath('Co#8'), PosixPath('SC|screen.xterm-256color|VT 100/ANSI X3.64 virtual terminal'), PosixPath('*4=\\E[3;2~'), PosixPath('K2=\\EOE'), PosixPath('kI=\\E[2~'), PosixPath('te=\\E[?1049l'), PosixPath('Km=\\E[M'), PosixPath('pf=\\E[4i'), PosixPath('do=^J'), PosixPath('so=\\E[3m'), PosixPath('k6=\\E[17~'), PosixPath('it#8'), PosixPath('pa#64'), PosixPath('#3=\\E[2;2~'), PosixPath('FD=\\E[23;2~'), PosixPath('F5=\\E[1;2R'), PosixPath('#2=\\E[1;2H'), PosixPath('kb=\x7f'), PosixPath('kh=\\E[1~'), PosixPath('vi=\\E[?25l'), PosixPath('op=\\E[39;49m'), PosixPath('ct=\\E[3g'), PosixPath('cm=\\E[%i%d;%dH'), PosixPath('se=\\E[23m'), PosixPath('%c=\\E[6;2~'), PosixPath('kd=\\EOB'), PosixPath('FC=\\E[21;2~'), PosixPath('ms'), PosixPath('mi'), PosixPath('is=\\E)0'), PosixPath('po=\\E[5i'), PosixPath('al=\\E[L'), PosixPath('xn'), PosixPath('AL=\\E[%dL'), PosixPath('up=\\EM'), PosixPath('kN=\\E[6~'), PosixPath('us=\\E[4m'), PosixPath('ce=\\E[K'), PosixPath('FB=\\E[20;2~'), PosixPath('ks=\\E[?1h\\E='), PosixPath('sr=\\EM'), PosixPath('im=\\E[4h'), PosixPath('ei=\\E[4l'), PosixPath('nd=\\E[C'), PosixPath('sc=\\E7'), PosixPath('ke=\\E[?1l\\E>'), 
PosixPath('mr=\\E[7m'), PosixPath('DO=\\E[%dB'), PosixPath('dc=\\E[P'), PosixPath('k4=\\EOS'), PosixPath('vs=\\E[34l'), PosixPath('UP=\\E[%dA'), PosixPath('F8=\\E[17;2~'), PosixPath('pt'), PosixPath('%i=\\E[1;2C'), PosixPath('G0'), PosixPath('k1=\\EOP'), PosixPath('kP=\\E[5~'), PosixPath('kH=\\E[4~'), PosixPath('dl=\\E[M'), PosixPath('FA=\\E[19;2~'), PosixPath('li#41'), PosixPath('mb=\\E[5m'), PosixPath('kB=\\E[Z'), PosixPath('\\\n\t'), PosixPath('ti=\\E[?1049h'), PosixPath('km'), PosixPath('kr=\\EOC'), PosixPath('cd=\\E[J'), PosixPath('ho=\\E[H'), PosixPath('@7=\\E[4~'), PosixPath('nw=\\EE'), PosixPath('F1=\\E[23~'), PosixPath('*7=\\E[1;2F'), PosixPath('kD=\\E[3~'), PosixPath('LE=\\E[%dD'), PosixPath('le=^H'), PosixPath('ac=\\140\\140aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~..--++,,hhII00'), PosixPath('k0=\\E[10~'), PosixPath('k3=\\EOR'), PosixPath('rc=\\E8'), PosixPath('ta=^I'), PosixPath('F7=\\E[15;2~'), PosixPath('F3=\\E[1;2P'), PosixPath('AX'), PosixPath('st=\\EH'), PosixPath('vb=\\Eg'), PosixPath('cs=\\E[%i%d;%dr'), PosixPath('AF=\\E[3%dm'), PosixPath('ku=\\EOA'), PosixPath('RI=\\E[%dC'), PosixPath('me=\\E[m'), PosixPath('F9=\\E[18;2~'), PosixPath('am'), PosixPath('bt=\\E[Z'), PosixPath('FE=\\E[24;2~'), PosixPath('ae=\\E(B')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=121, Highest Compute Capability: 8.0.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Loading binary /home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda121.so...
/usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda121.so)
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=121
python setup.py install
Traceback (most recent call last):
  File "/home/optimus/conda/envs/py38/lib/python3.8/runpy.py", line 185, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/home/optimus/conda/envs/py38/lib/python3.8/runpy.py", line 144, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/home/optimus/conda/envs/py38/lib/python3.8/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
    from . import nn
  File "/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError: 
        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

And the same command with 0.41.3:

$ python -m bitsandbytes
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

+++++++++++++++++++ ANACONDA CUDA PATHS ++++++++++++++++++++
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/torch/lib/libtorch_cuda_linalg.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/torch/lib/libc10_cuda.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/autogptq_cuda_256.cpython-38-x86_64-linux-gnu.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda117.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda120_nocublaslt.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda111.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda122.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda111_nocublaslt.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda120.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda114.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda118_nocublaslt.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda115.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda121.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda115_nocublaslt.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda110.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda122_nocublaslt.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda114_nocublaslt.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda121_nocublaslt.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda110_nocublaslt.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda117_nocublaslt.so
/home/optimus/conda/envs/py38/lib/python3.8/site-packages/autogptq_cuda_64.cpython-38-x86_64-linux-gnu.so

++++++++++++++++++ /usr/local CUDA PATHS +++++++++++++++++++
/usr/local/cuda-12.2/targets/x86_64-linux/lib/libcudart.so
/usr/local/cuda-12.2/targets/x86_64-linux/lib/stubs/libcuda.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libcudart.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/stubs/libcuda.so
/usr/local/cuda-11.2/nsight-compute-2020.3.0/target/linux-desktop-glibc_2_19_0-ppc64le/libcuda-injection.so
/usr/local/cuda-11.2/nsight-compute-2020.3.0/target/linux-desktop-t210-a64/libcuda-injection.so
/usr/local/cuda-11.2/nsight-compute-2020.3.0/target/linux-desktop-glibc_2_11_3-x64/libcuda-injection.so
/usr/local/cuda-11.1/nsight-compute-2020.2.1/target/linux-desktop-glibc_2_19_0-ppc64le/libcuda-injection.so
/usr/local/cuda-11.1/nsight-compute-2020.2.1/target/linux-desktop-t210-a64/libcuda-injection.so
/usr/local/cuda-11.1/nsight-compute-2020.2.1/target/linux-desktop-glibc_2_11_3-x64/libcuda-injection.so
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudart.so
/usr/local/cuda-11.1/targets/x86_64-linux/lib/stubs/libcuda.so

+++++++++++++++ WORKING DIRECTORY CUDA PATHS +++++++++++++++

++++++++++++++++++ LD_LIBRARY CUDA PATHS +++++++++++++++++++
++++++ /home/optimus/local/cuda-11.8/lib64 CUDA PATHS ++++++

++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
COMPILED_WITH_CUDA = True
COMPUTE_CAPABILITIES_PER_GPU = ['8.0', '8.0', '8.0', '8.0', '8.0', '8.0']
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Running a quick check that:
    + library is importable
    + CUDA function is callable

WARNING: Please be sure to sanitize sensible info from any such env vars!

SUCCESS!
Installation was successful!
Windrider30 commented 9 months ago

I went through and uninstalled EVERYTHING, including Python, reinstalled step by step, and still hit the same issue. What is weird as hell is that on my main PC (same setup for everything) it works fine. Someone said it could be the way torch was loaded, but still no joy in Mudville, and I don't even know why it started having this issue in the first place. So far I am still getting the same error and am not sure what to do. Hell, I can't even run the bat file on auto111; it keeps locking the laptop up for some reason.

poedator commented 9 months ago

@Titus-von-Koeller, below is the command output. My system has Driver Version: 535.104.12, CUDA Driver Version: 12.2, and CUDA 12.1 in my conda env. Even if there is a mismatch or something else, what is different between 0.41.3 and 0.41.3.post1 that makes it fail?

ldd /home/optimus/conda/envs/py38/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda121.so
        linux-vdso.so.1 (0x00007fffe6987000)
        libcudart.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcudart.so.12 (0x00007f33bd897000)
        libcublas.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcublas.so.12 (0x00007f33b70d3000)
        libcublasLt.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcublasLt.so.12 (0x00007f3394186000)
        libcusparse.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcusparse.so.12 (0x00007f33843e6000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f338405d000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3383cbf000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f3383aa7000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f33836b6000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f33beaf3000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f33834b2000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f3383293000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f338308b000)
        libnvJitLink.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libnvJitLink.so.12 (0x00007f337fec3000)

strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX
GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBCXX_3.4.20
GLIBCXX_3.4.21
GLIBCXX_3.4.22
GLIBCXX_3.4.23
GLIBCXX_3.4.24
GLIBCXX_3.4.25
GLIBCXX_DEBUG_MESSAGE_LENGTH
Titus-von-Koeller commented 9 months ago

@poedator Ok, do I understand you correctly that you changed nothing else and 0.41.3.post1 failed, and that after you downgraded to 0.41.3 it works fully again? The only thing that (should have) changed are two lines in Params4bit, which have nothing to do with the setup logic at all. Please confirm this.

I will look at the ldd output with fresh eyes tomorrow.

@Windrider30 For you, could you please confirm that it still fails when you pin the bnb version that worked for you before?

I went through and uninstalled EVERYTHING, including Python, reinstalled step by step, and still hit the same issue

But to me this doesn't seem related to the Python + bnb install, but rather to the CUDA install or to bnb failing to find it.

Do computations work fine with vanilla PyTorch on GPU?
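A quick way to test this is a tiny matrix multiply on the GPU, checked against the CPU result. The sketch below assumes PyTorch is the only dependency; `check_torch_gpu` is a hypothetical helper name, and the status strings it returns are made up for illustration.

```python
def check_torch_gpu() -> str:
    """Return "no-torch", "no-cuda", "mismatch", or "ok" after a small GPU matmul."""
    try:
        import torch
    except ImportError:
        return "no-torch"
    if not torch.cuda.is_available():
        return "no-cuda"
    a = torch.randn(64, 64, device="cuda")
    b = torch.randn(64, 64, device="cuda")
    # Compare the GPU product against the same product computed on the CPU.
    gpu_result = (a @ b).cpu()
    cpu_result = a.cpu() @ b.cpu()
    if not torch.allclose(gpu_result, cpu_result, atol=1e-3):
        return "mismatch"
    return "ok"


if __name__ == "__main__":
    print("PyTorch GPU check:", check_torch_gpu())
```

If this prints `ok` while bitsandbytes still fails to load, the problem is more likely in how bnb locates or loads its CUDA binary than in the driver or PyTorch install.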

Did you go through the solution steps?

What I mean by this is the steps proposed in the debug output:

CUDA SETUP: CUDA detection failed! Possible reasons:

  1. You need to manually override the PyTorch CUDA version. Please see: "https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
  2. CUDA driver not installed
  3. CUDA not installed
  4. You have multiple conflicting CUDA libraries
  5. Required library not pre-compiled for this bitsandbytes release!
CUDA SETUP: If you compiled from source, try again with make CUDA_VERSION=DETECTED_CUDA_VERSION for example, make CUDA_VERSION=113.
CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via conda list | grep cuda.
================================================================================
CUDA SETUP: Problem: The main issue seems to be that the main CUDA runtime library was not detected.
CUDA SETUP: Solution 1: To solve the issue the libcudart.so location needs to be added to the LD_LIBRARY_PATH variable
CUDA SETUP: Solution 1a): Find the cuda runtime library via: find / -name libcudart.so 2>/dev/null
CUDA SETUP: Solution 1b): Once the library is in found add it to the LD_LIBRARY_PATH: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:FOUND_PATH_FROM_1a
CUDA SETUP: Solution 1c): For a permanent solution add the export from 1b into your .bashrc file, located at ~/.bashrc
CUDA SETUP: Solution 2: If no library was found in step 1a) you need to install CUDA.
CUDA SETUP: Solution 2a): Download CUDA install script: wget https://github.com/TimDettmers/bitsandbytes/blob/main/cuda_install.sh
CUDA SETUP: Solution 2b): Install desired CUDA version to desired location. The syntax is bash cuda_install.sh CUDA_VERSION PATH_TO_INSTALL_INTO.
CUDA SETUP: Solution 2b): For example, "bash cuda_install.sh 113 ~/local/" will download CUDA 11.3 and install into the folder ~/local
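Solution 1a/1b from the output above can be sketched in Python for anyone who prefers not to run `find` over the whole filesystem. This is an illustration only: `find_cudart_dirs` and `export_line` are hypothetical helpers, and `/usr/local` is just an example search root.

```python
import os
from pathlib import Path


def find_cudart_dirs(roots):
    """Return sorted directories under `roots` that contain a libcudart.so* file."""
    hits = set()
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            if any(name.startswith("libcudart.so") for name in filenames):
                hits.add(str(Path(dirpath)))
    return sorted(hits)


def export_line(directory: str) -> str:
    """Build the shell line from Solution 1b for one directory."""
    return f"export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:{directory}"


if __name__ == "__main__":
    # /usr/local is only an example root; searching "/" as in step 1a is slower.
    for directory in find_cudart_dirs(["/usr/local"]):
        print(export_line(directory))
```

When several directories are printed, only the one matching the CUDA version PyTorch was built with should be added, since conflicting CUDA libraries are one of the failure reasons listed above.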
poedator commented 9 months ago

@Titus-von-Koeller, this is correct. I only swapped between different versions of bnb in the same env.

I also tried to install both versions into my other env with the older cuda==11.8. Same results (post1 crashes). logs here

Is it possible that the two versions uploaded to PyPI were compiled differently?

I tried to install the latest bnb from source using the CUDA 11.8 env, and it worked fine. logs here

Titus-von-Koeller commented 9 months ago

Well, the only difference is that the very last one (post1) was compiled on my dev machine, according to Tim's instructions, whereas all the prior ones were compiled by Tim on his.

At this point, I would like to pull in @TimDettmers, even though he's pretty busy these days. Let's see if we can get a hold of him despite NeurIPS..

Titus-von-Koeller commented 9 months ago

I contacted him about it. Let's see.

Another weird thing is that you were the only one raising this since the release and I think we have about 3m downloads per month.

Titus-von-Koeller commented 9 months ago

Ok, Tim got back to me and has just pushed a new patch. @poedator, could you please verify if it is working for you now?

According to Tim, there seems to have been a compiler version mismatch between the g++ version I used for compiling and the GCC version available on the users' systems: The software was developed using GCC version 9.1. However, some users, particularly those on Ubuntu 18.04, might have older versions of GCC by default (e.g., GCC 7.4), leading to compatibility issues.

If the CUDA library is compiled against a newer C++ standard library (libstdc++) than what is available on the user's system, it results in runtime errors, like the GLIBCXX_3.4.26 one in your logs.
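To make the mismatch concrete: the binary asks for `GLIBCXX_3.4.26`, while the `strings` output earlier in the thread tops out at `GLIBCXX_3.4.25`. The sketch below compares the two and looks up which GCC release introduced each symbol version. The two table entries come from the libstdc++ ABI documentation and the table is deliberately not exhaustive; `can_load` and `introduced_by_gcc` are hypothetical helper names.

```python
# GLIBCXX symbol versions and the GCC release that introduced them
# (from the libstdc++ ABI documentation; deliberately not exhaustive).
GCC_BY_GLIBCXX = {
    (3, 4, 25): "8.1",
    (3, 4, 26): "9.1",
}


def parse(version: str) -> tuple:
    """Turn "3.4.26" into (3, 4, 26) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def can_load(required: str, available_max: str) -> bool:
    """A binary loads only if the system libstdc++ is at least as new as required."""
    return parse(available_max) >= parse(required)


def introduced_by_gcc(version: str):
    """Return the GCC release from the table for `version`, or None if unknown."""
    return GCC_BY_GLIBCXX.get(parse(version))


if __name__ == "__main__":
    required, available = "3.4.26", "3.4.25"  # values taken from the logs in this thread
    print("loadable:", can_load(required, available))
    print("required symbol introduced by GCC", introduced_by_gcc(required))
```

With the values from this thread the check reports that the binary is not loadable, which is consistent with a build made with GCC 9.1 being run against an older system libstdc++.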

I saw in your logs that you're on NixOS? I thought that means that such toolchains are more bleeding edge? In that light it's surprising to me that you'd have an older gcc.

Please let me know your thoughts and if Tim's republishing solved the issue.

poedator commented 9 months ago

Hi @Titus-von-Koeller, the fresh 0.41.3.post2 works OK. Thanks to you and @TimDettmers for the quick reaction. My system is a big shared server with Ubuntu 18.04 and tons of stuff. Perhaps it is time for a distro upgrade, yet there are so many reasons to postpone it.

github-actions[bot] commented 8 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.