bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index
MIT License

CUDA Setup failed despite GPU being available. Please run the following command to get more information: python -m bitsandbytes #1144

Open tejarao1156 opened 5 months ago

tejarao1156 commented 5 months ago

System Info

Google Colab Pro, V100 GPU

Reproduction

===================================BUG REPORT===================================
================================================================================
The following directories listed in your path were found to be non-existent: {PosixPath('/sys/fs/cgroup/memory.events /var/colab/cgroup/jupyter-children/memory.events')}
The following directories listed in your path were found to be non-existent: {PosixPath('8013'), PosixPath('http'), PosixPath('//172.28.0.1')}
The following directories listed in your path were found to be non-existent: {PosixPath('--logtostderr --listen_host=172.28.0.12 --target_host=172.28.0.12 --tunnel_background_save_url=https'), PosixPath('//colab.research.google.com/tun/m/cc48301118ce562b961b3c22d803539adc1e0c19/gpu-v100-hm-h2ery00lgftj --tunnel_background_save_delay=10s --tunnel_periodic_background_save_frequency=30m0s --enable_output_coalescing=true --output_coalescing_required=true')}
The following directories listed in your path were found to be non-existent: {PosixPath('/datalab/web/pyright/typeshed-fallback/stdlib,/usr/local/lib/python3.10/dist-packages')}
The following directories listed in your path were found to be non-existent: {PosixPath('/env/python')}
The following directories listed in your path were found to be non-existent: {PosixPath('//ipykernel.pylab.backend_inline'), PosixPath('module')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=117, Highest Compute Capability: 7.0.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Loading binary /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda117_nocublaslt.so...
libcusparse.so.11: cannot open shared object file: No such file or directory
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=117 make cuda11x_nomatmul
python setup.py install
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:183: UserWarning: Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

  warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:183: UserWarning: /usr/lib64-nvidia did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0', 'libcudart.so.12.1', 'libcudart.so.12.2'] as expected! Searching further paths...
  warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:183: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!                     If you run into issues with 8-bit matmul, you can try 4-bit quantization: https://huggingface.co/blog/4bit-transformers-bitsandbytes
  warn(msg)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
   1471         try:
-> 1472             return importlib.import_module("." + module_name, self.__name__)
   1473         except Exception as e:

22 frames
RuntimeError: 
        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
   1472             return importlib.import_module("." + module_name, self.__name__)
   1473         except Exception as e:
-> 1474             raise RuntimeError(
   1475                 f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
   1476                 f" traceback):\n{e}"

RuntimeError: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):

        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
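
For reference, the error message's own suggestion (locate the CUDA libraries and add them to LD_LIBRARY_PATH) can be tried directly. A minimal sketch, assuming the toolkit lives under /usr/local/cuda as the log above indicates:

```bash
# Look for the library the loader could not find (libcusparse.so.11 above).
find /usr/local/cuda* -name 'libcusparse.so*' 2>/dev/null

# If a matching directory turns up, expose it to the dynamic linker and retry.
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
python -m bitsandbytes
```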

Expected behavior

The code should execute without the CUDA setup error.

s-ravi18 commented 5 months ago

I am getting the same error. Can anyone suggest a fix?

johnga1995 commented 5 months ago

I had this issue while trying to get Kohya_ss setup and managed to solve it.

My system has CUDA toolkit version 12.2 (NVIDIA-SMI 535.161.07, Driver Version: 535.161.07, CUDA Version: 12.2). If you have a lower driver version (e.g. ~320 with CUDA 11.8), it should work just by installing bitsandbytes with pip. But I could not downgrade my driver version, and the default installation of bitsandbytes does not seem to work with CUDA 12.2.
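
To check which driver/CUDA pairing a system reports, the `nvidia-smi` header shows both:

```bash
nvidia-smi   # header line lists Driver Version and the CUDA version the driver supports
```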

So I managed to fix it with the help of https://github.com/TimDettmers/bitsandbytes/issues/551#issuecomment-1621292363

I did have to do a few extra steps.

Basically, I made sure to uninstall bitsandbytes and bitsandbytes_windows from my venv environment and built bitsandbytes from source (as shown in the link above).

Since I had CUDA 12.2, I had to specify the CUDA version in the build process (refer to the link) as CUDA_VERSION=122 make cuda12x.
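
A rough sketch of the uninstall-and-rebuild sequence, assuming the older make-based build system that the linked issue and the log above both use:

```bash
# Remove any existing (broken) installs from the venv first.
pip uninstall -y bitsandbytes bitsandbytes_windows

# Build from source for CUDA 12.2.
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=122 make cuda12x   # produces libbitsandbytes_cuda122.so
python setup.py install
```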

This generated the file libbitsandbytes_cuda122.so, which I had to copy into the venv's Python library path for bitsandbytes.
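
The copy step looked roughly like this; the destination path is only an example, so adjust it to wherever pip placed bitsandbytes in your venv:

```bash
# Find the install location first (the "Location:" field).
pip show bitsandbytes | grep Location

# Example destination path; replace with your own venv's site-packages.
cp libbitsandbytes_cuda122.so \
   /path/to/venv/lib/python3.10/site-packages/bitsandbytes/
```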

Also, in the venv, I had to set environment variables with export BNB_CUDA_VERSION=122 and export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-12.2/targets/x86_64-linux/lib/stubs (note that these values are specific to CUDA 12.2).

Finally, after running python -m bitsandbytes, I did not get any more errors and got an "Installation was successful!" message.
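
Putting the last steps together (the values assume CUDA 12.2, as noted above):

```bash
export BNB_CUDA_VERSION=122
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-12.2/targets/x86_64-linux/lib/stubs

# Verify: this should end with "Installation was successful!"
python -m bitsandbytes
```

Adding the two exports to the venv's activate script keeps them set across sessions.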

deep-pipeline commented 5 months ago

@tejarao1156 and @s-ravi18, did the above advice help you? If so, please say so (or clarify any additional steps) so that this issue can be closed.