bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index
MIT License

libstdc++.so.6 error #1139

Open a-turcu opened 8 months ago

a-turcu commented 8 months ago

System Info

bitsandbytes==0.43.0 torch==2.1.0+cu121 Linux

Reproduction

import bitsandbytes returns the following error stack:

Error trace:

```
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/nvidia/lib64')}
The following directories listed in your path were found to be non-existent: {PosixPath('/opt/dataiku-dss-12.3.2/spark-standalone-home')}
The following directories listed in your path were found to be non-existent: {PosixPath('/opt/dataiku-dss-12.3.2')}
The following directories listed in your path were found to be non-existent: {PosixPath('tcp'), PosixPath('//172.20.0.1'), PosixPath('443')}
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/bin/R')}
The following directories listed in your path were found to be non-existent: {PosixPath('+PrintGCTimeStamps '), PosixPath('/dev/stderr -XX'), PosixPath('-Xmx2g -XX'), PosixPath('+UseG1GC -Xloggc')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/jupyter-run/jupyter/runtime')}
The following directories listed in your path were found to be non-existent: {PosixPath('/opt/dataiku-dss-12.3.2/spark-standalone-home/python/lib/py4j-0.10.9.7-src.zip'), PosixPath('/opt/dataiku-dss-12.3.2/spark-standalone-home/python')}
The following directories listed in your path were found to be non-existent: {PosixPath('-ea -Dfile.encoding=utf8 -Djava.awt.headless=true -Djava.io.tmpdir=/data/dataiku/dss_data/tmp -Djava.security.egd=file'), PosixPath('/dev/urandom -Djdk.http.auth.tunneling.disabledSchemes= -Djdk.http.auth.proxying.disabledSchemes=')}
The following directories listed in your path were found to be non-existent: {PosixPath('https'), PosixPath('//artifacts.kpn.org/api/pypi/pypi/simple')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/bin/fek')}
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/bin/R')}
The following directories listed in your path were found to be non-existent: {PosixPath('ParallelGCThreads=8 -Xloggc'), PosixPath('+PrintGCDetails -XX'), PosixPath('-Xmx2g -XX'), PosixPath('/dev/stderr -XX'), PosixPath('+UseParallelGC -XX'), PosixPath('+PrintGCTimeStamps ')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/bin/python')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/R.lib')}
The following directories listed in your path were found to be non-existent: {PosixPath('/opt/dataiku-dss-12.3.2/hadoop-standalone-libs/*'), PosixPath('/opt/dataiku-dss-12.3.2/lib/shims/*')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/code-envs/julia')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/run')}
The following directories listed in your path were found to be non-existent: {PosixPath('ParallelGCThreads=8 -Xloggc'), PosixPath('+PrintGCDetails -XX'), PosixPath('-Xmx2g -XX'), PosixPath('/dev/stderr -XX'), PosixPath('+UseParallelGC -XX'), PosixPath('+PrintGCTimeStamps ')}
The following directories listed in your path were found to be non-existent: {PosixPath('/opt/dataiku-dss-12.3.2/R/4.x'), PosixPath('/data/dataiku/dss_data/R.lib')}
The following directories listed in your path were found to be non-existent: {PosixPath('() { ( alias;\n eval ${which_declare} ) | /usr/bin/which --tty-only --read-alias --read-functions --show-tilde --show-dot $@\n}')}
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/bin/java')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/bin/dku')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/jupyter-run/ipython')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/bin/cak')}
The following directories listed in your path were found to be non-existent: {PosixPath('ParallelGCThreads=8 -Xloggc'), PosixPath('+PrintGCDetails -XX'), PosixPath('-Xmx2g -XX'), PosixPath('/dev/stderr -XX'), PosixPath('+UseParallelGC -XX'), PosixPath('+PrintGCTimeStamps ')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/jupyter-run/jupyter')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/bin/hproxy')}
The following directories listed in your path were found to be non-existent: {PosixPath('//debuginfod.centos.org/ '), PosixPath('https')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/bin/python')}
The following directories listed in your path were found to be non-existent: {PosixPath('ParallelGCThreads=8 -Xloggc'), PosixPath('+PrintGCDetails -XX'), PosixPath('-Xmx2g -XX'), PosixPath('/dev/stderr -XX'), PosixPath('+UseParallelGC -XX'), PosixPath('+PrintGCTimeStamps ')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/jupyter-run/jupyter')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/run/svd.sock'), PosixPath('unix')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/bin/jek')}
The following directories listed in your path were found to be non-existent: {PosixPath('+PrintGCTimeStamps '), PosixPath('+UseG1GC -Xloggc'), PosixPath('/dev/stderr -XX'), PosixPath('-Xmx16g -XX')}
The following directories listed in your path were found to be non-existent: {PosixPath('/data/dataiku/dss_data/tmp')}
The following directories listed in your path were found to be non-existent: {PosixPath('tcp'), PosixPath('//172.20.0.1'), PosixPath('443')}
The following directories listed in your path were found to be non-existent: {PosixPath('module'), PosixPath('//ipykernel.pylab.backend_inline')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so.11.0')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=121, Highest Compute Capability: 8.6.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Loading binary /opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda121.so...
/usr/lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda121.so)
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=121 python setup.py install
/opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

  warn(msg)
/opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: /usr/local/nvidia/lib64:/usr/local/cuda/lib64 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/opt/dataiku/code-env/lib/python3.9/site-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
   1389         try:
-> 1390             return importlib.import_module("." + module_name, self.__name__)
   1391         except Exception as e:

/usr/local/lib/python3.9/importlib/__init__.py in import_module(name, package)
    126             level += 1
--> 127     return _bootstrap._gcd_import(name[level:], package, level)
    128

/usr/local/lib/python3.9/importlib/_bootstrap.py in _gcd_import(name, package, level)

/usr/local/lib/python3.9/importlib/_bootstrap.py in _find_and_load(name, import_)

/usr/local/lib/python3.9/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)

/usr/local/lib/python3.9/importlib/_bootstrap.py in _load_unlocked(spec)

/usr/local/lib/python3.9/importlib/_bootstrap_external.py in exec_module(self, module)

/usr/local/lib/python3.9/importlib/_bootstrap.py in _call_with_frames_removed(f, *args, **kwds)

/opt/dataiku/code-env/lib/python3.9/site-packages/transformers/integrations/bitsandbytes.py in <module>
     10 if is_bitsandbytes_available():
---> 11     import bitsandbytes as bnb
     12     import torch

/opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/__init__.py in <module>
      5
----> 6 from . import cuda_setup, utils, research
      7 from .autograd._functions import (

/opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/research/__init__.py in <module>
----> 1 from . import nn
      2 from .autograd._functions import (
      3     switchback_bnb,

/opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/research/nn/__init__.py in <module>
----> 1 from .modules import LinearFP8Mixed, LinearFP8Global

/opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/research/nn/modules.py in <module>
      7 import bitsandbytes as bnb
----> 8 from bitsandbytes.optim import GlobalOptimManager
      9 from bitsandbytes.utils import OutlierTracer, find_outlier_dims

/opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/optim/__init__.py in <module>
      5
----> 6 from bitsandbytes.cextension import COMPILED_WITH_CUDA
      7

/opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/cextension.py in <module>
     19     CUDASetup.get_instance().print_log_stack()
---> 20     raise RuntimeError('''
     21     CUDA Setup failed despite GPU being available. Please run the following command to get more information:

RuntimeError:
    CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

    Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
    to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
    and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
 in ()
      4
      5 # Load model
----> 6 model = AutoModelForCausalLM.from_pretrained(LLM,
      7     device_map="auto",
      8     trust_remote_code=False,

/opt/dataiku/code-env/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    559         elif type(config) in cls._model_mapping.keys():
    560             model_class = _get_model_class(config, cls._model_mapping)
--> 561             return model_class.from_pretrained(
    562                 pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    563             )

/opt/dataiku/code-env/lib/python3.9/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
   3387
   3388         if hf_quantizer is not None:
-> 3389             hf_quantizer.preprocess_model(
   3390                 model=model, device_map=device_map, keep_in_fp32_modules=keep_in_fp32_modules
   3391             )

/opt/dataiku/code-env/lib/python3.9/site-packages/transformers/quantizers/base.py in preprocess_model(self, model, **kwargs)
    164         model.is_quantized = True
    165         model.quantization_method = self.quantization_config.quant_method
--> 166         return self._process_model_before_weight_loading(model, **kwargs)
    167
    168     def postprocess_model(self, model: "PreTrainedModel", **kwargs):

/opt/dataiku/code-env/lib/python3.9/site-packages/transformers/quantizers/quantizer_awq.py in _process_model_before_weight_loading(self, model, **kwargs)
     75
     76     def _process_model_before_weight_loading(self, model: "PreTrainedModel", **kwargs):
---> 77         from ..integrations import get_keys_to_not_convert, replace_with_awq_linear
     78
     79         self.modules_to_not_convert = get_keys_to_not_convert(model)

/usr/local/lib/python3.9/importlib/_bootstrap.py in _handle_fromlist(module, fromlist, import_, recursive)

/opt/dataiku/code-env/lib/python3.9/site-packages/transformers/utils/import_utils.py in __getattr__(self, name)
   1378             value = self._get_module(name)
   1379         elif name in self._class_to_module.keys():
-> 1380             module = self._get_module(self._class_to_module[name])
   1381             value = getattr(module, name)
   1382         else:

/opt/dataiku/code-env/lib/python3.9/site-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
   1390             return importlib.import_module("." + module_name, self.__name__)
   1391         except Exception as e:
-> 1392             raise RuntimeError(
   1393                 f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
   1394                 f" traceback):\n{e}"

RuntimeError: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):

    CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

    Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
    to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
    and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
```

Expected behavior

Part of the error stack says:

CUDA SETUP: Loading binary /opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda121.so...
/usr/lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /opt/dataiku/code-env/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda121.so)

I read that this might be solved if I update libstdc++.so.6, but it is really not possible in my case. I also cannot install from source or change the libstdc++.so.6 file in any way. Has anyone ever encountered this?
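For anyone who, like the reporter, can only reach the machine through a Python notebook, the mismatch in the two lines above can be inspected without a terminal. The following is a minimal sketch, not part of bitsandbytes: it scans a binary's raw bytes for version strings such as `CXXABI_1.3.9` (roughly what `strings libstdc++.so.6 | grep CXXABI` would show) and reports which versions the bitsandbytes library mentions that the system `libstdc++.so.6` does not. The helper names `symbol_versions` and `missing_versions` are hypothetical, the two paths are the ones from the error message, and a byte scan is only a heuristic, not a real ELF parser.

```python
import re
from pathlib import Path


def symbol_versions(path, prefix=b"CXXABI_"):
    """Return the set of numeric versions (as int tuples) that follow `prefix` in a binary.

    Heuristic: scans raw bytes, so it sees any matching string in the file,
    not only entries in the ELF symbol-version tables.
    """
    data = Path(path).read_bytes()
    pattern = re.compile(re.escape(prefix) + rb"(\d+(?:\.\d+)*)")
    return {
        tuple(int(part) for part in match.group(1).decode().split("."))
        for match in pattern.finditer(data)
    }


def missing_versions(library, system_lib, prefix=b"CXXABI_"):
    """Versions mentioned by `library` that `system_lib` does not contain, sorted."""
    return sorted(symbol_versions(library, prefix) - symbol_versions(system_lib, prefix))


if __name__ == "__main__":
    # Paths taken from the error message above; adjust for your environment.
    system_lib = Path("/usr/lib64/libstdc++.so.6")
    bnb_lib = Path(
        "/opt/dataiku/code-env/lib/python3.9/site-packages/"
        "bitsandbytes/libbitsandbytes_cuda121.so"
    )
    if system_lib.exists() and bnb_lib.exists():
        for version in missing_versions(bnb_lib, system_lib):
            print("missing CXXABI_" + ".".join(map(str, version)))
    else:
        print("One of the libraries was not found; adjust the paths.")
```

If the script prints `missing CXXABI_1.3.9`, the system C++ runtime is older than the one the wheel was built against, which matches the loader error. Passing `prefix=b"GLIBCXX_"` runs the same comparison for the other libstdc++ version namespace.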

matthewdouglas commented 8 months ago

What OS are you running on? The bitsandbytes v0.43.0 wheels require glibc 2.24, and it seems like you might be closer to ~2.17. It's interesting that pip installed the wheel when it is marked manylinux_2_24. Can you tell us what version of pip you have?

The other interesting thing I see is that your logs indicate attempts to load CUDA 11.8 in one trace and 12.1 in another. Can you confirm the versions of PyTorch and bitsandbytes?

a-turcu commented 8 months ago

@matthewdouglas I am using CentOS Linux 7 (Core) and the glibc is indeed 2.17. The pip version is 23.3.1. As for the different CUDA versions, I made a mistake when pasting the output. The CUDA version is 12.1. I have corrected the post.

a-turcu commented 8 months ago

I have also tried CUDA 11.8 and I got the same error.
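The glibc question above can be checked from a notebook as well. The sketch below assumes the standard manylinux naming convention, where a `manylinux_X_Y` platform tag means "built against glibc X.Y or newer" and the legacy aliases `manylinux1`, `manylinux2010`, and `manylinux2014` correspond to glibc 2.5, 2.12, and 2.17. The function names are hypothetical, and the code only compares version numbers; it does not explain why pip accepted the wheel in this case.

```python
import platform
import re

# Legacy manylinux aliases and the glibc version each one implies.
LEGACY_TAGS = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}


def glibc_version():
    """Return the running glibc version as (major, minor), or None if unavailable."""
    name, version = platform.libc_ver()
    if name != "glibc" or not version:
        return None
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def required_glibc(platform_tag):
    """Map a wheel platform tag such as 'manylinux_2_24_x86_64' to its minimum glibc."""
    match = re.match(r"manylinux_(\d+)_(\d+)", platform_tag)
    if match:
        return int(match.group(1)), int(match.group(2))
    for legacy, version in LEGACY_TAGS.items():
        if platform_tag.startswith(legacy):
            return version
    return None  # not a manylinux tag


def wheel_fits(platform_tag, glibc):
    """True if a system with the given glibc meets the tag's minimum requirement."""
    needed = required_glibc(platform_tag)
    return needed is not None and glibc is not None and glibc >= needed


if __name__ == "__main__":
    current = glibc_version()
    print("glibc:", current)
    print("manylinux_2_24 wheel fits:", wheel_fits("manylinux_2_24_x86_64", current))
```

On the CentOS 7 machine described here (glibc 2.17), `wheel_fits("manylinux_2_24_x86_64", (2, 17))` is false, while a `manylinux2014` build would pass the same check.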

matthewdouglas commented 8 months ago

@a-turcu It's important to note that CentOS 7 will reach EOL in about 100 days.

It should be possible to compile for a system that old. You don't have to compile on the same host, so if you have another one capable of running Docker you could give that a try. You would want to start from an image like quay.io/pypa/manylinux2014_x86_64, ensure gcc >= 6, add the CUDA Toolkit, and build. From there you'd want to copy the resulting libbitsandbytes_cuda121.so to your target machine.

Titus-von-Koeller commented 8 months ago

It's a bit of a trade-off which manylinux standard we're aiming to support with BNB. It comes at a cost for us and there are very few users who this affects.

Currently, we're supporting manylinux_2_24, which already goes pretty far back for old distros.

Please check out the GitHub repo on that machine, build from source, and run pip install -e . in the cloned repo dir.

It would be great if you could report back to us if that solved your problem, just to rule out that we're on the wrong track with the manylinux stuff: might be a red herring.

Your feedback would help us make BNB better. Thanks!

a-turcu commented 8 months ago

Thank you for your replies. I hate to let you down, but I should give you more details about the situation. I am using the online data science platform Dataiku to work on my project, from a Windows computer. The machine running CentOS is on a Kubernetes pod, and the only way I can modify anything about it is through a pip requirements file (company regulations). Also, I can only download packages from PyPI. That being said, I figured out that I have read access to the CentOS machine through a Python notebook, as Dataiku does not offer a terminal (horrible platform). I only use the CentOS machine because it has a GPU. I have asked the people responsible for the Kubernetes pods to help me with the issue, and I will also forward them your advice. So, I don't think I can do this

> @a-turcu It's important to note that CentOS 7 will reach EOL in about 100 days.
>
> It should be possible to compile for a system that old. You don't have to compile on the same host, so if you have another one capable of running Docker you could give that a try. You would want to start from an image like quay.io/pypa/manylinux2014_x86_64, ensure gcc >= 6, add the CUDA Toolkit, and build. From there you'd want to copy the resulting libbitsandbytes_cuda121.so to your target machine.

or this

> It's a bit of a trade-off which manylinux standard we're aiming to support with BNB. It comes at a cost for us and there are very few users who this affects.
>
> Currently, we're supporting manylinux_2_24, which already goes pretty far back for old distros.
>
> Please check out the GitHub repo on that machine, build from source, and run pip install -e . in the cloned repo dir.
>
> It would be great if you could report back to us if that solved your problem, just to rule out that we're on the wrong track with the manylinux stuff: might be a red herring.
>
> Your feedback would help us make BNB better. Thanks!

on my own. I am not incredibly proficient with Linux or Docker images, but, from your replies, I understand that both the OS and gcc should be updated.

cemiu commented 6 months ago

I also ran into that using RHEL 7 (on a managed HPC system I have no ability to upgrade) with glibc 2.17 (from ldd --version)

I managed to build from source, but I later discovered that the conda-forge distribution of bitsandbytes is built with glibc 2.17+ support (instead of pip's 2.24+).

So FYI, if you're running a legacy distribution and cannot build or want to save some time, try conda-forge instead.

amtam0 commented 6 months ago

Hi @cemiu, I'm using the same distribution (Dataiku EKS, CentOS 7, glibc 2.17), with CUDA version 11.2 and gcc 4.8.5. Could you please share the Dockerfile or code you used to build from source? Thanks

cemiu commented 6 months ago

Hi @amtam0, I followed the build instructions in the documentation; without containers, just a pip environment.

Just make sure to have appropriate versions of gcc, cmake, and CUDA, and you should be golden.

Orrrr, give anaconda/miniconda a shot.
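Before attempting a source build, it may help to see which toolchain versions the environment actually has. This is a rough sketch, assuming `gcc`, `cmake`, and `nvcc` are on `PATH` and each accepts `--version`; the helper names are hypothetical. The only threshold used is the `gcc >= 6` figure mentioned earlier in this thread; the cmake and CUDA requirements should be taken from the bitsandbytes build documentation rather than from this snippet.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Return the first dotted version number in `text` as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups() if group is not None)


def tool_version(command, flag="--version"):
    """Run `command --version` and parse the version; None if the tool is missing."""
    executable = shutil.which(command)
    if executable is None:
        return None
    result = subprocess.run(
        [executable, flag], capture_output=True, text=True, check=False
    )
    return parse_version(result.stdout + result.stderr)


if __name__ == "__main__":
    for tool in ("gcc", "cmake", "nvcc"):
        print(f"{tool}: {tool_version(tool)}")

    gcc = tool_version("gcc")
    # Threshold from the maintainer's comment in this thread (gcc >= 6).
    if gcc is not None and gcc < (6,):
        print("gcc is older than 6; a newer compiler is needed to build.")
```

With the gcc 4.8.5 reported above, the final check would print the warning, which is consistent with needing a newer compiler than the stock CentOS 7 one.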

amtam0 commented 6 months ago

Thanks @cemiu. I don't have an admin role to run these commands, so I will forward this to our platform admins. I'm also curious how you did it with conda (I'm not prioritizing this method because Dataiku does not recommend using conda envs).