bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index
MIT License

Could not load bitsandbytes native library: 'NoneType' object has no attribute 'split' #1212

Closed: jcalvoch closed this issue 4 months ago

jcalvoch commented 4 months ago

System Info

- bitsandbytes: 0.43.1
- Python: 3.10.12
- "CUDA" library: rocm-libs, version 6.0.0.60000-91~22.04
- OS: Ubuntu 22.04.1

I get the following error after the Mistral safetensors are downloaded:

Could not load bitsandbytes native library: 'NoneType' object has no attribute 'split'
Traceback (most recent call last):
  File "/home/user/Documents/python-env/jupyter-default/jupyter-default/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 109, in <module>
    lib = get_native_library()
  File "/home/user/Documents/python-env/jupyter-default/jupyter-default/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 88, in get_native_library
    cuda_specs = get_cuda_specs()
  File "/home/user/Documents/python-env/jupyter-default/jupyter-default/lib/python3.10/site-packages/bitsandbytes/cuda_specs.py", line 39, in get_cuda_specs
    cuda_version_string=(get_cuda_version_string()),
  File "/home/user/Documents/python-env/jupyter-default/jupyter-default/lib/python3.10/site-packages/bitsandbytes/cuda_specs.py", line 29, in get_cuda_version_string
    major, minor = get_cuda_version_tuple()
  File "/home/user/Documents/python-env/jupyter-default/jupyter-default/lib/python3.10/site-packages/bitsandbytes/cuda_specs.py", line 24, in get_cuda_version_tuple
    major, minor = map(int, torch.version.cuda.split("."))
AttributeError: 'NoneType' object has no attribute 'split'

CUDA Setup failed despite CUDA being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
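The last frame of the traceback above already names the culprit: torch.version.cuda is None, which is the case on ROCm and CPU-only builds of PyTorch. A quick check (a sketch, not part of the original report) shows which build is installed:

import torch

# On CUDA wheels torch.version.cuda is a string like "12.1";
# on ROCm wheels it is None and torch.version.hip is set instead.
print(torch.__version__)                    # e.g. "2.3.0+rocm6.0" on a ROCm build
print(torch.version.cuda)                   # None here, which triggers the .split() crash
print(getattr(torch.version, "hip", None))  # set on ROCm builds, None on CUDA builds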

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/user/Documents/python-env/jupyter-default/jupyter-default/lib/python3.10/site-packages/bitsandbytes/__main__.py", line 4, in <module>
    main()
  File "/home/user/Documents/python-env/jupyter-default/jupyter-default/lib/python3.10/site-packages/bitsandbytes/diagnostics/main.py", line 51, in main
    cuda_specs = get_cuda_specs()
  File "/home/user/Documents/python-env/jupyter-default/jupyter-default/lib/python3.10/site-packages/bitsandbytes/cuda_specs.py", line 39, in get_cuda_specs
    cuda_version_string=(get_cuda_version_string()),
  File "/home/user/Documents/python-env/jupyter-default/jupyter-default/lib/python3.10/site-packages/bitsandbytes/cuda_specs.py", line 29, in get_cuda_version_string
    major, minor = get_cuda_version_tuple()
  File "/home/user/Documents/python-env/jupyter-default/jupyter-default/lib/python3.10/site-packages/bitsandbytes/cuda_specs.py", line 24, in get_cuda_version_tuple
    major, minor = map(int, torch.version.cuda.split("."))
AttributeError: 'NoneType' object has no attribute 'split'
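Both the load failure and the diagnostic command die on the same line: get_cuda_version_tuple() in cuda_specs.py splits torch.version.cuda without guarding against None. A defensive variant of that helper (a sketch of the idea, not the library's actual code) might look like:

from typing import Optional, Tuple

import torch

def get_cuda_version_tuple() -> Optional[Tuple[int, int]]:
    # torch.version.cuda is None on ROCm and CPU-only builds,
    # so bail out instead of calling .split() on None.
    if torch.version.cuda is None:
        return None
    major, minor = map(int, torch.version.cuda.split(".")[:2])
    return major, minor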

Reproduction

It reproduces consistently with the following code:

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    GenerationConfig,
    pipeline,
)

MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.1"

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=True)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map="auto",
    quantization_config=quantization_config,
)

generation_config = GenerationConfig.from_pretrained(MODEL_NAME)
generation_config.max_new_tokens = 1024      # maximum number of new tokens the model may generate
generation_config.temperature = 0.7          # randomness of the generated text
generation_config.top_p = 0.95               # diversity of the generated text
generation_config.do_sample = True           # enable sampling during generation
generation_config.repetition_penalty = 1.15  # how strongly the model avoids repeating tokens

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=True,
    generation_config=generation_config,
)
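If loading succeeded, the pipeline would then be invoked roughly like this (a hypothetical usage example; the prompt is made up, and execution never gets this far because the crash happens inside from_pretrained):

# Hypothetical invocation once the pipeline initializes.
prompt = "Explain NF4 quantization in one sentence."
result = pipe(prompt)
print(result[0]["generated_text"])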

Expected behavior

To be able to load the model and initialize the pipeline.

jcalvoch commented 4 months ago

This is most likely because I am using an AMD GPU. Sorry for the unnecessary issue.
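For anyone hitting this on a similar ROCm setup: bitsandbytes 0.43.1 ships CUDA-only native binaries, so one pragmatic workaround is to request 4-bit quantization only when PyTorch reports a CUDA build and otherwise load the model unquantized. A sketch under that assumption, reusing the model name from the reproduction above:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Only request bitsandbytes 4-bit quantization on a CUDA build of PyTorch;
# on ROCm/CPU builds fall back to a plain fp16 load.
use_bnb = torch.version.cuda is not None

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    torch_dtype=torch.float16,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True) if use_bnb else None,
)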