ROCm / bitsandbytes

8-bit CUDA functions for PyTorch
MIT License

error: cannot load libbitsandbytes_cpu.so, but I have libbitsandbytes_hip.so #35

Open guangzlu opened 1 month ago

guangzlu commented 1 month ago

System Info

Docker image: rocm6.1_ubuntu22.04_py3.10_pytorch_2.4
ROCm: 6.1.0
PyTorch: 2.4
GPU: MI250

Reproduction

Install method:

git clone --recurse https://github.com/ROCm/bitsandbytes
cd bitsandbytes
git checkout rocm_enabled
pip install -r requirements-dev.txt
cmake -DCOMPUTE_BACKEND=hip -S .  # Use -DBNB_ROCM_ARCH="gfx90a;gfx942" to target a specific GPU arch
make
pip install .

Python script:

import bitsandbytes

Expected behavior

I am following this blog post, https://rocm.blogs.amd.com/artificial-intelligence/llama2-lora/README.html, to fine-tune on an MI250. After I installed bitsandbytes from source and ran the Python script, the import failed because it could not find libbitsandbytes_cpu.so:

OSError: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: cannot open shared object file: No such file or directory

However, when I looked in /opt/conda/envs/py_3.10/lib/python3.10/site-packages/bitsandbytes, I found that libbitsandbytes_hip.so is there. Is it trying to load the wrong .so file? How can I fix this?

pnunna93 commented 4 weeks ago

It looks like the torch build in the Docker image is for ROCm 6.0. Please reinstall torch for ROCm 6.1 using the command below, then install bitsandbytes again.

pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.1/
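
To confirm which ROCm version the installed torch build targets, and which native libraries actually ended up in the bitsandbytes package directory, a quick check along these lines may help. This is only a sketch: the helper names list_native_libs and torch_hip_version are hypothetical and not part of bitsandbytes, and it assumes the libraries follow the libbitsandbytes_*.so naming seen in the error message. torch.version.hip is None on torch builds without ROCm support.

```python
import importlib.util
from pathlib import Path


def list_native_libs(package_dir):
    """Return the sorted names of libbitsandbytes_*.so files in package_dir."""
    return sorted(p.name for p in Path(package_dir).glob("libbitsandbytes_*.so"))


def torch_hip_version():
    """Return torch.version.hip, or None if torch is unavailable or not a ROCm build."""
    try:
        import torch
    except (ImportError, OSError):
        return None
    return getattr(torch.version, "hip", None)


def report():
    """Print the torch HIP version and the native libraries found for bitsandbytes."""
    print("torch HIP version:", torch_hip_version())
    # find_spec locates the package without importing it, so this works
    # even when `import bitsandbytes` itself fails.
    spec = importlib.util.find_spec("bitsandbytes")
    if spec is None or not spec.submodule_search_locations:
        print("bitsandbytes is not installed in this environment")
        return
    for location in spec.submodule_search_locations:
        print(location, "->", list_native_libs(location))


report()
```

If the reported HIP version does not start with 6.1, the torch build and the installed ROCm release are likely mismatched, which is the situation described above.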