microsoft / BitBLAS

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
MIT License

undefined symbol: ncclCommRegister #39

Closed robotzheng closed 3 weeks ago

robotzheng commented 1 month ago

NVIDIA-SMI 470.129.06 Driver Version: 470.129.06 CUDA Version: 12.3
python 3.10.14 ubuntu 22.04

pip install typing_extensions-4.11.0-py3-none-any.whl
pip install bitblas-0.0.1.dev5-py3-none-manylinux1_x86_64.whl

python -c "import bitblas; print(bitblas.__version__)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/__init__.py", line 19, in <module>
    from . import gpu  # noqa: F401
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/gpu/__init__.py", line 7, in <module>
    from .fallback import Fallback  # noqa: F401
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/gpu/fallback.py", line 28, in <module>
    from ..base import normalize_prim_func, try_inline
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/base/__init__.py", line 16, in <module>
    from .transform import ApplyDefaultSchedule, ApplyFastTuning
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/base/transform.py", line 20, in <module>
    from .utils import fast_tune, fast_tune_with_dynamic_range
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/base/utils.py", line 22, in <module>
    from bitblas.utils import tensor_replace_dp4a, tensor_remove_make_int4
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/utils/__init__.py", line 4, in <module>
    from .tensor_adapter import tvm_tensor_to_torch, lazy_tvm_tensor_to_torch, lazy_torch_to_tvm_tensor  # noqa: F401
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/utils/tensor_adapter.py", line 7, in <module>
    import torch
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/torch/__init__.py", line 237, in <module>
    from torch._C import *  # noqa: F403
ImportError: /opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so: undefined symbol: ncclCommRegister

xysmlx commented 1 month ago

Hi, this error comes from torch rather than BitBLAS, and it looks like an environment problem. CUDA 12.x requires driver version >= 525.60.13 (see CUDA compatibility), while your driver is 470.129.06. Please try upgrading the driver first.
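To make the driver requirement concrete, here is a minimal sketch that compares a driver version string against the 525.60.13 minimum quoted above. The function names are hypothetical helpers written for this illustration, not part of BitBLAS or torch; the `nvidia-smi` query is only run if you call `installed_driver_version()` yourself.

```python
import subprocess

# Minimum driver for CUDA 12.x as quoted in this thread (an input, not a lookup).
MIN_DRIVER_FOR_CUDA_12 = "525.60.13"


def parse_version(text):
    """Turn a dotted version such as '470.129.06' into (470, 129, 6)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_meets_minimum(driver, minimum=MIN_DRIVER_FOR_CUDA_12):
    """Return True if `driver` is at least `minimum`, compared numerically."""
    return parse_version(driver) >= parse_version(minimum)


def installed_driver_version():
    """Ask nvidia-smi for the driver version of the first listed GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()[0].strip()
```

With the driver reported in this issue, `driver_meets_minimum("470.129.06")` evaluates to `False`. On the affected machine, `driver_meets_minimum(installed_driver_version())` should give the same answer until the driver is upgraded.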

If the error persists after that, try downgrading torch to 2.1.2 with pip install torch==2.1.2, as suggested in this issue.
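If you want to check whether the NCCL library on your machine actually exports the missing symbol, something like the following sketch may help. It assumes the library can be found under the name `libnccl.so.2`, which may not hold for your setup (a pip-installed NCCL usually lives inside site-packages and would need a full path). `exports_symbol` is a hypothetical helper, not a torch or NCCL API.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if `library` loads and exports `symbol`, else False."""
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        # The library was not found or could not be loaded.
        return False
    # ctypes raises AttributeError for unknown symbols, which hasattr catches.
    return hasattr(handle, symbol)
```

For example, `exports_symbol("libnccl.so.2", "ncclCommRegister")` returning `False` would be consistent with the ImportError above, although a `False` result can also mean the library simply was not found under that name.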

You can also try the docker image nvcr.io/nvidia/pytorch:23.01-py3.

LeiWang1999 commented 3 weeks ago

Closed as the question has been answered.