robotzheng closed this 3 weeks ago
Hi, this error comes from torch and looks like an environment problem. CUDA 12.x requires driver version >= 525.60.13 (see CUDA compatibility), while your nvidia-smi output shows driver 470.129.06. Please try upgrading the driver first.
If the problem persists, you can try downgrading torch to 2.1.2 with pip install torch==2.1.2, as suggested in this issue.
You can also try the Docker image nvcr.io/nvidia/pytorch:23.01-py3.
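The driver requirement quoted in the reply can be checked with a short script. This is only a sketch: the minimum version is the one stated in the reply, the helper names (`parse_version`, `driver_supports_cuda12`, `installed_driver_version`) are made up for illustration, and it assumes `nvidia-smi` is on the PATH.

```python
import subprocess

# Minimum Linux driver for CUDA 12.x, as quoted in the reply above.
MIN_DRIVER_CUDA12 = (525, 60, 13)


def parse_version(text):
    """Turn a dotted version string such as '470.129.06' into (470, 129, 6)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports_cuda12(driver_version, minimum=MIN_DRIVER_CUDA12):
    """Compare the driver version against the minimum, component by component."""
    return parse_version(driver_version) >= minimum


def installed_driver_version():
    """Ask nvidia-smi for the driver version of the first GPU.

    Returns None if nvidia-smi is missing or fails.
    """
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = out.strip().splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    version = installed_driver_version()
    if version is None:
        print("Could not query the driver version with nvidia-smi.")
    elif driver_supports_cuda12(version):
        print(f"Driver {version} meets the CUDA 12.x minimum.")
    else:
        print(f"Driver {version} is older than the CUDA 12.x minimum.")
```

With the driver reported in this issue (470.129.06), `driver_supports_cuda12` returns False, which matches the suggestion to upgrade.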
Closed as the question has been answered.
NVIDIA-SMI 470.129.06 Driver Version: 470.129.06 CUDA Version: 12.3
python 3.10.14 ubuntu 22.04
pip install typing_extensions-4.11.0-py3-none-any.whl
pip install bitblas-0.0.1.dev5-py3-none-manylinux1_x86_64.whl
python -c "import bitblas; print(bitblas.__version__)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/__init__.py", line 19, in <module>
    from . import gpu # noqa: F401
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/gpu/__init__.py", line 7, in <module>
    from .fallback import Fallback # noqa: F401
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/gpu/fallback.py", line 28, in <module>
    from ..base import normalize_prim_func, try_inline
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/base/__init__.py", line 16, in <module>
    from .transform import ApplyDefaultSchedule, ApplyFastTuning
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/base/transform.py", line 20, in <module>
    from .utils import fast_tune, fast_tune_with_dynamic_range
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/base/utils.py", line 22, in <module>
    from bitblas.utils import tensor_replace_dp4a, tensor_remove_make_int4
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/utils/__init__.py", line 4, in <module>
    from .tensor_adapter import tvm_tensor_to_torch, lazy_tvm_tensor_to_torch, lazy_torch_to_tvm_tensor # noqa: F401
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/bitblas/utils/tensor_adapter.py", line 7, in <module>
    import torch
  File "/opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/torch/__init__.py", line 237, in <module>
    from torch._C import * # noqa: F403
ImportError: /opt/conda/envs/EN_BITNET/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so: undefined symbol: ncclCommRegister
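Since the import fails on the undefined symbol ncclCommRegister, one way to narrow this down is to check whether the NCCL library the dynamic loader finds actually exports that symbol. The sketch below uses only the standard `ctypes` module, so it runs even though `import torch` fails. The library name `libnccl.so.2` and the helper name `library_exports` are assumptions for illustration, and the library found this way may not be the same copy that torch loads.

```python
import ctypes


def library_exports(lib_name, symbol):
    """Check whether a shared library exports a given symbol.

    Returns True or False if the library loads, or None if it cannot be loaded.
    """
    try:
        lib = ctypes.CDLL(lib_name)
    except OSError:
        # Library not found on the loader search path, or failed to load.
        return None
    # ctypes raises AttributeError for a missing symbol, so hasattr works here.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Assumed library name; adjust to the NCCL copy in your environment.
    result = library_exports("libnccl.so.2", "ncclCommRegister")
    if result is None:
        print("libnccl.so.2 could not be loaded from the default search path.")
    elif result:
        print("The NCCL library found exports ncclCommRegister.")
    else:
        print("The NCCL library found does not export ncclCommRegister.")
```

If the library loads but lacks the symbol, the torch build probably expects a different NCCL than the one being picked up, which fits the suggestions to change the torch version or use the prebuilt Docker image.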