punica-ai / punica

Serving multiple LoRA finetuned LLM as one
https://arxiv.org/abs/2310.18547
Apache License 2.0

Error when installing package from source #52

Open Conless opened 1 month ago

Conless commented 1 month ago

My environment is:

PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Arch Linux (x86_64)
GCC version: (conda-forge gcc 12.3.0-7) 12.3.0
Clang version: 17.0.6
CMake version: version 3.29.3
Libc version: glibc-2.39

Python version: 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0] (64-bit runtime)
Python platform: Linux-6.9.1-arch1-1-x86_64-with-glibc2.39
Is CUDA available: True
CUDA runtime version: 12.1.66
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4070 Ti SUPER
Nvidia driver version: 550.78
cuDNN version: Probably one of the following:
/usr/lib/libcudnn.so.8.9.7
/usr/lib/libcudnn_adv_infer.so.8.9.7
/usr/lib/libcudnn_adv_train.so.8.9.7
/usr/lib/libcudnn_cnn_infer.so.8.9.7
/usr/lib/libcudnn_cnn_train.so.8.9.7
/usr/lib/libcudnn_ops_infer.so.8.9.7
/usr/lib/libcudnn_ops_train.so.8.9.7
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] torch==2.3.0
[pip3] triton==2.3.0
[conda] numpy                     1.26.4                   pypi_0    pypi
[conda] nvidia-nccl-cu12          2.20.5                   pypi_0    pypi
[conda] torch                     2.3.0                    pypi_0    pypi
[conda] triton                    2.3.0                    pypi_0    pypi
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: N/A
vLLM Build Flags:
CUDA Archs: 8.0; ROCm: Disabled; Neuron: Disabled
GPU Topology:
GPU0    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      0-23    0               N/A

When installing the package from source, I got:

FAILED: /tmp/tmp3dq5g304.build-temp/csrc/sgmv_flashinfer/sgmv_all.o
~/.conda/envs/punica/bin/nvcc --generate-dependencies-with-compile --dependency-output /tmp/tmp3dq5g304.build-temp/csrc/sgmv_flashinfer/sgmv_all.o.d -I/workspace/punica/third_party/cutlass/include -punica/third_party/flashinfer/include -I/tmp/pip-build-env-2gz00_u5/overlay/lib/python3.11/site-packages/torch/include -I/tmp/pip-build-env-2gz00_u5/overlay/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/tmp/pip-build-env-2gz00_u5/overlay/lib/python3.11/site-packages/torch/include/TH -I/tmp/pip-build-env-2gz00_u5/overlay/lib/python3.11/site-packages/torch/include/THC -I~/.conda/envs/punica/include -I~/.conda/envs/punica/include/python3.11 -c -c /workspace/punica/csrc/sgmv_flashinfer/sgmv_all.cu -o /tmp/tmp3dq5g304.build-temp/csrc/sgmv_flashinfer/sgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=sm_80 -ccbin ~/.conda/envs/punica/bin/gcc -std=c++17
  ~/.conda/envs/punica/include/cuda/std/barrier(144): error: no instance of overloaded "operator new" matches the argument list
              argument types are: (unsigned long, cuda::std::__4::__barrier_base<cuda::std::__4::__empty_completion, 2> *)
           new (&__b->__barrier) __barrier_base(__expected);
           ^

  1 error detected in the compilation of "/workspace/punica/csrc/sgmv_flashinfer/sgmv_all.cu".

What might be causing this? Thank you for your help!
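
The failing header (`cuda/std/barrier`) is picked up from the conda environment's `include` directory, and the compiler is the conda environment's `nvcc`, so one thing that may be worth ruling out is a version mismatch between that toolkit and the CUDA version PyTorch was built with (12.1 in the report above). The following is a minimal diagnostic sketch, not a confirmed fix: the helper names `parse_nvcc_release` and `same_major_minor` are hypothetical, and it assumes `nvcc --version` prints its usual `release X.Y` line.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_output):
    """Return 'major.minor' from `nvcc --version` text, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", version_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def same_major_minor(a, b):
    """Compare two version strings on their first two components only."""
    if a is None or b is None:
        return False
    return a.split(".")[:2] == b.split(".")[:2]


def main():
    # Locate whichever nvcc is first on PATH (the build used the conda env's).
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc not found on PATH")
        return

    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    nvcc_release = parse_nvcc_release(result.stdout)

    # torch.version.cuda is the CUDA version PyTorch was compiled against.
    try:
        import torch
        torch_cuda = torch.version.cuda
    except ImportError:
        torch_cuda = None

    print(f"nvcc path:       {nvcc}")
    print(f"nvcc release:    {nvcc_release}")
    print(f"torch built for: {torch_cuda}")
    if same_major_minor(nvcc_release, torch_cuda):
        print("nvcc and PyTorch agree on the CUDA major.minor version")
    else:
        print("nvcc and PyTorch report different (or unknown) CUDA versions")


if __name__ == "__main__":
    main()
```

If the two versions agree, this check does not explain the error and the cause lies elsewhere, for example in how the conda-provided headers interact with the host compiler. If they differ, aligning the conda CUDA packages with the PyTorch build would be a reasonable next step to try.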