Closed jkl375 closed 7 months ago
Can you provide the command that you ran and the full output?
First, I get a ninja error. I'm using NVIDIA's PyTorch image, nvcr.io/nvidia/pytorch:23.09-py3:
root@ba66182d6ed2:/workspace# pip install -v --no-build-isolation . Using pip 23.2.1 from /usr/local/lib/python3.10/dist-packages/pip (python 3.10) Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com Processing /workspace Running command Preparing metadata (pyproject.toml) running dist_info creating /tmp/pip-modern-metadata-mcwxj_ha/punica.egg-info writing /tmp/pip-modern-metadata-mcwxj_ha/punica.egg-info/PKG-INFO writing dependency_links to /tmp/pip-modern-metadata-mcwxj_ha/punica.egg-info/dependency_links.txt writing requirements to /tmp/pip-modern-metadata-mcwxj_ha/punica.egg-info/requires.txt writing top-level names to /tmp/pip-modern-metadata-mcwxj_ha/punica.egg-info/top_level.txt writing manifest file '/tmp/pip-modern-metadata-mcwxj_ha/punica.egg-info/SOURCES.txt' reading manifest file '/tmp/pip-modern-metadata-mcwxj_ha/punica.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' no previously-included directories found matching 'benchmarks' no previously-included directories found matching '*/__pycache__' warning: no previously-included files matching '*.so' found anywhere in distribution writing manifest file '/tmp/pip-modern-metadata-mcwxj_ha/punica.egg-info/SOURCES.txt' creating '/tmp/pip-modern-metadata-mcwxj_ha/punica-0.0.1.dist-info' Preparing metadata (pyproject.toml) ... 
done Requirement already satisfied: torch in /usr/local/lib/python3.10/dist-packages (from punica==0.0.1) (2.1.0a0+32f93b1) Collecting transformers (from punica==0.0.1) Obtaining dependency information for transformers from https://files.pythonhosted.org/packages/9a/06/e4ec2a321e57c03b7e9345d709d554a52c33760e5015fdff0919d9459af0/transformers-4.35.0-py3-none-any.whl.metadata Downloading transformers-4.35.0-py3-none-any.whl.metadata (123 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 123.1/123.1 kB 669.4 kB/s eta 0:00:00 Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from punica==0.0.1) (1.22.2) Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (3.12.4) Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (4.7.1) Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (1.12) Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (2.6.3) Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (3.1.2) Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (2023.6.0) Collecting huggingface-hub<1.0,>=0.16.4 (from transformers->punica==0.0.1) Obtaining dependency information for huggingface-hub<1.0,>=0.16.4 from https://files.pythonhosted.org/packages/65/cc/2891260847777eb9aaca278aaf3f846c9ff8ea1351643a4f33ff26d5d213/huggingface_hub-0.19.1-py3-none-any.whl.metadata Downloading huggingface_hub-0.19.1-py3-none-any.whl.metadata (13 kB) Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers->punica==0.0.1) (23.1) Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers->punica==0.0.1) (6.0.1) Requirement 
already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers->punica==0.0.1) (2023.8.8) Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers->punica==0.0.1) (2.31.0) Collecting tokenizers<0.15,>=0.14 (from transformers->punica==0.0.1) Obtaining dependency information for tokenizers<0.15,>=0.14 from https://files.pythonhosted.org/packages/a7/7b/c1f643eb086b6c5c33eef0c3752e37624bd23e4cbc9f1332748f1c6252d1/tokenizers-0.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata Downloading tokenizers-0.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB) Collecting safetensors>=0.3.1 (from transformers->punica==0.0.1) Obtaining dependency information for safetensors>=0.3.1 from https://files.pythonhosted.org/packages/20/4e/878b080dbda92666233ec6f316a53969edcb58eab1aa399a64d0521cf953/safetensors-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata Downloading safetensors-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.8 kB) Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers->punica==0.0.1) (4.66.1) Collecting huggingface-hub<1.0,>=0.16.4 (from transformers->punica==0.0.1) Obtaining dependency information for huggingface-hub<1.0,>=0.16.4 from https://files.pythonhosted.org/packages/aa/f3/3fc97336a0e90516901befd4f500f08d691034d387406fdbde85bea827cc/huggingface_hub-0.17.3-py3-none-any.whl.metadata Downloading huggingface_hub-0.17.3-py3-none-any.whl.metadata (13 kB) Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch->punica==0.0.1) (2.1.3) Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->transformers->punica==0.0.1) (3.2.0) Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from 
requests->transformers->punica==0.0.1) (3.4) Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers->punica==0.0.1) (1.26.16) Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers->punica==0.0.1) (2023.7.22) Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch->punica==0.0.1) (1.3.0) Downloading transformers-4.35.0-py3-none-any.whl (7.9 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.9/7.9 MB 16.4 MB/s eta 0:00:00 Downloading safetensors-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 151.6 MB/s eta 0:00:00 Downloading tokenizers-0.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.8/3.8 MB 107.4 MB/s eta 0:00:00 Downloading huggingface_hub-0.17.3-py3-none-any.whl (295 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 295.0/295.0 kB 126.4 MB/s eta 0:00:00 Building wheels for collected packages: punica Running command Building wheel for punica (pyproject.toml) running bdist_wheel running build running build_py running egg_info writing punica.egg-info/PKG-INFO writing dependency_links to punica.egg-info/dependency_links.txt writing requirements to punica.egg-info/requires.txt writing top-level names to punica.egg-info/top_level.txt reading manifest file 'punica.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' no previously-included directories found matching 'benchmarks' no previously-included directories found matching '*/__pycache__' warning: no previously-included files matching '*.so' found anywhere in distribution writing manifest file 'punica.egg-info/SOURCES.txt' running build_ext building 'punica.ops._kernels' extension Emitting ninja build file /workspace/build/temp.linux-x86_64-3.10/build.ninja... Compiling objects... 
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) [1/3] /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/sgmv_flashinfer/sgmv_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/sgmv_flashinfer/sgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 FAILED: /workspace/build/temp.linux-x86_64-3.10/csrc/sgmv_flashinfer/sgmv_all.o /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/sgmv_flashinfer/sgmv_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/sgmv_flashinfer/sgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' 
'-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 In file included from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/sgmv_flashinfer/permuted_smem.cuh:23, from /workspace/csrc/sgmv_flashinfer/sgmv_flashinfer.cuh:6, from /workspace/csrc/sgmv_flashinfer/sgmv_all.cu:8: /usr/local/cuda/include/cuda/std/barrier:15:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 15 | # error "CUDA synchronization primitives are only supported for sm_70 and up." | ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/atomic:727, from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:60, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/sgmv_flashinfer/permuted_smem.cuh:23, from /workspace/csrc/sgmv_flashinfer/sgmv_flashinfer.cuh:6, from /workspace/csrc/sgmv_flashinfer/sgmv_all.cu:8: /usr/local/cuda/include/cuda/std/detail/libcxx/include/support/atomic/atomic_cuda.h:12:4: error: #error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows." 12 | # error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows." 
| ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:487, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/sgmv_flashinfer/permuted_smem.cuh:23, from /workspace/csrc/sgmv_flashinfer/sgmv_flashinfer.cuh:6, from /workspace/csrc/sgmv_flashinfer/sgmv_all.cu:8: /usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/barrier.h:19:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 19 | # error "CUDA synchronization primitives are only supported for sm_70 and up." | ^~~~~ [2/3] /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/bgmv/bgmv_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/bgmv/bgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 FAILED: /workspace/build/temp.linux-x86_64-3.10/csrc/bgmv/bgmv_all.o /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include 
-I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/bgmv/bgmv_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/bgmv/bgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 In file included from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/bgmv/bgmv_impl.cuh:6, from /workspace/csrc/bgmv/bgmv_all.cu:2: /usr/local/cuda/include/cuda/std/barrier:15:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 15 | # error "CUDA synchronization primitives are only supported for sm_70 and up." 
| ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/atomic:727, from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:60, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/bgmv/bgmv_impl.cuh:6, from /workspace/csrc/bgmv/bgmv_all.cu:2: /usr/local/cuda/include/cuda/std/detail/libcxx/include/support/atomic/atomic_cuda.h:12:4: error: #error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows." 12 | # error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows." | ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:487, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/bgmv/bgmv_impl.cuh:6, from /workspace/csrc/bgmv/bgmv_all.cu:2: /usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/barrier.h:19:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 19 | # error "CUDA synchronization primitives are only supported for sm_70 and up." 
| ^~~~~ [3/3] /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/flashinfer_adapter/flashinfer_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/flashinfer_adapter/flashinfer_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 FAILED: /workspace/build/temp.linux-x86_64-3.10/csrc/flashinfer_adapter/flashinfer_all.o /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/flashinfer_adapter/flashinfer_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/flashinfer_adapter/flashinfer_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' 
-DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 In file included from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/flashinfer_adapter/../flashinfer/decode.cuh:24, from /workspace/csrc/flashinfer_adapter/flashinfer_all.cu:5: /usr/local/cuda/include/cuda/std/barrier:15:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 15 | # error "CUDA synchronization primitives are only supported for sm_70 and up." | ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/atomic:727, from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:60, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/flashinfer_adapter/../flashinfer/decode.cuh:24, from /workspace/csrc/flashinfer_adapter/flashinfer_all.cu:5: /usr/local/cuda/include/cuda/std/detail/libcxx/include/support/atomic/atomic_cuda.h:12:4: error: #error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows." 12 | # error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows." 
| ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:487, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/flashinfer_adapter/../flashinfer/decode.cuh:24, from /workspace/csrc/flashinfer_adapter/flashinfer_all.cu:5: /usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/barrier.h:19:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 19 | # error "CUDA synchronization primitives are only supported for sm_70 and up." | ^~~~~ ninja: build stopped: subcommand failed. Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1917, in _run_ninja_build subprocess.run( File "/usr/lib/python3.10/subprocess.py", line 526, in run raise CalledProcessError(retcode, process.args, subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in
× Building wheel for punica (pyproject.toml) did not run successfully. │ exit code: 1 ╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip. full command: /usr/bin/python /usr/local/lib/python3.10/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmp1q3xdwn0 cwd: /workspace Building wheel for punica (pyproject.toml) ... error ERROR: Failed building wheel for punica Failed to build punica ERROR: Could not build wheels for punica, which is required to install pyproject.toml-based projects
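For anyone reading this log later: the nvcc command lines above pass `-gencode` flags starting at `compute_52`, while the `#error` lines say the CUDA synchronization primitives need sm_70 and up. Below is a minimal sketch in Python that picks the too-old targets out of such a command line and trims an architecture list accordingly. The helper names (`archs_below`, `filter_arch_list`) and the sample arch list string are hypothetical and written only to mirror the flags in this log. `TORCH_CUDA_ARCH_LIST` is the environment variable that `torch.utils.cpp_extension` reads, but whether restricting it is enough to make this particular build pass is an assumption, not something this thread confirms.

```python
import re

# Matches flags such as -gencode=arch=compute_52,code=sm_52 and captures "52".
GENCODE_RE = re.compile(r"-gencode=arch=compute_(\d+),code=\w+")


def archs_below(nvcc_command: str, minimum: int = 70) -> list[int]:
    """Return the sorted compute capabilities in the command that are below `minimum`."""
    found = {int(arch) for arch in GENCODE_RE.findall(nvcc_command)}
    return sorted(arch for arch in found if arch < minimum)


def _parse_entry(entry: str) -> tuple[int, int]:
    """Turn an entry like '8.6' or '9.0+PTX' into (major, minor)."""
    major, minor = entry.removesuffix("+PTX").split(".")
    return int(major), int(minor)


def filter_arch_list(arch_list: str, minimum: tuple[int, int] = (7, 0)) -> str:
    """Drop entries below `minimum` from a TORCH_CUDA_ARCH_LIST-style string.

    Assumes purely numeric entries separated by spaces or semicolons;
    named architectures are not handled in this sketch.
    """
    entries = arch_list.replace(";", " ").split()
    return " ".join(e for e in entries if _parse_entry(e) >= minimum)


if __name__ == "__main__":
    flags = (
        "-gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 "
        "-gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 "
        "-gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90"
    )
    print(archs_below(flags))  # -> [52, 60, 61]

    # Hypothetical arch list that mirrors the -gencode flags in the log.
    print(filter_arch_list("5.2 6.0 6.1 7.0 7.2 7.5 8.0 8.6 8.7 9.0+PTX"))
    # -> 7.0 7.2 7.5 8.0 8.6 8.7 9.0+PTX
```

If the assumption holds, exporting the filtered string as `TORCH_CUDA_ARCH_LIST` before rerunning `pip install -v --no-build-isolation .` should keep the pre-sm_70 targets out of the nvcc command line.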
[notice] A new release of pip is available: 23.2.1 -> 23.3.1 [notice] To update, run: python -m pip install --upgrade pip
root@ba66182d6ed2:/workspace# pip install -v --no-build-isolation . Using pip 23.2.1 from /usr/local/lib/python3.10/dist-packages/pip (python 3.10) Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com Processing /workspace Running command Preparing metadata (pyproject.toml) running dist_info creating /tmp/pip-modern-metadata-vhp_fir1/punica.egg-info writing /tmp/pip-modern-metadata-vhp_fir1/punica.egg-info/PKG-INFO writing dependency_links to /tmp/pip-modern-metadata-vhp_fir1/punica.egg-info/dependency_links.txt writing requirements to /tmp/pip-modern-metadata-vhp_fir1/punica.egg-info/requires.txt writing top-level names to /tmp/pip-modern-metadata-vhp_fir1/punica.egg-info/top_level.txt writing manifest file '/tmp/pip-modern-metadata-vhp_fir1/punica.egg-info/SOURCES.txt' reading manifest file '/tmp/pip-modern-metadata-vhp_fir1/punica.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' no previously-included directories found matching 'benchmarks' no previously-included directories found matching '*/__pycache__' warning: no previously-included files matching '*.so' found anywhere in distribution writing manifest file '/tmp/pip-modern-metadata-vhp_fir1/punica.egg-info/SOURCES.txt' creating '/tmp/pip-modern-metadata-vhp_fir1/punica-0.0.1.dist-info' Preparing metadata (pyproject.toml) ... 
done Requirement already satisfied: torch in /usr/local/lib/python3.10/dist-packages (from punica==0.0.1) (2.1.0a0+32f93b1) Collecting transformers (from punica==0.0.1) Obtaining dependency information for transformers from https://files.pythonhosted.org/packages/9a/06/e4ec2a321e57c03b7e9345d709d554a52c33760e5015fdff0919d9459af0/transformers-4.35.0-py3-none-any.whl.metadata Downloading transformers-4.35.0-py3-none-any.whl.metadata (123 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 123.1/123.1 kB 314.5 kB/s eta 0:00:00 Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from punica==0.0.1) (1.22.2) Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (3.12.4) Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (4.7.1) Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (1.12) Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (2.6.3) Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (3.1.2) Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch->punica==0.0.1) (2023.6.0) Collecting huggingface-hub<1.0,>=0.16.4 (from transformers->punica==0.0.1) Obtaining dependency information for huggingface-hub<1.0,>=0.16.4 from https://files.pythonhosted.org/packages/65/cc/2891260847777eb9aaca278aaf3f846c9ff8ea1351643a4f33ff26d5d213/huggingface_hub-0.19.1-py3-none-any.whl.metadata Downloading huggingface_hub-0.19.1-py3-none-any.whl.metadata (13 kB) Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers->punica==0.0.1) (23.1) Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers->punica==0.0.1) (6.0.1) Requirement 
already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers->punica==0.0.1) (2023.8.8) Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers->punica==0.0.1) (2.31.0) Collecting tokenizers<0.15,>=0.14 (from transformers->punica==0.0.1) Obtaining dependency information for tokenizers<0.15,>=0.14 from https://files.pythonhosted.org/packages/a7/7b/c1f643eb086b6c5c33eef0c3752e37624bd23e4cbc9f1332748f1c6252d1/tokenizers-0.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata Downloading tokenizers-0.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB) Collecting safetensors>=0.3.1 (from transformers->punica==0.0.1) Obtaining dependency information for safetensors>=0.3.1 from https://files.pythonhosted.org/packages/20/4e/878b080dbda92666233ec6f316a53969edcb58eab1aa399a64d0521cf953/safetensors-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata Downloading safetensors-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.8 kB) Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers->punica==0.0.1) (4.66.1) Collecting huggingface-hub<1.0,>=0.16.4 (from transformers->punica==0.0.1) Obtaining dependency information for huggingface-hub<1.0,>=0.16.4 from https://files.pythonhosted.org/packages/aa/f3/3fc97336a0e90516901befd4f500f08d691034d387406fdbde85bea827cc/huggingface_hub-0.17.3-py3-none-any.whl.metadata Downloading huggingface_hub-0.17.3-py3-none-any.whl.metadata (13 kB) Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch->punica==0.0.1) (2.1.3) Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->transformers->punica==0.0.1) (3.2.0) Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from 
requests->transformers->punica==0.0.1) (3.4) Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers->punica==0.0.1) (1.26.16) Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers->punica==0.0.1) (2023.7.22) Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch->punica==0.0.1) (1.3.0) Downloading transformers-4.35.0-py3-none-any.whl (7.9 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.9/7.9 MB 7.5 MB/s eta 0:00:00 Downloading safetensors-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 95.0 MB/s eta 0:00:00 Downloading tokenizers-0.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.8/3.8 MB 93.6 MB/s eta 0:00:00 Downloading huggingface_hub-0.17.3-py3-none-any.whl (295 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 295.0/295.0 kB 146.0 MB/s eta 0:00:00 Building wheels for collected packages: punica Running command Building wheel for punica (pyproject.toml) running bdist_wheel running build running build_py running egg_info writing punica.egg-info/PKG-INFO writing dependency_links to punica.egg-info/dependency_links.txt writing requirements to punica.egg-info/requires.txt writing top-level names to punica.egg-info/top_level.txt reading manifest file 'punica.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' no previously-included directories found matching 'benchmarks' no previously-included directories found matching '*/__pycache__' warning: no previously-included files matching '*.so' found anywhere in distribution writing manifest file 'punica.egg-info/SOURCES.txt' running build_ext building 'punica.ops._kernels' extension Emitting ninja build file /workspace/build/temp.linux-x86_64-3.10/build.ninja... Compiling objects... 
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) [1/3] /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/sgmv_flashinfer/sgmv_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/sgmv_flashinfer/sgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 FAILED: /workspace/build/temp.linux-x86_64-3.10/csrc/sgmv_flashinfer/sgmv_all.o /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/sgmv_flashinfer/sgmv_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/sgmv_flashinfer/sgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' 
'-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 In file included from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/sgmv_flashinfer/permuted_smem.cuh:23, from /workspace/csrc/sgmv_flashinfer/sgmv_flashinfer.cuh:6, from /workspace/csrc/sgmv_flashinfer/sgmv_all.cu:8: /usr/local/cuda/include/cuda/std/barrier:15:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 15 | # error "CUDA synchronization primitives are only supported for sm_70 and up." | ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/atomic:727, from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:60, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/sgmv_flashinfer/permuted_smem.cuh:23, from /workspace/csrc/sgmv_flashinfer/sgmv_flashinfer.cuh:6, from /workspace/csrc/sgmv_flashinfer/sgmv_all.cu:8: /usr/local/cuda/include/cuda/std/detail/libcxx/include/support/atomic/atomic_cuda.h:12:4: error: #error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows." 12 | # error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows." 
| ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:487, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/sgmv_flashinfer/permuted_smem.cuh:23, from /workspace/csrc/sgmv_flashinfer/sgmv_flashinfer.cuh:6, from /workspace/csrc/sgmv_flashinfer/sgmv_all.cu:8: /usr/local/cuda/include/cuda/std/detail/libcxx/include/cuda/barrier.h:19:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 19 | # error "CUDA synchronization primitives are only supported for sm_70 and up." | ^~~~~ [2/3] /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/bgmv/bgmv_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/bgmv/bgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 FAILED: /workspace/build/temp.linux-x86_64-3.10/csrc/bgmv/bgmv_all.o /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include 
-I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/bgmv/bgmv_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/bgmv/bgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 In file included from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/bgmv/bgmv_impl.cuh:6, from /workspace/csrc/bgmv/bgmv_all.cu:2: /usr/local/cuda/include/cuda/std/barrier:15:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 15 | # error "CUDA synchronization primitives are only supported for sm_70 and up." 
| ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/atomic:727, from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:60, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/bgmv/bgmv_impl.cuh:6, from /workspace/csrc/bgmv/bgmv_all.cu:2: /usr/local/cuda/include/cuda/std/detail/libcxx/include/support/atomic/atomic_cuda.h:12:4: error: #error "CUDA atomics are only supported for sm_60 and up on nix and sm_70 and up on Windows." 12 | # error "CUDA atomics are only supported for sm_60 and up on nix and sm_70 and up on Windows." | ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:487, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/bgmv/bgmv_impl.cuh:6, from /workspace/csrc/bgmv/bgmv_all.cu:2: /usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/barrier.h:19:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 19 | # error "CUDA synchronization primitives are only supported for sm_70 and up." 
| ^~~~~ [3/3] /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/flashinfer_adapter/flashinfer_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/flashinfer_adapter/flashinfer_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 FAILED: /workspace/build/temp.linux-x86_64-3.10/csrc/flashinfer_adapter/flashinfer_all.o /usr/local/cuda/bin/nvcc -I/workspace/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /workspace/csrc/flashinfer_adapter/flashinfer_all.cu -o /workspace/build/temp.linux-x86_64-3.10/csrc/flashinfer_adapter/flashinfer_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' 
-DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_72,code=sm_72 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_87,code=sm_87 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17 In file included from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/flashinfer_adapter/../flashinfer/decode.cuh:24, from /workspace/csrc/flashinfer_adapter/flashinfer_all.cu:5: /usr/local/cuda/include/cuda/std/barrier:15:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 15 | # error "CUDA synchronization primitives are only supported for sm_70 and up." | ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/atomic:727, from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:60, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/flashinfer_adapter/../flashinfer/decode.cuh:24, from /workspace/csrc/flashinfer_adapter/flashinfer_all.cu:5: /usr/local/cuda/include/cuda/std/detail/libcxx/include/support/atomic/atomic_cuda.h:12:4: error: #error "CUDA atomics are only supported for sm_60 and up on nix and sm_70 and up on Windows." 12 | # error "CUDA atomics are only supported for sm_60 and up on nix and sm_70 and up on Windows." 
| ^~~~~ In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:487, from /usr/local/cuda/include/cuda/std/barrier:22, from /usr/local/cuda/include/cuda/barrier:14, from /usr/local/cuda/include/cuda/pipeline:56, from /workspace/csrc/flashinfer_adapter/../flashinfer/decode.cuh:24, from /workspace/csrc/flashinfer_adapter/flashinfer_all.cu:5: /usr/local/cuda/include/cuda/std/detail/libcxx/include/cuda/barrier.h:19:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up." 19 | # error "CUDA synchronization primitives are only supported for sm_70 and up." | ^~~~~ ninja: build stopped: subcommand failed. Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1917, in _run_ninja_build subprocess.run( File "/usr/lib/python3.10/subprocess.py", line 526, in run raise CalledProcessError(retcode, process.args, subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in
× Building wheel for punica (pyproject.toml) did not run successfully. │ exit code: 1 ╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip. full command: /usr/bin/python /usr/local/lib/python3.10/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmpr95z1doq cwd: /workspace Building wheel for punica (pyproject.toml) ... error ERROR: Failed building wheel for punica Failed to build punica ERROR: Could not build wheels for punica, which is required to install pyproject.toml-based projects
Later, I changed command=['ninja', '-v'] to command=['ninja', '--version'] in PyTorch's utils/cpp_extension.py, but it still does not work.
Can you provide information about your GPU? It seems the CUDA architecture is too old (< sm_70).
From your log:
[1/3] /usr/local/cuda/bin/nvcc \
-I/workspace/third_party/cutlass/include \
-I/usr/local/lib/python3.10/dist-packages/torch/include \
-I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include \
-I/usr/local/lib/python3.10/dist-packages/torch/include/TH \
-I/usr/local/lib/python3.10/dist-packages/torch/include/THC \
-I/usr/local/cuda/include \
-I/usr/include/python3.10 \
-c \
-c /workspace/csrc/sgmv_flashinfer/sgmv_all.cu \
-o /workspace/build/temp.linux-x86_64-3.10/csrc/sgmv_flashinfer/sgmv_all.o \
--expt-relaxed-constexpr \
--compiler-options ''"'"'-fPIC'"'"'' \
-DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' \
-DTORCH_EXTENSION_NAME=_kernels \
-D_GLIBCXX_USE_CXX11_ABI=1 \
-gencode=arch=compute_52,code=sm_52 \
-gencode=arch=compute_60,code=sm_60 \
-gencode=arch=compute_61,code=sm_61 \
-gencode=arch=compute_70,code=sm_70 \
-gencode=arch=compute_72,code=sm_72 \
-gencode=arch=compute_75,code=sm_75 \
-gencode=arch=compute_80,code=sm_80 \
-gencode=arch=compute_86,code=sm_86 \
-gencode=arch=compute_87,code=sm_87 \
-gencode=arch=compute_90,code=compute_90 \
-gencode=arch=compute_90,code=sm_90 \
-std=c++17
It looks like your container is configured to compile for every architecture by default. Can you try overriding that with:
env TORCH_CUDA_ARCH_LIST="8.0" pip install -v --no-build-isolation .
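For anyone hitting the same errors: the headers in the log refuse any target below sm_70, while the nvcc command above also requests sm_52, sm_60 and sm_61. Below is a minimal sketch, assuming you have the nvcc command line as a string, that lists the requested targets below a given minimum. The helper name `archs_below` is hypothetical and not part of punica or PyTorch.

```python
import re
from typing import List


def archs_below(nvcc_command: str, minimum: int = 70) -> List[int]:
    """Return the sorted compute capabilities requested via -gencode
    flags that are lower than `minimum` (e.g. 70 for sm_70)."""
    # Match flags such as: -gencode=arch=compute_52,code=sm_52
    found = {int(m) for m in re.findall(r"-gencode=arch=compute_(\d+)", nvcc_command)}
    return sorted(arch for arch in found if arch < minimum)


# Shortened copy of the -gencode flags from the log in this thread.
flags = (
    "-gencode=arch=compute_52,code=sm_52 "
    "-gencode=arch=compute_60,code=sm_60 "
    "-gencode=arch=compute_61,code=sm_61 "
    "-gencode=arch=compute_70,code=sm_70 "
    "-gencode=arch=compute_80,code=sm_80 "
    "-gencode=arch=compute_90,code=sm_90"
)

print(archs_below(flags))  # -> [52, 60, 61]
```

Any non-empty result means the build targets at least one architecture that the included CUDA headers reject, which is why restricting TORCH_CUDA_ARCH_LIST to the GPU you actually have avoids the error.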
> Later, I changed command=['ninja', '-v'] to command=['ninja', '--version'] in PyTorch's utils/cpp_extension.py, but it still does not work.

Please revert that change.
It works! Thank you very much.
> Can you provide information about your GPU? Seems the CUDA architecture is too old (< sm_70).

My GPU is an A100; it seems the TORCH_CUDA_ARCH_LIST environment variable was not set correctly in the container. The problem is resolved now. Thank you very much!
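If you are unsure which value to pass, one option is to ask PyTorch for the compute capability of the local GPU and use that. This is a rough sketch, not something the punica build does itself: the function names `arch_list_value` and `detect_arch_list` are made up for illustration, and it assumes the GPU you build on is the one you will run on.

```python
from typing import Optional, Tuple


def arch_list_value(capability: Tuple[int, int]) -> str:
    """Format a (major, minor) compute capability as a
    TORCH_CUDA_ARCH_LIST entry, e.g. (8, 0) -> "8.0"."""
    major, minor = capability
    return f"{major}.{minor}"


def detect_arch_list() -> Optional[str]:
    """Return the entry for GPU 0, or None if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # get_device_capability returns a (major, minor) tuple for the device.
    return arch_list_value(torch.cuda.get_device_capability(0))


value = detect_arch_list()
if value is None:
    print("No CUDA device visible; set TORCH_CUDA_ARCH_LIST by hand.")
else:
    print(f'env TORCH_CUDA_ARCH_LIST="{value}" pip install -v --no-build-isolation .')
```

On an A100 the capability is (8, 0), which matches the "8.0" suggested earlier in the thread.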
Glad that it works!
When I installed the package, the following error occurred:

![image](https://github.com/punica-ai/punica/assets/54523287/8bb0e301-d563-41f6-9e33-2f531aebaa63)