punica-ai / punica

Serving multiple LoRA finetuned LLM as one
https://arxiv.org/abs/2310.18547
Apache License 2.0
883 stars 40 forks

pip install failed with cuda 12.2 #46

Closed (hayleyhu closed this issue 3 months ago)

hayleyhu commented 3 months ago

I am following the instructions for BUILD FROM SOURCE. My environment:

gcc --version
9.4.0

/usr/local/cuda/bin/nvcc --version
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0

ninja --version
1.11.1.git.kitware.jobserver-1

and

import torch
print(torch.__version__)
2.2.1+cu121
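The comparison between the system CUDA toolkit and the CUDA version PyTorch was built against can also be scripted. This is a minimal sketch, assuming the version strings have the shapes shown in this report (`release X.Y` in the nvcc output and a `+cuXYZ` suffix in the PyTorch version, with a single-digit minor version). The helper names `parse_nvcc_release`, `parse_torch_cuda`, and `compare_cuda` are hypothetical and are not part of punica or PyTorch.

```python
import re
from typing import Optional, Tuple


def parse_nvcc_release(nvcc_output: str) -> Tuple[int, int]:
    """Return (major, minor) from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def parse_torch_cuda(torch_version: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from a suffix such as '+cu121', or None if absent.

    Assumption: the last digit of the suffix is the minor version,
    so 'cu121' is read as 12.1 and 'cu118' as 11.8.
    """
    match = re.search(r"\+cu(\d+)(\d)$", torch_version)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def compare_cuda(nvcc_output: str, torch_version: str) -> str:
    """Classify the relationship between the toolkit and the PyTorch build."""
    toolkit = parse_nvcc_release(nvcc_output)
    built = parse_torch_cuda(torch_version)
    if built is None:
        return "unknown"
    if toolkit == built:
        return "match"
    if toolkit[0] == built[0]:
        return "minor-mismatch"
    return "major-mismatch"


# Values copied from this report.
NVCC_OUTPUT = "Cuda compilation tools, release 12.2, V12.2.140"
TORCH_VERSION = "2.2.1+cu121"
print(compare_cuda(NVCC_OUTPUT, TORCH_VERSION))  # minor-mismatch
```

With the values from this report the script prints `minor-mismatch`: the toolkit is 12.2 while PyTorch was built against 12.1.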

~/punica master ❯ pip install -v --no-build-isolation .

Using pip 23.0.1 from /home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/pip (python 3.10)
Processing /home/hayley/punica
  Running command Preparing metadata (pyproject.toml)
  /home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/config/pyprojecttoml.py:108: _BetaConfiguration: Support for `[tool.setuptools]` in `pyproject.toml` is still *beta*.
    warnings.warn(msg, _BetaConfiguration)
  running dist_info
  creating /tmp/pip-modern-metadata-_jux6hqk/punica.egg-info
  writing /tmp/pip-modern-metadata-_jux6hqk/punica.egg-info/PKG-INFO
  writing dependency_links to /tmp/pip-modern-metadata-_jux6hqk/punica.egg-info/dependency_links.txt
  writing requirements to /tmp/pip-modern-metadata-_jux6hqk/punica.egg-info/requires.txt
  writing top-level names to /tmp/pip-modern-metadata-_jux6hqk/punica.egg-info/top_level.txt
  writing manifest file '/tmp/pip-modern-metadata-_jux6hqk/punica.egg-info/SOURCES.txt'
  reading manifest file '/tmp/pip-modern-metadata-_jux6hqk/punica.egg-info/SOURCES.txt'
  reading manifest template 'MANIFEST.in'
  no previously-included directories found matching 'benchmarks'
  no previously-included directories found matching '*/__pycache__'
  warning: no previously-included files matching '*.so' found anywhere in distribution
  adding license file 'LICENSE'
  writing manifest file '/tmp/pip-modern-metadata-_jux6hqk/punica.egg-info/SOURCES.txt'
  creating '/tmp/pip-modern-metadata-_jux6hqk/punica-1.1.0+c119.d20240308.591b598.dist-info'
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: torch in ./.venvs/venv310/lib/python3.10/site-packages (from punica==1.1.0+c119.d20240308.591b598) (2.2.1)
Collecting transformers
  Downloading transformers-4.38.2-py3-none-any.whl (8.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.5/8.5 MB 70.5 MB/s eta 0:00:00
Requirement already satisfied: numpy in ./.venvs/venv310/lib/python3.10/site-packages (from punica==1.1.0+c119.d20240308.591b598) (1.26.4)
Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (12.1.3.1)
Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (12.1.0.106)
Requirement already satisfied: networkx in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (3.2.1)
Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (12.1.105)
Requirement already satisfied: jinja2 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (3.1.3)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (12.1.105)
Requirement already satisfied: filelock in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (3.13.1)
Requirement already satisfied: sympy in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (1.12)
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (12.1.105)
Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (8.9.2.26)
Requirement already satisfied: triton==2.2.0 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (2.2.0)
Requirement already satisfied: nvidia-nccl-cu12==2.19.3 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (2.19.3)
Requirement already satisfied: typing-extensions>=4.8.0 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (4.10.0)
Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (10.3.2.106)
Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (11.0.2.54)
Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (11.4.5.107)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (12.1.105)
Requirement already satisfied: fsspec in ./.venvs/venv310/lib/python3.10/site-packages (from torch->punica==1.1.0+c119.d20240308.591b598) (2024.2.0)
Requirement already satisfied: nvidia-nvjitlink-cu12 in ./.venvs/venv310/lib/python3.10/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch->punica==1.1.0+c119.d20240308.591b598) (12.4.99)
Collecting tokenizers<0.19,>=0.14
  Downloading tokenizers-0.15.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.6/3.6 MB 46.7 MB/s eta 0:00:00
Collecting tqdm>=4.27
  Using cached tqdm-4.66.2-py3-none-any.whl (78 kB)
Collecting requests
  Using cached requests-2.31.0-py3-none-any.whl (62 kB)
Collecting safetensors>=0.4.1
  Downloading safetensors-0.4.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 26.3 MB/s eta 0:00:00
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (705 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 705.5/705.5 kB 16.2 MB/s eta 0:00:00
Collecting regex!=2019.12.17
  Downloading regex-2023.12.25-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (773 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 774.0/774.0 kB 18.2 MB/s eta 0:00:00
Collecting huggingface-hub<1.0,>=0.19.3
  Downloading huggingface_hub-0.21.4-py3-none-any.whl (346 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 346.4/346.4 kB 11.4 MB/s eta 0:00:00
Collecting packaging>=20.0
  Using cached packaging-23.2-py3-none-any.whl (53 kB)
Requirement already satisfied: MarkupSafe>=2.0 in ./.venvs/venv310/lib/python3.10/site-packages (from jinja2->torch->punica==1.1.0+c119.d20240308.591b598) (2.1.5)
Collecting certifi>=2017.4.17
  Using cached certifi-2024.2.2-py3-none-any.whl (163 kB)
Collecting charset-normalizer<4,>=2
  Downloading charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (142 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 142.1/142.1 kB 4.4 MB/s eta 0:00:00
Collecting urllib3<3,>=1.21.1
  Using cached urllib3-2.2.1-py3-none-any.whl (121 kB)
Collecting idna<4,>=2.5
  Using cached idna-3.6-py3-none-any.whl (61 kB)
Requirement already satisfied: mpmath>=0.19 in ./.venvs/venv310/lib/python3.10/site-packages (from sympy->torch->punica==1.1.0+c119.d20240308.591b598) (1.3.0)
Building wheels for collected packages: punica
  Running command Building wheel for punica (pyproject.toml)
  /home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/config/pyprojecttoml.py:108: _BetaConfiguration: Support for `[tool.setuptools]` in `pyproject.toml` is still *beta*.
    warnings.warn(msg, _BetaConfiguration)
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-cpython-310
  creating build/lib.linux-x86_64-cpython-310/punica
  copying src/punica/__init__.py -> build/lib.linux-x86_64-cpython-310/punica
  copying src/punica/_build_meta.py -> build/lib.linux-x86_64-cpython-310/punica
  creating build/lib.linux-x86_64-cpython-310/punica/models
  copying src/punica/models/__init__.py -> build/lib.linux-x86_64-cpython-310/punica/models
  copying src/punica/models/llama.py -> build/lib.linux-x86_64-cpython-310/punica/models
  copying src/punica/models/llama_lora.py -> build/lib.linux-x86_64-cpython-310/punica/models
  creating build/lib.linux-x86_64-cpython-310/punica/ops
  copying src/punica/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/punica/ops
  creating build/lib.linux-x86_64-cpython-310/punica/utils
  copying src/punica/utils/__init__.py -> build/lib.linux-x86_64-cpython-310/punica/utils
  copying src/punica/utils/cat_tensor.py -> build/lib.linux-x86_64-cpython-310/punica/utils
  copying src/punica/utils/convert_lora_weight.py -> build/lib.linux-x86_64-cpython-310/punica/utils
  copying src/punica/utils/kvcache.py -> build/lib.linux-x86_64-cpython-310/punica/utils
  copying src/punica/utils/lora.py -> build/lib.linux-x86_64-cpython-310/punica/utils
  running egg_info
  creating src/punica.egg-info
  writing src/punica.egg-info/PKG-INFO
  writing dependency_links to src/punica.egg-info/dependency_links.txt
  writing requirements to src/punica.egg-info/requires.txt
  writing top-level names to src/punica.egg-info/top_level.txt
  writing manifest file 'src/punica.egg-info/SOURCES.txt'
  reading manifest file 'src/punica.egg-info/SOURCES.txt'
  reading manifest template 'MANIFEST.in'
  no previously-included directories found matching 'benchmarks'
  no previously-included directories found matching '*/__pycache__'
  warning: no previously-included files matching '*.so' found anywhere in distribution
  adding license file 'LICENSE'
  writing manifest file 'src/punica.egg-info/SOURCES.txt'
  running build_ext
  /home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/utils/cpp_extension.py:415: UserWarning: The detected CUDA version (12.2) has a minor version mismatch with the version that was used to compile PyTorch (12.1). Most likely this shouldn't be a problem.
    warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
  /home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/utils/cpp_extension.py:425: UserWarning: There are no x86_64-linux-gnu-g++ version bounds defined for CUDA version 12.2
    warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
  building 'punica.ops._kernels' extension
  creating /home/hayley/punica/build/temp.linux-x86_64-cpython-310
  creating /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc
  creating /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/bgmv
  creating /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter
  creating /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated
  creating /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/rms_norm
  creating /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/sgmv
  creating /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/sgmv_flashinfer
  Emitting ninja build file /home/hayley/punica/build/temp.linux-x86_64-cpython-310/build.ninja...
  Compiling objects...
  Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
  [1/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/rms_norm/rms_norm_cutlass.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/rms_norm/rms_norm_cutlass.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/rms_norm/rms_norm_cutlass.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [2/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/flashinfer_all.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/flashinfer_all.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/flashinfer_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [3/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g1_h128_fp16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_decode_p16_g1_h128_fp16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g1_h128_fp16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [4/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g1_h128_bf16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_decode_p16_g1_h128_bf16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g1_h128_bf16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [5/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g2_h128_fp16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_decode_p16_g2_h128_fp16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g2_h128_fp16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [6/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g2_h128_bf16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_decode_p16_g2_h128_bf16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g2_h128_bf16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [7/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g4_h128_bf16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_decode_p16_g4_h128_bf16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g4_h128_bf16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [8/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g4_h128_fp16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_decode_p16_g4_h128_fp16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g4_h128_fp16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [9/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g8_h128_bf16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_decode_p16_g8_h128_bf16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g8_h128_bf16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [10/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g8_h128_fp16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_decode_p16_g8_h128_fp16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_decode_p16_g8_h128_fp16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [11/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g4_h128_fp16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_prefill_p16_g4_h128_fp16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g4_h128_fp16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [12/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g8_h128_fp16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_prefill_p16_g8_h128_fp16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g8_h128_fp16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [13/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g4_h128_bf16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_prefill_p16_g4_h128_bf16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g4_h128_bf16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [14/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g8_h128_bf16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_prefill_p16_g8_h128_bf16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g8_h128_bf16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [15/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g2_h128_bf16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_prefill_p16_g2_h128_bf16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g2_h128_bf16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [16/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g1_h128_fp16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_prefill_p16_g1_h128_fp16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g1_h128_fp16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [17/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g1_h128_bf16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_prefill_p16_g1_h128_bf16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g1_h128_bf16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [18/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g2_h128_fp16.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/flashinfer_adapter/generated/batch_prefill_p16_g2_h128_fp16.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter/generated/batch_prefill_p16_g2_h128_fp16.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [19/22] c++ -MMD -MF /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/punica_ops.o.d -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/punica_ops.cc -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/punica_ops.o -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17
  FAILED: /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/punica_ops.o
  c++ -MMD -MF /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/punica_ops.o.d -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/punica_ops.cc -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/punica_ops.o -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17
  In file included from /home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/Device.h:4,
                   from /home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
                   from /home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/extension.h:9,
                   from /home/hayley/punica/csrc/punica_ops.cc:4:
  /home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/python_headers.h:12:10: fatal error: Python.h: No such file or directory
     12 | #include <Python.h>
        |          ^~~~~~~~~~
  compilation terminated.
  [20/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/sgmv_flashinfer/sgmv_all.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/sgmv_flashinfer/sgmv_all.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/sgmv_flashinfer/sgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [21/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/sgmv/sgmv_cutlass.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/sgmv/sgmv_cutlass.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/sgmv/sgmv_cutlass.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  [22/22] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/bgmv/bgmv_all.o.d -I/home/hayley/punica/third_party/cutlass/include -I/home/hayley/punica/third_party/flashinfer/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/TH -I/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/hayley/punica/.venvs/venv310/include -I/usr/include/python3.10 -c -c /home/hayley/punica/csrc/bgmv/bgmv_all.cu -o /home/hayley/punica/build/temp.linux-x86_64-cpython-310/csrc/bgmv/bgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -std=c++17
  ninja: build stopped: subcommand failed.
  Traceback (most recent call last):
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2096, in _run_ninja_build
      subprocess.run(
    File "/usr/lib/python3.10/subprocess.py", line 526, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
      main()
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
      return _build_backend().build_wheel(wheel_directory, config_settings,
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/build_meta.py", line 412, in build_wheel
      return self._build_with_temp_dir(['bdist_wheel'], '.whl',
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/build_meta.py", line 397, in _build_with_temp_dir
      self.run_setup()
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/build_meta.py", line 335, in run_setup
      exec(code, locals())
    File "<string>", line 158, in <module>
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/__init__.py", line 87, in setup
      return distutils.core.setup(**attrs)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
      return run_commands(dist)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
      dist.run_commands()
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 968, in run_commands
      self.run_command(cmd)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
      super().run_command(command)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
      cmd_obj.run()
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 368, in run
      self.run_command("build")
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
      self.distribution.run_command(command)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
      super().run_command(command)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
      cmd_obj.run()
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run
      self.run_command(cmd_name)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
      self.distribution.run_command(command)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
      super().run_command(command)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
      cmd_obj.run()
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 84, in run
      _build_ext.run(self)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
      self.build_extensions()
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 871, in build_extensions
      build_ext.build_extensions(self)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 466, in build_extensions
      self._build_extensions_serial()
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 492, in _build_extensions_serial
      self.build_extension(ext)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
      _build_ext.build_extension(self, ext)
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 547, in build_extension
      objects = self.compiler.compile(
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 684, in unix_wrap_ninja_compile
      _write_ninja_file_and_compile_objects(
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
      _run_ninja_build(
    File "/home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2112, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  error: subprocess-exited-with-error

  × Building wheel for punica (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /home/hayley/punica/.venvs/venv310/bin/python3.10 /home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmpjlrn8pkd
  cwd: /home/hayley/punica
  Building wheel for punica (pyproject.toml) ... error
  ERROR: Failed building wheel for punica
Failed to build punica
ERROR: Could not build wheels for punica, which is required to install pyproject.toml-based projects

[notice] A new release of pip is available: 23.0.1 -> 24.0
[notice] To update, run: pip install --upgrade pip
hayleyhu commented 3 months ago

Running sudo apt-get install python3.10-dev resolved it.

abcdabcd987 commented 3 months ago
  /home/hayley/punica/.venvs/venv310/lib/python3.10/site-packages/torch/include/torch/csrc/python_headers.h:12:10: fatal error: Python.h: No such file or directory
     12 | #include <Python.h>
        |          ^~~~~~~~~~
  compilation terminated.

It looks like the Python headers are missing. If you are using the system Python, install python3-dev on Ubuntu, or the equivalent package on other distros. That said, here's my recommendation:

  1. Install Miniforge.
  2. Create a new environment with Python 3.10: mamba create -n py310 python=3.10
  3. Switch to the new environment: mamba activate py310
  4. Follow the punica installation guide.
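
Before rebuilding, a quick way to confirm whether the headers are present is to ask the interpreter where it expects Python.h and check that the file exists. This is only a minimal sketch: the helper names python_header_path and has_python_headers are made up for illustration, and it checks only the interpreter's own include directory, while the compiler also searches the other -I paths shown in the log.

```python
import sysconfig
from pathlib import Path


def python_header_path() -> Path:
    """Return the location where this interpreter expects Python.h.

    Hypothetical helper for illustration; not part of punica or PyTorch.
    """
    return Path(sysconfig.get_paths()["include"]) / "Python.h"


def has_python_headers() -> bool:
    """Return True if Python.h exists in the interpreter's include directory."""
    return python_header_path().is_file()


if __name__ == "__main__":
    header = python_header_path()
    status = "found" if has_python_headers() else "MISSING"
    print(f"{header}: {status}")
```

Run it with the same interpreter that runs pip install. If it reports MISSING, install the development headers (or switch to an environment that ships them) before retrying the build.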
abcdabcd987 commented 3 months ago

Running sudo apt-get install python3.10-dev resolved it.

Great! Glad that you worked it out!