vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

ERROR: Fail to install in editable mode. "UserWarning: There are no .../x86_64-conda-linux-gnu-c++ version bounds defined for CUDA version 12.1" #2771

Closed: KartikYZ closed this 1 month ago

KartikYZ commented 8 months ago

I am unable to install vllm in editable mode with `pip install -e .` (run from the repository root). Please advise. The error log and environment details are attached below.
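In the log, nvcc passes host compilation to the conda compiler through `-ccbin`, so the versions of both tools are worth reporting alongside a failure like this one. The following is a minimal sketch of such a report, not a diagnosis. The helper names `parse_version` and `tool_version` are hypothetical, and the fallback to `gcc`/`g++` when `CC`/`CXX` are unset is an assumption.

```python
import os
import re
import shutil
import subprocess


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def tool_version(executable):
    """Run `<executable> --version` and parse its output; None if the tool is not found."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return parse_version(result.stdout + result.stderr)


if __name__ == "__main__":
    # CC/CXX are what a conda environment typically exports; falling back to
    # gcc/g++ is an assumption for systems where they are unset.
    for name in ("nvcc", os.environ.get("CC", "gcc"), os.environ.get("CXX", "g++")):
        print(f"{name}: {tool_version(name)}")
```

Running the script inside the activated environment prints one line per tool, with `None` for any tool that is not on `PATH`. The results can then be compared against the host compiler versions that the installed CUDA toolkit documents as supported.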

Here's the error log:

Building wheels for collected packages: vllm
  Building editable for vllm (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building editable for vllm (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [330 lines of output]
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
        device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
      running editable_wheel
      creating /scratch/270889/pip-wheel-t8xj16th/.tmp-47xs7i65/vllm.egg-info
      writing /scratch/270889/pip-wheel-t8xj16th/.tmp-47xs7i65/vllm.egg-info/PKG-INFO
      writing dependency_links to /scratch/270889/pip-wheel-t8xj16th/.tmp-47xs7i65/vllm.egg-info/dependency_links.txt
      writing requirements to /scratch/270889/pip-wheel-t8xj16th/.tmp-47xs7i65/vllm.egg-info/requires.txt
      writing top-level names to /scratch/270889/pip-wheel-t8xj16th/.tmp-47xs7i65/vllm.egg-info/top_level.txt
      writing manifest file '/scratch/270889/pip-wheel-t8xj16th/.tmp-47xs7i65/vllm.egg-info/SOURCES.txt'
      reading manifest file '/scratch/270889/pip-wheel-t8xj16th/.tmp-47xs7i65/vllm.egg-info/SOURCES.txt'
      reading manifest template 'MANIFEST.in'
      adding license file 'LICENSE'
      writing manifest file '/scratch/270889/pip-wheel-t8xj16th/.tmp-47xs7i65/vllm.egg-info/SOURCES.txt'
      creating '/scratch/270889/pip-wheel-t8xj16th/.tmp-47xs7i65/vllm-0.2.7.dist-info'
      creating /scratch/270889/pip-wheel-t8xj16th/.tmp-47xs7i65/vllm-0.2.7.dist-info/WHEEL
      running build_py
      running build_ext
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/utils/cpp_extension.py:424: UserWarning: There are no /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-c++ version bounds defined for CUDA version 12.1
        warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
      building 'vllm._C' extension
      creating /scratch/270889/tmpztz3hxtz.build-temp/csrc
      creating /scratch/270889/tmpztz3hxtz.build-temp/csrc/attention
      creating /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization
      creating /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/awq
      creating /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/gptq
      creating /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/squeezellm
      Emitting ninja build file /scratch/270889/tmpztz3hxtz.build-temp/build.ninja...
      Compiling objects...
      Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
      [1/10] /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/cuda_utils_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/cuda_utils_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      [2/10] /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-c++ -MMD -MF /scratch/270889/tmpztz3hxtz.build-temp/csrc/pybind.o.d -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include -fPIC -O2 -isystem /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include -fPIC -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/pybind.cpp -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/pybind.o -g -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
      [3/10] /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/quantization/squeezellm/quant_cuda_kernel.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/squeezellm/quant_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      FAILED: /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/squeezellm/quant_cuda_kernel.o
      /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/quantization/squeezellm/quant_cuda_kernel.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/squeezellm/quant_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h: In function 'typename pybind11::detail::type_caster<typename pybind11::detail::intrinsic_type<T>::type>::cast_op_type<T> pybind11::detail::cast_op(make_caster<T>&)':
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected template-name before '<' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                        ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected identifier before '<' token
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:123: error: expected primary-expression before '>' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                           ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:126: error: expected primary-expression before ')' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                              ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/ATen/cuda/CUDATensorMethods.cuh: In member function 'T* at::Tensor::data() const [with T = __half]':
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/ATen/cuda/CUDATensorMethods.cuh:13:59: warning: 'T* at::Tensor::data() const [with T = c10::Half]' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
         13 |   return reinterpret_cast<__half*>(data<Half>());
            |                                    ~~~~~~~~~~~~~~         ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/ATen/core/TensorBody.h:247:1: note: declared here
        247 |   T * data() const {
            | ^ ~~
      /storage/ice1/6/5/ksinha45/test2/prowl/csrc/quantization/squeezellm/quant_cuda_kernel.cu: In function 'void squeezellm_gemm(at::Tensor, at::Tensor, at::Tensor, at::Tensor)':
      /storage/ice1/6/5/ksinha45/test2/prowl/csrc/quantization/squeezellm/quant_cuda_kernel.cu:206:136: warning: 'T* at::Tensor::data() const [with T = c10::Half]' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
        206 |   vllm::squeezellm::NUQ4MatMulKernel<<<blocks, threads, 0, stream>>>(
            |                                                                                                                                        ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/ATen/core/TensorBody.h:247:1: note: declared here
        247 |   T * data() const {
            | ^ ~~
      /storage/ice1/6/5/ksinha45/test2/prowl/csrc/quantization/squeezellm/quant_cuda_kernel.cu:206:193: warning: 'T* at::Tensor::data() const [with T = c10::Half]' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
        206 |   vllm::squeezellm::NUQ4MatMulKernel<<<blocks, threads, 0, stream>>>(
            |                                                                                                                                                                                                 ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/ATen/core/TensorBody.h:247:1: note: declared here
        247 |   T * data() const {
            | ^ ~~
      /storage/ice1/6/5/ksinha45/test2/prowl/csrc/quantization/squeezellm/quant_cuda_kernel.cu:206:237: warning: 'T* at::Tensor::data() const [with T = c10::Half]' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
        206 |   vllm::squeezellm::NUQ4MatMulKernel<<<blocks, threads, 0, stream>>>(
            |                                                                                                                                                                                                                                             ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/ATen/core/TensorBody.h:247:1: note: declared here
        247 |   T * data() const {
            | ^ ~~
      [4/10] /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/quantization/awq/gemm_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/awq/gemm_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      FAILED: /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/awq/gemm_kernels.o
      /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/quantization/awq/gemm_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/awq/gemm_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      /storage/ice1/6/5/ksinha45/test2/prowl/csrc/quantization/awq/gemm_kernels.cu(24): warning #177-D: function "vllm::awq::__pack_half2" was declared but never referenced
        __pack_half2(const half x, const half y) {
        ^

      Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h: In function 'typename pybind11::detail::type_caster<typename pybind11::detail::intrinsic_type<T>::type>::cast_op_type<T> pybind11::detail::cast_op(make_caster<T>&)':
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected template-name before '<' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                        ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected identifier before '<' token
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:123: error: expected primary-expression before '>' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                           ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:126: error: expected primary-expression before ')' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                              ^
      [5/10] /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/activation_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/activation_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      FAILED: /scratch/270889/tmpztz3hxtz.build-temp/csrc/activation_kernels.o
      /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/activation_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/activation_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h: In function 'typename pybind11::detail::type_caster<typename pybind11::detail::intrinsic_type<T>::type>::cast_op_type<T> pybind11::detail::cast_op(make_caster<T>&)':
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected template-name before '<' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                        ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected identifier before '<' token
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:123: error: expected primary-expression before '>' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                           ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:126: error: expected primary-expression before ')' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                              ^
      [6/10] /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/pos_encoding_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/pos_encoding_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      FAILED: /scratch/270889/tmpztz3hxtz.build-temp/csrc/pos_encoding_kernels.o
      /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/pos_encoding_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/pos_encoding_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h: In function 'typename pybind11::detail::type_caster<typename pybind11::detail::intrinsic_type<T>::type>::cast_op_type<T> pybind11::detail::cast_op(make_caster<T>&)':
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected template-name before '<' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                        ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected identifier before '<' token
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:123: error: expected primary-expression before '>' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                           ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:126: error: expected primary-expression before ')' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                              ^
      [7/10] /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/layernorm_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/layernorm_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      FAILED: /scratch/270889/tmpztz3hxtz.build-temp/csrc/layernorm_kernels.o
      /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/layernorm_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/layernorm_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h: In function 'typename pybind11::detail::type_caster<typename pybind11::detail::intrinsic_type<T>::type>::cast_op_type<T> pybind11::detail::cast_op(make_caster<T>&)':
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected template-name before '<' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                        ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected identifier before '<' token
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:123: error: expected primary-expression before '>' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                           ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:126: error: expected primary-expression before ')' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                              ^
      [8/10] /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/cache_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/cache_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      FAILED: /scratch/270889/tmpztz3hxtz.build-temp/csrc/cache_kernels.o
      /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/cache_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/cache_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      /storage/ice1/6/5/ksinha45/test2/prowl/csrc/cache_kernels.cu(308): warning #550-D: variable "src_key_indices" was set but never used
                int src_key_indices[unroll_factor];
                    ^

      Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

      /storage/ice1/6/5/ksinha45/test2/prowl/csrc/cache_kernels.cu(309): warning #550-D: variable "src_value_indices" was set but never used
                int src_value_indices[unroll_factor];
                    ^

      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h: In function 'typename pybind11::detail::type_caster<typename pybind11::detail::intrinsic_type<T>::type>::cast_op_type<T> pybind11::detail::cast_op(make_caster<T>&)':
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected template-name before '<' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                        ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected identifier before '<' token
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:123: error: expected primary-expression before '>' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                           ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:126: error: expected primary-expression before ')' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                              ^
      [9/10] /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/quantization/gptq/q_gemm.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/gptq/q_gemm.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      FAILED: /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/gptq/q_gemm.o
      /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/quantization/gptq/q_gemm.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/quantization/gptq/q_gemm.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -gencode arch=compute_70,code=sm_70 --threads 8 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/x86_64-conda-linux-gnu-cc
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h: In function 'typename pybind11::detail::type_caster<typename pybind11::detail::intrinsic_type<T>::type>::cast_op_type<T> pybind11::detail::cast_op(make_caster<T>&)':
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected template-name before '<' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                        ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected identifier before '<' token
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:123: error: expected primary-expression before '>' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                           ^
      /scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/pybind11/detail/../cast.h:45:126: error: expected primary-expression before ')' token
         45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
            |                                                                                                                              ^
      [10/10] /usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/bin/nvcc  -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/TH -I/scratch/270889/pip-build-env-uduxn033/overlay/lib/python3.9/site-packages/torch/include/THC -I/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-4.8.5/cuda-12.1.1-6oacj6llkpm7iikvkdenuozwwfwctxxp/include -I/storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/include/python3.9 -c -c /storage/ice1/6/5/ksinha45/test2/prowl/csrc/attention/attention_kernels.cu -o /scratch/270889/tmpztz3hxtz.build-temp/csrc/attention/attention_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2

Other information:

nvcc --version 

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

nvidia-smi

Mon Feb  5 15:11:25 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10              Driver Version: 535.86.10    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-PCIE-32GB           On  | 00000000:3B:00.0 Off |                    0 |
| N/A   40C    P0              27W / 250W |      0MiB / 32768MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

pip list

Package                   Version
------------------------- ------------
aioprometheus             23.12.0
aiosignal                 1.3.1
annotated-types           0.6.0
anyio                     4.2.0
attrs                     23.2.0
Brotli                    1.1.0
certifi                   2024.2.2
charset-normalizer        3.3.2
click                     8.1.7
cmake                     3.28.1
exceptiongroup            1.2.0
fastapi                   0.109.0
filelock                  3.13.1
frozenlist                1.4.1
fsspec                    2023.12.2
gmpy2                     2.1.2
h11                       0.14.0
httptools                 0.6.1
huggingface-hub           0.20.3
idna                      3.6
Jinja2                    3.1.3
jsonschema                4.21.1
jsonschema-specifications 2023.12.1
lit                       17.0.6
MarkupSafe                2.1.5
mpmath                    1.3.0
msgpack                   1.0.7
networkx                  3.2.1
ninja                     1.11.1.1
numpy                     1.26.3
nvidia-cublas-cu12        12.1.3.1
nvidia-cuda-cupti-cu12    12.1.105
nvidia-cuda-nvrtc-cu12    12.1.105
nvidia-cuda-runtime-cu12  12.1.105
nvidia-cudnn-cu12         8.9.2.26
nvidia-cufft-cu12         11.0.2.54
nvidia-curand-cu12        10.3.2.106
nvidia-cusolver-cu12      11.4.5.107
nvidia-cusparse-cu12      12.1.0.106
nvidia-nccl-cu12          2.18.1
nvidia-nvjitlink-cu12     12.3.101
nvidia-nvtx-cu12          12.1.105
orjson                    3.9.12
packaging                 23.2
Pillow                    9.4.0
pip                       24.0
protobuf                  4.25.2
psutil                    5.9.8
pydantic                  2.6.0
pydantic_core             2.16.1
pynvml                    11.5.0
PySocks                   1.7.1
python-dotenv             1.0.1
PyYAML                    6.0.1
quantile-python           1.1
ray                       2.9.1
referencing               0.33.0
regex                     2023.12.25
requests                  2.31.0
rpds-py                   0.17.1
safetensors               0.4.2
sentencepiece             0.1.99
setuptools                69.0.3
sniffio                   1.3.0
starlette                 0.35.1
sympy                     1.12
tokenizers                0.15.1
torch                     2.1.2
torchaudio                2.1.2
torchvision               0.16.2
tqdm                      4.66.1
transformers              4.37.2
triton                    2.0.0
typing_extensions         4.9.0
urllib3                   2.2.0
uvicorn                   0.27.0.post1
uvloop                    0.19.0
vllm                      0.3.0
watchfiles                0.21.0
websockets                12.0
wheel                     0.42.0
xformers                  0.0.23.post1

micromamba list

  Name                         Version     Build                        Channel    
─────────────────────────────────────────────────────────────────────────────────────
  _libgcc_mutex                0.1         conda_forge                  conda-forge
  _openmp_mutex                4.5         2_kmp_llvm                   conda-forge
  binutils                     2.36.1      hdd6e379_2                   conda-forge
  binutils_impl_linux-64       2.36.1      h193b22a_2                   conda-forge
  binutils_linux-64            2.36        hf3e587d_10                  conda-forge
  blas                         2.116       mkl                          conda-forge
  blas-devel                   3.9.0       16_linux64_mkl               conda-forge
  brotli-python                1.1.0       py39h3d6467e_1               conda-forge
  bzip2                        1.0.8       hd590300_5                   conda-forge
  c-compiler                   1.6.0       hd590300_0                   conda-forge
  ca-certificates              2024.2.2    hbcca054_0                   conda-forge
  certifi                      2024.2.2    pyhd8ed1ab_0                 conda-forge
  charset-normalizer           3.3.2       pyhd8ed1ab_0                 conda-forge
  cuda-cccl                    12.3.101    0                            nvidia     
  cuda-cccl_linux-64           12.3.101    ha770c72_0                   conda-forge
  cuda-cudart                  12.1.105    0                            nvidia     
  cuda-cudart-dev              12.1.105    0                            nvidia     
  cuda-cudart-dev_linux-64     12.3.101    h59595ed_0                   conda-forge
  cuda-cudart-static           12.3.101    hd3aeb46_0                   conda-forge
  cuda-cudart-static_linux-64  12.3.101    h59595ed_0                   conda-forge
  cuda-cudart_linux-64         12.3.101    h59595ed_0                   conda-forge
  cuda-cupti                   12.1.105    0                            nvidia     
  cuda-libraries               12.1.0      0                            nvidia     
  cuda-nvrtc                   12.1.105    0                            nvidia     
  cuda-nvtx                    12.1.105    0                            nvidia     
  cuda-opencl                  12.3.101    h59595ed_0                   conda-forge
  cuda-runtime                 12.1.0      0                            nvidia     
  cuda-version                 12.3        h32bc705_2                   conda-forge
  cxx-compiler                 1.6.0       h00ab1b0_0                   conda-forge
  ffmpeg                       4.3         hf484d3e_0                   pytorch    
  filelock                     3.13.1      pyhd8ed1ab_0                 conda-forge
  freetype                     2.12.1      h267a509_2                   conda-forge
  gcc                          12.1.0      h9ea6d83_10                  conda-forge
  gcc_impl_linux-64            12.1.0      hea43390_17                  conda-forge
  gcc_linux-64                 12.1.0      h3bb4806_10                  conda-forge
  gmp                          6.3.0       h59595ed_0                   conda-forge
  gmpy2                        2.1.2       py39h376b7d2_1               conda-forge
  gnutls                       3.6.13      h85f3911_1                   conda-forge
  gxx                          12.1.0      h9ea6d83_10                  conda-forge
  gxx_impl_linux-64            12.1.0      hea43390_17                  conda-forge
  gxx_linux-64                 12.1.0      h1f501c1_10                  conda-forge
  icu                          73.2        h59595ed_0                   conda-forge
  idna                         3.6         pyhd8ed1ab_0                 conda-forge
  jinja2                       3.1.3       pyhd8ed1ab_0                 conda-forge
  jpeg                         9e          h166bdaf_2                   conda-forge
  kernel-headers_linux-64      2.6.32      he073ed8_16                  conda-forge
  lame                         3.100       h166bdaf_1003                conda-forge
  lcms2                        2.15        hfd0df8a_0                   conda-forge
  ld_impl_linux-64             2.36.1      hea4e1c9_2                   conda-forge
  lerc                         4.0.0       h27087fc_0                   conda-forge
  libblas                      3.9.0       16_linux64_mkl               conda-forge
  libcblas                     3.9.0       16_linux64_mkl               conda-forge
  libcublas                    12.1.0.26   0                            nvidia     
  libcublas-dev                12.1.0.26   0                            nvidia     
  libcufft                     11.0.2.4    0                            nvidia     
  libcufile                    1.8.1.2     hd3aeb46_0                   conda-forge
  libcurand                    10.3.4.107  hd3aeb46_0                   conda-forge
  libcusolver                  11.4.4.55   0                            nvidia     
  libcusolver-dev              11.4.4.55   0                            nvidia     
  libcusparse                  12.0.2.55   0                            nvidia     
  libcusparse-dev              12.0.2.55   0                            nvidia     
  libdeflate                   1.17        h0b41bf4_0                   conda-forge
  libffi                       3.4.2       h7f98852_5                   conda-forge
  libgcc-devel_linux-64        12.1.0      h1ec3361_17                  conda-forge
  libgcc-ng                    13.2.0      h807b86a_5                   conda-forge
  libgfortran-ng               13.2.0      h69a702a_5                   conda-forge
  libgfortran5                 13.2.0      ha4646dd_5                   conda-forge
  libgomp                      13.2.0      h807b86a_5                   conda-forge
  libhwloc                     2.9.3       default_h554bfaf_1009        conda-forge
  libiconv                     1.17        hd590300_2                   conda-forge
  libjpeg-turbo                2.0.0       h9bf148f_0                   pytorch    
  liblapack                    3.9.0       16_linux64_mkl               conda-forge
  liblapacke                   3.9.0       16_linux64_mkl               conda-forge
  libnpp                       12.0.2.50   0                            nvidia     
  libnsl                       2.0.1       hd590300_0                   conda-forge
  libnvjitlink                 12.1.105    0                            nvidia     
  libnvjpeg                    12.1.1.14   0                            nvidia     
  libpng                       1.6.42      h2797004_0                   conda-forge
  libsanitizer                 12.1.0      ha89aaad_17                  conda-forge
  libsqlite                    3.44.2      h2797004_0                   conda-forge
  libstdcxx-devel_linux-64     12.1.0      h1ec3361_17                  conda-forge
  libstdcxx-ng                 13.2.0      h7e041cc_5                   conda-forge
  libtiff                      4.5.0       h6adf6a1_2                   conda-forge
  libuuid                      2.38.1      h0b41bf4_0                   conda-forge
  libwebp-base                 1.3.2       hd590300_0                   conda-forge
  libxcb                       1.13        h7f98852_1004                conda-forge
  libxcrypt                    4.4.36      hd590300_1                   conda-forge
  libxml2                      2.12.4      h232c23b_1                   conda-forge
  libzlib                      1.2.13      hd590300_5                   conda-forge
  llvm-openmp                  15.0.7      h0cdce71_0                   conda-forge
  markupsafe                   2.1.5       py39hd1e30aa_0               conda-forge
  mkl                          2022.1.0    h84fe81f_915                 conda-forge
  mkl-devel                    2022.1.0    ha770c72_916                 conda-forge
  mkl-include                  2022.1.0    h84fe81f_915                 conda-forge
  mpc                          1.3.1       hfe3b2da_0                   conda-forge
  mpfr                         4.2.1       h9458935_0                   conda-forge
  mpmath                       1.3.0       pyhd8ed1ab_0                 conda-forge
  ncurses                      6.4         h59595ed_2                   conda-forge
  nettle                       3.6         he412f7d_0                   conda-forge
  networkx                     3.2.1       pyhd8ed1ab_0                 conda-forge
  numpy                        1.26.3      py39h474f0d3_0               conda-forge
  ocl-icd                      2.3.1       h7f98852_0                   conda-forge
  openh264                     2.1.1       h780b84a_0                   conda-forge
  openjpeg                     2.5.0       hfec8fc6_2                   conda-forge
  openssl                      3.2.1       hd590300_0                   conda-forge
  pillow                       9.4.0       py39h2320bf1_1               conda-forge
  pip                          24.0        pyhd8ed1ab_0                 conda-forge
  pthread-stubs                0.4         h36c2ea0_1001                conda-forge
  pysocks                      1.7.1       pyha2e5f31_6                 conda-forge
  python                       3.9.18      h0755675_1_cpython           conda-forge
  python_abi                   3.9         4_cp39                       conda-forge
  pytorch                      2.1.2       py3.9_cuda12.1_cudnn8.9.2_0  pytorch    
  pytorch-cuda                 12.1        ha16c6d3_5                   pytorch    
  pytorch-mutex                1.0         cuda                         pytorch    
  pyyaml                       6.0.1       py39hd1e30aa_1               conda-forge
  readline                     8.2         h8228510_1                   conda-forge
  requests                     2.31.0      pyhd8ed1ab_0                 conda-forge
  setuptools                   69.0.3      pyhd8ed1ab_0                 conda-forge
  sympy                        1.12        pypyh9d50eac_103             conda-forge
  sysroot_linux-64             2.12        he073ed8_16                  conda-forge
  tbb                          2021.11.0   h00ab1b0_1                   conda-forge
  tk                           8.6.13      noxft_h4845f30_101           conda-forge
  torchaudio                   2.1.2       py39_cu121                   pytorch    
  torchtriton                  2.1.0       py39                         pytorch    
  torchvision                  0.16.2      py39_cu121                   pytorch    
  typing_extensions            4.9.0       pyha770c72_0                 conda-forge
  tzdata                       2024a       h0c530f3_0                   conda-forge
  urllib3                      2.2.0       pyhd8ed1ab_0                 conda-forge
  wheel                        0.42.0      pyhd8ed1ab_0                 conda-forge
  xorg-libxau                  1.0.11      hd590300_0                   conda-forge
  xorg-libxdmcp                1.1.3       h7f98852_0                   conda-forge
  xz                           5.2.6       h166bdaf_0                   conda-forge
  yaml                         0.2.5       h7f98852_2                   conda-forge
  zlib                         1.2.13      hd590300_5                   conda-forge
  zstd                         1.5.5       hfc55251_0                   conda-forge

module list

Currently Loaded Modules:
  1) bzip2/1.0.8-z5cmka   (H)   5) ncurses/6.2-qhoz4g      (H)   9) libffi/3.4.2-bvfjil        (H)  13) util-linux-uuid/2.36.2-6u5eni (H)  17) libxml2/2.9.13-d4fgiv (H)
  2) libmd/1.0.4-wdkbs3   (H)   6) readline/8.1-v3ivmo     (H)  10) openssl/1.0.2k-fips-xbtc42 (H)  14) python/3.9.12-rkxvr6               18) cuda/12.1.1-6oacj6
  3) libbsd/0.11.5-j4ccxs (H)   7) gdbm/1.19-54ea7n        (H)  11) zlib/1.2.7-s3gked          (H)  15) libiconv/1.16-pbdcxj          (H)  19) anaconda3/2022.05.0.1
  4) expat/2.4.8-kng6xl         8) gettext/0.19.8.1-yz6qtc      12) sqlite/3.38.5-sweldt       (H)  16) xz/5.2.2-kbeci4               (H)

  Where:
   H:  Hidden Module

Thank you for your help.

hmellor commented 7 months ago

Have you tried installing in a virtual environment to eliminate any potential incompatibilities with your global environment?
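When comparing environments, it may help to confirm which host compiler nvcc was actually handed, since the nvcc lines in the log pass it via `-ccbin` (here a path inside the micromamba environment, where `micromamba list` reports gcc 12.1.0). The snippet below is a minimal sketch, not part of vLLM: the function name `host_compiler_from_nvcc` is hypothetical, and the example command is an abbreviated form of one nvcc line from the log. It only extracts the path; whether that compiler is the cause of the `cast.h` errors is not established in this thread.

```python
import shlex
from typing import Optional


def host_compiler_from_nvcc(command: str) -> Optional[str]:
    """Return the value given to nvcc's -ccbin / --compiler-bindir, if present.

    Hypothetical helper for reading build logs; it does not run nvcc.
    """
    tokens = shlex.split(command)
    for i, tok in enumerate(tokens):
        # Form 1: option and value as separate tokens.
        if tok in ("-ccbin", "--compiler-bindir") and i + 1 < len(tokens):
            return tokens[i + 1]
        # Form 2: option and value joined with '='.
        for prefix in ("-ccbin=", "--compiler-bindir="):
            if tok.startswith(prefix):
                return tok[len(prefix):]
    return None


# Abbreviated from an nvcc line in the log above (most flags omitted).
example = (
    "nvcc -c csrc/cache_kernels.cu -o cache_kernels.o -O2 -std=c++17 "
    "-ccbin /storage/ice1/6/5/ksinha45/micromamba/envs/test_env2/bin/"
    "x86_64-conda-linux-gnu-cc"
)

if __name__ == "__main__":
    print(host_compiler_from_nvcc(example))
```

Running the reported compiler with `--version` in each environment would then show whether the virtual environment changes the toolchain that the build picks up.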

hmellor commented 1 month ago

Stale