turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

CU12+ appears to be unsupported? #148

SebJansen closed this issue 9 months ago

SebJansen commented 11 months ago

Installing the JIT version with pip fails, so I have pasted the error output from a build from source below. My rolling-release distro recently updated CUDA for me, and I suspect that is what made it stop working. As a temporary workaround, I could grab a Dockerfile that has CUDA 11.8.

My system:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Sep__8_19:17:24_PDT_2023
Cuda compilation tools, release 12.3, V12.3.52
Build cuda_12.3.r12.3/compiler.33281558_0

$ uname -r
6.5.9-arch2-1

$ gcc --version
gcc (GCC) 13.2.1 20230801

$ gcc-12 --version
gcc-12 (GCC) 12.3.0
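
For anyone comparing their own setup against the one above, here is a minimal Python sketch that pulls the version numbers out of `nvcc --version` and `gcc --version` output. The function names (`parse_nvcc_release`, `parse_gcc_version`, `tool_output`) are made up for illustration and are not part of exllamav2 or PyTorch. The sample strings are copied from the output above. It only extracts versions; which host compiler versions a given CUDA release accepts still has to be checked against NVIDIA's documentation.

```python
import re
import subprocess

# Sample lines copied from the system information above.
NVCC_SAMPLE = "Cuda compilation tools, release 12.3, V12.3.52"
GCC_SAMPLE = "gcc (GCC) 13.2.1 20230801"


def parse_nvcc_release(text):
    """Return (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_gcc_version(text):
    """Return (major, minor, patch) from the first line of `gcc --version`."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    # The version follows the parenthesised vendor tag, e.g. "(GCC) 13.2.1".
    match = re.search(r"\)\s+(\d+)\.(\d+)\.(\d+)", first_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def tool_output(command):
    """Run a command such as ["nvcc", "--version"]; return stdout or None."""
    try:
        result = subprocess.run(command, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return None  # the tool is not installed or not on PATH
    return result.stdout


print("CUDA release:", parse_nvcc_release(NVCC_SAMPLE))
print("GCC version:", parse_gcc_version(GCC_SAMPLE))
```

To check a live system instead of the pasted samples, pass `tool_output(["nvcc", "--version"])` and `tool_output(["gcc", "--version"])` to the parsers, handling the `None` case when a tool is missing.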

Error message from the build from source:

python setup.py install --user
Version: 0.0.7
/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
TEST FAILED: /home/seb/.local/lib/python3.11/site-packages/ does NOT support .pth files
bad install directory or PYTHONPATH

You are attempting to install a package to a directory that is not
on PYTHONPATH and which Python does not read ".pth" files from.  The
installation directory you specified (via --install-dir, --prefix, or
the distutils default setting) was:

    /home/seb/.local/lib/python3.11/site-packages/

and your PYTHONPATH environment variable currently contains:

    ''

Here are some of your options for correcting the problem:

* You can choose a different installation directory, i.e., one that is
  on PYTHONPATH or supports .pth files

* You can add the installation directory to the PYTHONPATH environment
  variable.  (It must then also be on PYTHONPATH whenever you run
  Python and want to use the package(s) you are installing.)

* You can set up the installation directory to support ".pth" files by
  using one of the approaches described here:

  https://setuptools.pypa.io/en/latest/deprecated/easy_install.html#custom-installation-locations

Please make the appropriate changes for your system and try again.
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching 'dni_*' found anywhere in distribution
/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/build_py.py:202: SetuptoolsDeprecationWarning:     Installing 'exllamav2.exllamav2_ext' as data is deprecated, please list it in `packages`.
    !!

    ############################
    # Package would be ignored #
    ############################
    Python recognizes 'exllamav2.exllamav2_ext' as an importable package,
    but it is not listed in the `packages` configuration of setuptools.

    'exllamav2.exllamav2_ext' has been automatically added to the distribution only
    because it may contain data files, but this behavior is likely to change
    in future versions of setuptools (and therefore is considered deprecated).

    Please make sure that 'exllamav2.exllamav2_ext' is included as a package by using
    the `packages` configuration field or the proper discovery methods
    (for example by using `find_namespace_packages(...)`/`find_namespace:`
    instead of `find_packages(...)`/`find:`).

    You can read more about "package discovery" and "data files" on setuptools
    documentation page.

!!

  check.warn(importable)
/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/build_py.py:202: SetuptoolsDeprecationWarning:     Installing 'exllamav2.exllamav2_ext.cpp' as data is deprecated, please list it in `packages`.
    !!

    ############################
    # Package would be ignored #
    ############################
    Python recognizes 'exllamav2.exllamav2_ext.cpp' as an importable package,
    but it is not listed in the `packages` configuration of setuptools.

    'exllamav2.exllamav2_ext.cpp' has been automatically added to the distribution only
    because it may contain data files, but this behavior is likely to change
    in future versions of setuptools (and therefore is considered deprecated).

    Please make sure that 'exllamav2.exllamav2_ext.cpp' is included as a package by using
    the `packages` configuration field or the proper discovery methods
    (for example by using `find_namespace_packages(...)`/`find_namespace:`
    instead of `find_packages(...)`/`find:`).

    You can read more about "package discovery" and "data files" on setuptools
    documentation page.

!!

  check.warn(importable)
/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/build_py.py:202: SetuptoolsDeprecationWarning:     Installing 'exllamav2.exllamav2_ext.cuda' as data is deprecated, please list it in `packages`.
    !!

    ############################
    # Package would be ignored #
    ############################
    Python recognizes 'exllamav2.exllamav2_ext.cuda' as an importable package,
    but it is not listed in the `packages` configuration of setuptools.

    'exllamav2.exllamav2_ext.cuda' has been automatically added to the distribution only
    because it may contain data files, but this behavior is likely to change
    in future versions of setuptools (and therefore is considered deprecated).

    Please make sure that 'exllamav2.exllamav2_ext.cuda' is included as a package by using
    the `packages` configuration field or the proper discovery methods
    (for example by using `find_namespace_packages(...)`/`find_namespace:`
    instead of `find_packages(...)`/`find:`).

    You can read more about "package discovery" and "data files" on setuptools
    documentation page.

!!

  check.warn(importable)
/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/build_py.py:202: SetuptoolsDeprecationWarning:     Installing 'exllamav2.exllamav2_ext.cuda.quant' as data is deprecated, please list it in `packages`.
    !!

    ############################
    # Package would be ignored #
    ############################
    Python recognizes 'exllamav2.exllamav2_ext.cuda.quant' as an importable package,
    but it is not listed in the `packages` configuration of setuptools.

    'exllamav2.exllamav2_ext.cuda.quant' has been automatically added to the distribution only
    because it may contain data files, but this behavior is likely to change
    in future versions of setuptools (and therefore is considered deprecated).

    Please make sure that 'exllamav2.exllamav2_ext.cuda.quant' is included as a package by using
    the `packages` configuration field or the proper discovery methods
    (for example by using `find_namespace_packages(...)`/`find_namespace:`
    instead of `find_packages(...)`/`find:`).

    You can read more about "package discovery" and "data files" on setuptools
    documentation page.

!!

  check.warn(importable)
/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/build_py.py:202: SetuptoolsDeprecationWarning:     Installing 'exllamav2.generator.filters' as data is deprecated, please list it in `packages`.
    !!

    ############################
    # Package would be ignored #
    ############################
    Python recognizes 'exllamav2.generator.filters' as an importable package,
    but it is not listed in the `packages` configuration of setuptools.

    'exllamav2.generator.filters' has been automatically added to the distribution only
    because it may contain data files, but this behavior is likely to change
    in future versions of setuptools (and therefore is considered deprecated).

    Please make sure that 'exllamav2.generator.filters' is included as a package by using
    the `packages` configuration field or the proper discovery methods
    (for example by using `find_namespace_packages(...)`/`find_namespace:`
    instead of `find_packages(...)`/`find:`).

    You can read more about "package discovery" and "data files" on setuptools
    documentation page.

!!

  check.warn(importable)
/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/build_py.py:202: SetuptoolsDeprecationWarning:     Installing 'exllamav2.server' as data is deprecated, please list it in `packages`.
    !!

    ############################
    # Package would be ignored #
    ############################
    Python recognizes 'exllamav2.server' as an importable package,
    but it is not listed in the `packages` configuration of setuptools.

    'exllamav2.server' has been automatically added to the distribution only
    because it may contain data files, but this behavior is likely to change
    in future versions of setuptools (and therefore is considered deprecated).

    Please make sure that 'exllamav2.server' is included as a package by using
    the `packages` configuration field or the proper discovery methods
    (for example by using `find_namespace_packages(...)`/`find_namespace:`
    instead of `find_packages(...)`/`find:`).

    You can read more about "package discovery" and "data files" on setuptools
    documentation page.

!!

  check.warn(importable)
/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py:414: UserWarning: The detected CUDA version (12.3) has a minor version mismatch with the version that was used to compile PyTorch (12.1). Most likely this shouldn't be a problem.
  warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py:424: UserWarning: There are no g++ version bounds defined for CUDA version 12.3
  warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
Emitting ninja build file /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/11] /opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/rope.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/rope.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
FAILED: /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/rope.o 
/opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/rope.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/rope.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/rope.cu".
[2/11] /opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/cache.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/cache.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
FAILED: /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/cache.o 
/opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/cache.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/cache.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/cache.cu".
[3/11] /opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/rms_norm.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/rms_norm.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
FAILED: /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/rms_norm.o 
/opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/rms_norm.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/rms_norm.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/rms_norm.cu".
[4/11] /opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/pack_tensor.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/pack_tensor.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
FAILED: /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/pack_tensor.o 
/opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/pack_tensor.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/pack_tensor.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/pack_tensor.cu".
[5/11] /opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_matrix.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_matrix.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
FAILED: /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_matrix.o 
/opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_matrix.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_matrix.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_matrix.cu".
[6/11] /opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/quantize.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/quantize.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
FAILED: /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/quantize.o 
/opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/quantize.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/quantize.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/quantize.cu".
[7/11] /opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_gemm.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_gemm.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
FAILED: /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_gemm.o 
/opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_gemm.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_gemm.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_gemm.cu".
[8/11] /opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_mlp.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
FAILED: /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_mlp.o 
/opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_mlp.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_mlp.cu".
[9/11] /opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/lora.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/lora.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
FAILED: /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/lora.o 
/opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/lora.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/lora.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/lora.cu".
[10/11] /opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/h_gemm.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/h_gemm.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
FAILED: /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/h_gemm.o 
/opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/h_gemm.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/h_gemm.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/h_gemm.cu".
[11/11] /opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_attn.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_attn.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
FAILED: /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_attn.o 
/opt/cuda/bin/nvcc  -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -I/home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/seb/.venvs/x/include -I/usr/include/python3.11 -c -c /home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_attn.cu -o /home/seb/Code/x/exllamav2/build/temp.linux-x86_64-cpython-311/exllamav2/exllamav2_ext/cuda/q_attn.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/Code/x/exllamav2/exllamav2/exllamav2_ext/cuda/q_attn.cu".
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/seb/Code/x/exllamav2/setup.py", line 57, in <module>
    setup(
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/__init__.py", line 87, in setup
    return distutils.core.setup(**attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
           ^^^^^^^^^^^^^^^^^^
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 968, in run_commands
    self.run_command(cmd)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/dist.py", line 1217, in run_command
    super().run_command(command)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
    cmd_obj.run()
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/install.py", line 74, in run
    self.do_egg_install()
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/install.py", line 123, in do_egg_install
    self.run_command('bdist_egg')
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
    self.distribution.run_command(command)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/dist.py", line 1217, in run_command
    super().run_command(command)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
    cmd_obj.run()
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/bdist_egg.py", line 165, in run
    cmd = self.call_command('install_lib', warn_dir=0)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/bdist_egg.py", line 151, in call_command
    self.run_command(cmdname)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
    self.distribution.run_command(command)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/dist.py", line 1217, in run_command
    super().run_command(command)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
    cmd_obj.run()
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/install_lib.py", line 11, in run
    self.build()
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/command/install_lib.py", line 112, in build
    self.run_command('build_ext')
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
    self.distribution.run_command(command)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/dist.py", line 1217, in run_command
    super().run_command(command)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
    cmd_obj.run()
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 84, in run
    _build_ext.run(self)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
    self.build_extensions()
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 873, in build_extensions
    build_ext.build_extensions(self)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 466, in build_extensions
    self._build_extensions_serial()
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 492, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
    _build_ext.build_extension(self, ext)
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 547, in build_extension
    objects = self.compiler.compile(
              ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 686, in unix_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
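
Every failed compilation unit in the log above reports the same five diagnostics from the glibc headers (`floatn.h` and `floatn-common.h`), so the output can be condensed considerably. Below is a minimal sketch for doing that, assuming the nvcc output has been saved as plain text. The names `ERROR_RE` and `summarize_errors` are hypothetical helpers made up for illustration; they are not part of exllamav2, PyTorch, or the CUDA toolkit.

```python
import re
from collections import Counter

# Matches nvcc diagnostics of the form "<path>(<line>): error: <message>".
# This pattern is an assumption based on the lines in the log above.
ERROR_RE = re.compile(r"^(?P<path>\S+)\((?P<line>\d+)\): error: (?P<msg>.+)$")


def summarize_errors(log_text):
    """Count each distinct (path, line, message) diagnostic in a build log."""
    counts = Counter()
    for raw in log_text.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match:
            key = (match["path"], int(match["line"]), match["msg"])
            counts[key] += 1
    return counts


# A short excerpt in the same shape as the log above.
sample = """\
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
"""

summary = summarize_errors(sample)
for (path, line, msg), n in summary.most_common():
    print(f"{n}x {path}:{line}: {msg}")
```

Run against the full log, this should list each header location once with the number of times it was reported, which makes it easier to see that all failures share one cause.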

Error message pip install JIT

$ python test.py 
Traceback (most recent call last):
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/ext.py", line 14, in <module>
    import exllamav2_ext
ModuleNotFoundError: No module named 'exllamav2_ext'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/seb/Code/x/test.py", line 1, in <module>
    from lib.llm.prepare import load
  File "/home/seb/Code/x/lib/llm/prepare.py", line 1, in <module>
    from exllamav2 import ExLlamaV2Config, ExLlamaV2, ExLlamaV2Tokenizer, ExLlamaV2Cache
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/model.py", line 11, in <module>
    from exllamav2.cache import ExLlamaV2Cache
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/cache.py", line 2, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/ext.py", line 124, in <module>
    exllamav2_ext = load \
                    ^^^^^^
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1308, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1823, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/seb/.venvs/x/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'exllamav2_ext': [1/15] /opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/pack_tensor.cu -o pack_tensor.cuda.o 
FAILED: pack_tensor.cuda.o 
/opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/pack_tensor.cu -o pack_tensor.cuda.o 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/pack_tensor.cu".
[2/15] /opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/rms_norm.cu -o rms_norm.cuda.o 
FAILED: rms_norm.cuda.o 
/opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/rms_norm.cu -o rms_norm.cuda.o 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/rms_norm.cu".
[3/15] /opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_matrix.cu -o q_matrix.cuda.o 
FAILED: q_matrix.cuda.o 
/opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_matrix.cu -o q_matrix.cuda.o 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_matrix.cu".
[4/15] /opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/rope.cu -o rope.cuda.o 
FAILED: rope.cuda.o 
/opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/rope.cu -o rope.cuda.o 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/rope.cu".
[5/15] /opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/cache.cu -o cache.cuda.o 
FAILED: cache.cuda.o 
/opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/cache.cu -o cache.cuda.o 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/cache.cu".
[6/15] /opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/quantize.cu -o quantize.cuda.o 
FAILED: quantize.cuda.o 
/opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/quantize.cu -o quantize.cuda.o 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/quantize.cu".
[7/15] c++ -MMD -MF sampling.o.d -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cpp/sampling.cpp -o sampling.o 
[8/15] /opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/lora.cu -o lora.cuda.o 
FAILED: lora.cuda.o 
/opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/lora.cu -o lora.cuda.o 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/lora.cu".
[9/15] /opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_attn.cu -o q_attn.cuda.o 
FAILED: q_attn.cuda.o 
/opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_attn.cu -o q_attn.cuda.o 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_attn.cu".
[10/15] /opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/h_gemm.cu -o h_gemm.cuda.o 
FAILED: h_gemm.cuda.o 
/opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/h_gemm.cu -o h_gemm.cuda.o 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/h_gemm.cu".
[11/15] /opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_gemm.cu -o q_gemm.cuda.o 
FAILED: q_gemm.cuda.o 
/opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_gemm.cu -o q_gemm.cuda.o 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_gemm.cu".
[12/15] /opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_mlp.cu -o q_mlp.cuda.o 
FAILED: q_mlp.cuda.o 
/opt/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_mlp.cu -o q_mlp.cuda.o 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cuda/q_mlp.cu".
[13/15] c++ -MMD -MF quantize_func.o.d -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/cpp/quantize_func.cpp -o quantize_func.o 
[14/15] c++ -MMD -MF ext.o.d -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/TH -isystem /home/seb/.venvs/x/lib/python3.11/site-packages/torch/include/THC -isystem /opt/cuda/include -isystem /usr/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /home/seb/.venvs/x/lib/python3.11/site-packages/exllamav2/exllamav2_ext/ext.cpp -o ext.o 
ninja: build stopped: subcommand failed.
turboderp commented 11 months ago

It does look like something breaks with CUDA 12.3. A quick search brings up a lot of results for that error, and apparently the workaround is something like:

export HOST_COMPILER=/usr/bin/g++-12

I'll have to look into it later, maybe upgrade and see if I can reproduce it here.
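For anyone driving the build from a script, here is a minimal Python sketch of that workaround. It looks for an older g++ on PATH and exports it into a copy of the environment. The helper name `env_with_host_compiler`, the candidate names, and the choice to set `CXX` alongside `HOST_COMPILER` are assumptions for illustration. Which variable a given build actually reads should be checked against that build's own scripts.

```python
import os
import shutil


def env_with_host_compiler(candidates=("g++-12", "g++-11"), base_env=None, which=shutil.which):
    """Return a copy of the environment pointing at the first candidate compiler found.

    Hypothetical helper: HOST_COMPILER comes from the comment above; setting CXX
    as well is an assumption, since builds differ in which variable they read.
    """
    env = dict(os.environ if base_env is None else base_env)
    for name in candidates:
        path = which(name)  # absolute path, or None if not on PATH
        if path:
            env["HOST_COMPILER"] = path
            env["CXX"] = path
            break
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when launching the install. If no candidate is found, the environment is returned unchanged.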

picobyte commented 11 months ago

The exllamav2-0.0.7+cu118-cp310-cp310-linux_x86_64.whl installs cu12. EDIT: This may have been resolved in the latest version.

lhl commented 11 months ago

This just bit me as well. Doing a search on the error message brought up this thread on nvcc. Here's the post from the Nvidia moderator:

The supported/tested gcc versions for any given CUDA version can be found in the CUDA linux install guide for that CUDA version. At the moment, here (and here) is the one for 12.3, and you can see that gcc 13.x is not listed anywhere.

So there is no expectation by NVIDIA that CUDA 12.3 works with gcc 13.x. For CUDA 12.3, the stated gcc support goes up to gcc 12.2.

In my conda/mamba env, to get things working I installed:

mamba install gxx=12.2
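A rough way to check this pairing before building is to parse the output of `nvcc --version` and `gcc --version`, as in the Python sketch below. The function names and the table are hypothetical. The table holds only the one limit quoted in this thread (CUDA 12.3, gcc 12.x). Comparing major versions only is an assumption, so gcc 12.3.0 counts as acceptable here even though the guide lists 12.2.

```python
import re

# Highest supported gcc major version per CUDA release. The single entry is the
# one quoted in this thread; other releases are deliberately left out.
MAX_GCC_MAJOR = {(12, 3): 12}


def cuda_release(nvcc_version_text):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    m = re.search(r"release (\d+)\.(\d+)", nvcc_version_text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def gcc_major(gcc_version_text):
    """Extract the major version from `gcc --version` output, or None."""
    m = re.search(r"(\d+)\.\d+", gcc_version_text)
    return int(m.group(1)) if m else None


def host_compiler_supported(nvcc_version_text, gcc_version_text):
    """True/False when the pairing is in the table, None when it is unknown."""
    limit = MAX_GCC_MAJOR.get(cuda_release(nvcc_version_text))
    major = gcc_major(gcc_version_text)
    if limit is None or major is None:
        return None
    return major <= limit
```

With the versions from the original report, `release 12.3` paired with `gcc (GCC) 13.2.1` is flagged as unsupported, while the same CUDA release paired with `gcc-12 (GCC) 12.3.0` passes.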
oldgithubman commented 6 months ago

Having a similar problem:

Building wheels for collected packages: exllamav2
  Building wheel for exllamav2 (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [50 lines of output]
      Version: 0.0.17
      warning: no previously-included files matching 'dni_*' found anywhere in distribution
      /home/j/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py:415: UserWarning: The detected CUDA version (12.4) has a minor version mismatch with the version that was used to compile PyTorch (12.1). Most likely this shouldn't be a problem.
        warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
      /home/j/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py:425: UserWarning: There are no /home/j/miniforge3/envs/exllamav2/bin/x86_64-conda-linux-gnu-c++ version bounds defined for CUDA version 12.4
        warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
      Emitting ninja build file /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/build.ninja...
      Compiling objects...
      Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
      1.11.1.git.kitware.jobserver-1
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cpp/quantize_func.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cpp/safetensors.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cpp/sampling.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/cache.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/kernel_select.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_1a.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_1b.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_2a.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_2b.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_3a.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_3b.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_gptq_1.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_gptq_2.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_gptq_3.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/h_add.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/h_gemm.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/layer_norm.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/lora.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/pack_tensor.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/q_attn.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/q_gemm.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/q_matrix.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/q_mlp.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/quantize.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/rms_norm.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/rope.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/util.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_bindings.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_cache.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_gemm.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_norm.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_qattn.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_qmatrix.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_qmlp.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_quant.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_rope.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_safetensors.o: No such file or directory
      /home/j/miniforge3/envs/exllamav2/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_sampling.o: No such file or directory
      collect2: error: ld returned 1 exit status
      error: command '/home/j/miniforge3/envs/exllamav2/bin/x86_64-conda-linux-gnu-c++' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for exllamav2
  Running setup.py clean for exllamav2
Failed to build exllamav2
ERROR: Could not build wheels for exllamav2, which is required to install pyproject.toml-based projects
siahuat0727 commented 6 months ago

> Having a similar problem:

Maybe try pip install --upgrade pip setuptools wheel before the installation. @oldmanjk

turboderp commented 6 months ago

It looks like it's not even trying to compile the sources and skipping straight to the linker even though there are no object files to link.

Try clearing out ~/.cache/torch_extensions perhaps? And also if you have a build directory in the repo dir, delete that.
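
For reference, the cleanup suggested above could be scripted roughly as follows. This is only a sketch: the helper name `clear_stale_build_dirs` and its parameters are made up for illustration, and it assumes the extension cache sits in the default `~/.cache/torch_extensions` location. If `TORCH_EXTENSIONS_DIR` is set, PyTorch uses that directory instead, so pass it explicitly as `cache_dir`.

```python
import shutil
from pathlib import Path


def clear_stale_build_dirs(repo_dir, cache_dir=None):
    """Delete leftover build outputs and return the list of paths removed.

    repo_dir:  path to the exllamav2 checkout (its build/ subdirectory is removed).
    cache_dir: extension cache to remove; assumed to be
               ~/.cache/torch_extensions when not given.
    """
    if cache_dir is None:
        cache_dir = Path.home() / ".cache" / "torch_extensions"

    candidates = [Path(cache_dir), Path(repo_dir) / "build"]
    removed = []
    for path in candidates:
        # Only touch directories that actually exist; missing ones are skipped.
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path)
    return removed
```

Calling `clear_stale_build_dirs(".")` from the repo root removes both locations if they exist, so the next `pip install .` starts without leftover artifacts. An empty return value means there was nothing to clean.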

oldgithubman commented 6 months ago

> > Having a similar problem:
>
> Maybe try pip install --upgrade pip setuptools wheel before the installation. @oldmanjk

Thanks for the suggestion, but nope.

~/exllamav2$ pip install --upgrade pip setuptools wheel
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /home/j/.local/lib/python3.10/site-packages (24.0)
Requirement already satisfied: setuptools in /home/j/.local/lib/python3.10/site-packages (69.2.0)
Requirement already satisfied: wheel in /home/j/.local/lib/python3.10/site-packages (0.43.0)
oldgithubman commented 6 months ago

> It looks like it's not even trying to compile the sources and skipping straight to the linker even though there are no object files to link.
>
> Try clearing out ~/.cache/torch_extensions perhaps? And also if you have a build directory in the repo dir, delete that.

Thanks for the suggestion, but nope. Cleared out ~/.cache/torch_extensions. No build directory.

~/exllamav2$ pip install .
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Processing /home/j/exllamav2
  Preparing metadata (setup.py) ... done
Requirement already satisfied: pandas in /home/j/.local/lib/python3.10/site-packages (from exllamav2==0.0.17) (2.2.1)
Requirement already satisfied: ninja in /home/j/.local/lib/python3.10/site-packages (from exllamav2==0.0.17) (1.11.1.1)
Requirement already satisfied: fastparquet in /home/j/.local/lib/python3.10/site-packages (from exllamav2==0.0.17) (2024.2.0)
Requirement already satisfied: torch>=2.2.0 in /home/j/.local/lib/python3.10/site-packages (from exllamav2==0.0.17) (2.2.1)
Requirement already satisfied: safetensors>=0.3.2 in /home/j/.local/lib/python3.10/site-packages (from exllamav2==0.0.17) (0.4.2)
Requirement already satisfied: sentencepiece>=0.1.97 in /home/j/.local/lib/python3.10/site-packages (from exllamav2==0.0.17) (0.1.99)
Requirement already satisfied: pygments in /home/j/.local/lib/python3.10/site-packages (from exllamav2==0.0.17) (2.17.2)
Requirement already satisfied: websockets in /home/j/.local/lib/python3.10/site-packages (from exllamav2==0.0.17) (11.0.3)
Requirement already satisfied: regex in /home/j/.local/lib/python3.10/site-packages (from exllamav2==0.0.17) (2023.12.25)
Requirement already satisfied: numpy in /home/j/.local/lib/python3.10/site-packages (from exllamav2==0.0.17) (1.24.4)
Requirement already satisfied: filelock in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (3.13.3)
Requirement already satisfied: typing-extensions>=4.8.0 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (4.10.0)
Requirement already satisfied: sympy in /usr/lib/python3/dist-packages (from torch>=2.2.0->exllamav2==0.0.17) (1.9)
Requirement already satisfied: networkx in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (3.2.1)
Requirement already satisfied: jinja2 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (3.1.2)
Requirement already satisfied: fsspec in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (2024.2.0)
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (12.1.105)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (12.1.105)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (12.1.105)
Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (8.9.2.26)
Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (12.1.3.1)
Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (11.0.2.54)
Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (10.3.2.106)
Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (11.4.5.107)
Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (12.1.0.106)
Requirement already satisfied: nvidia-nccl-cu12==2.19.3 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (2.19.3)
Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (12.1.105)
Requirement already satisfied: triton==2.2.0 in /home/j/.local/lib/python3.10/site-packages (from torch>=2.2.0->exllamav2==0.0.17) (2.2.0)
Requirement already satisfied: nvidia-nvjitlink-cu12 in /home/j/.local/lib/python3.10/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch>=2.2.0->exllamav2==0.0.17) (12.4.99)
Requirement already satisfied: cramjam>=2.3 in /home/j/.local/lib/python3.10/site-packages (from fastparquet->exllamav2==0.0.17) (2.8.3)
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from fastparquet->exllamav2==0.0.17) (23.2)
Requirement already satisfied: python-dateutil>=2.8.2 in /home/j/.local/lib/python3.10/site-packages (from pandas->exllamav2==0.0.17) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in /usr/lib/python3/dist-packages (from pandas->exllamav2==0.0.17) (2022.1)
Requirement already satisfied: tzdata>=2022.7 in /home/j/.local/lib/python3.10/site-packages (from pandas->exllamav2==0.0.17) (2024.1)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.2->pandas->exllamav2==0.0.17) (1.16.0)
Requirement already satisfied: MarkupSafe>=2.0 in /home/j/.local/lib/python3.10/site-packages (from jinja2->torch>=2.2.0->exllamav2==0.0.17) (2.1.5)
Building wheels for collected packages: exllamav2
  Building wheel for exllamav2 (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [379 lines of output]
      Version: 0.0.17
      warning: no previously-included files matching '*.pyc' found anywhere in distribution
      warning: no previously-included files matching 'dni_*' found anywhere in distribution
      /home/j/.local/lib/python3.10/site-packages/setuptools/command/build_py.py:207: _Warning: Package 'exllamav2.exllamav2_ext' is absent from the `packages` configuration.
      !!

              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'exllamav2.exllamav2_ext' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration.

              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'exllamav2.exllamav2_ext' is explicitly added
              to the `packages` configuration field.

              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).

              You can read more about "package discovery" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

              If you don't want 'exllamav2.exllamav2_ext' to be distributed and are
              already explicitly excluding 'exllamav2.exllamav2_ext' via
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.

              You can read more about "package data files" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html

              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages.
              ********************************************************************************

      !!
        check.warn(importable)
      /home/j/.local/lib/python3.10/site-packages/setuptools/command/build_py.py:207: _Warning: Package 'exllamav2.exllamav2_ext.cpp' is absent from the `packages` configuration.
      !!

              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'exllamav2.exllamav2_ext.cpp' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration.

              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'exllamav2.exllamav2_ext.cpp' is explicitly added
              to the `packages` configuration field.

              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).

              You can read more about "package discovery" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

              If you don't want 'exllamav2.exllamav2_ext.cpp' to be distributed and are
              already explicitly excluding 'exllamav2.exllamav2_ext.cpp' via
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.

              You can read more about "package data files" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html

              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages.
              ********************************************************************************

      !!
        check.warn(importable)
      /home/j/.local/lib/python3.10/site-packages/setuptools/command/build_py.py:207: _Warning: Package 'exllamav2.exllamav2_ext.cuda' is absent from the `packages` configuration.
      !!

              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'exllamav2.exllamav2_ext.cuda' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration.

              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'exllamav2.exllamav2_ext.cuda' is explicitly added
              to the `packages` configuration field.

              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).

              You can read more about "package discovery" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

              If you don't want 'exllamav2.exllamav2_ext.cuda' to be distributed and are
              already explicitly excluding 'exllamav2.exllamav2_ext.cuda' via
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.

              You can read more about "package data files" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html

              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages.
              ********************************************************************************

      !!
        check.warn(importable)
      /home/j/.local/lib/python3.10/site-packages/setuptools/command/build_py.py:207: _Warning: Package 'exllamav2.exllamav2_ext.cuda.comp_units' is absent from the `packages` configuration.
      !!

              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'exllamav2.exllamav2_ext.cuda.comp_units' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration.

              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'exllamav2.exllamav2_ext.cuda.comp_units' is explicitly added
              to the `packages` configuration field.

              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).

              You can read more about "package discovery" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

              If you don't want 'exllamav2.exllamav2_ext.cuda.comp_units' to be distributed and are
              already explicitly excluding 'exllamav2.exllamav2_ext.cuda.comp_units' via
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.

              You can read more about "package data files" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html

              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages.
              ********************************************************************************

      !!
        check.warn(importable)
      /home/j/.local/lib/python3.10/site-packages/setuptools/command/build_py.py:207: _Warning: Package 'exllamav2.exllamav2_ext.cuda.quant' is absent from the `packages` configuration.
      !!

              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'exllamav2.exllamav2_ext.cuda.quant' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration.

              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'exllamav2.exllamav2_ext.cuda.quant' is explicitly added
              to the `packages` configuration field.

              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).

              You can read more about "package discovery" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

              If you don't want 'exllamav2.exllamav2_ext.cuda.quant' to be distributed and are
              already explicitly excluding 'exllamav2.exllamav2_ext.cuda.quant' via
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.

              You can read more about "package data files" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html

              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages.
              ********************************************************************************

      !!
        check.warn(importable)
      /home/j/.local/lib/python3.10/site-packages/setuptools/command/build_py.py:207: _Warning: Package 'exllamav2.generator.filters' is absent from the `packages` configuration.
      !!

              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'exllamav2.generator.filters' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration.

              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'exllamav2.generator.filters' is explicitly added
              to the `packages` configuration field.

              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).

              You can read more about "package discovery" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

              If you don't want 'exllamav2.generator.filters' to be distributed and are
              already explicitly excluding 'exllamav2.generator.filters' via
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.

              You can read more about "package data files" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html

              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages.
              ********************************************************************************

      !!
        check.warn(importable)
      /home/j/.local/lib/python3.10/site-packages/setuptools/command/build_py.py:207: _Warning: Package 'exllamav2.server' is absent from the `packages` configuration.
      !!

              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'exllamav2.server' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration.

              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'exllamav2.server' is explicitly added
              to the `packages` configuration field.

              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).

              You can read more about "package discovery" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

              If you don't want 'exllamav2.server' to be distributed and are
              already explicitly excluding 'exllamav2.server' via
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.

              You can read more about "package data files" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html

              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages.
              ********************************************************************************

      !!
        check.warn(importable)
      /home/j/.local/lib/python3.10/site-packages/setuptools/command/build_py.py:207: _Warning: Package 'exllamav2.tokenizer' is absent from the `packages` configuration.
      !!

              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'exllamav2.tokenizer' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration.

              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'exllamav2.tokenizer' is explicitly added
              to the `packages` configuration field.

              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).

              You can read more about "package discovery" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

              If you don't want 'exllamav2.tokenizer' to be distributed and are
              already explicitly excluding 'exllamav2.tokenizer' via
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.

              You can read more about "package data files" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html

              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages.
              ********************************************************************************

      !!
        check.warn(importable)
      /home/j/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py:415: UserWarning: The detected CUDA version (12.4) has a minor version mismatch with the version that was used to compile PyTorch (12.1). Most likely this shouldn't be a problem.
        warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
      /home/j/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py:425: UserWarning: There are no x86_64-linux-gnu-g++ version bounds defined for CUDA version 12.4
        warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
      Emitting ninja build file /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/build.ninja...
      Compiling objects...
      Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
      1.11.1.git.kitware.jobserver-1
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cpp/quantize_func.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cpp/safetensors.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cpp/sampling.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/cache.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/kernel_select.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_1a.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_1b.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_2a.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_2b.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_3a.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_exl2_3b.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_gptq_1.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_gptq_2.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/comp_units/unit_gptq_3.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/h_add.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/h_gemm.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/layer_norm.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/lora.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/pack_tensor.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/q_attn.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/q_gemm.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/q_matrix.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/q_mlp.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/quantize.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/rms_norm.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/rope.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/cuda/util.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_bindings.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_cache.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_gemm.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_norm.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_qattn.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_qmatrix.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_qmlp.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_quant.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_rope.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_safetensors.o: No such file or directory
      /usr/bin/ld: cannot find /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/exllamav2/exllamav2_ext/ext_sampling.o: No such file or directory
      collect2: error: ld returned 1 exit status
      error: command '/usr/bin/x86_64-linux-gnu-g++' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for exllamav2
  Running setup.py clean for exllamav2
Failed to build exllamav2
ERROR: Could not build wheels for exllamav2, which is required to install pyproject.toml-based projects
turboderp commented 6 months ago

It's very odd. It's clearly attempting to compile even though it's not finding the compiler. But instead of stopping with an error message, it continues and tries to link the objects that were never compiled. What do you get from `gcc --version` and `nvcc --version`?
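
One way to check whether the build environment can see a host compiler and nvcc at all is to look them up on PATH before starting the build. This is a minimal sketch and not part of exllamav2; the helper name `find_build_tools` and the default tool list are assumptions made for illustration.

```python
import shutil


def find_build_tools(tools=("gcc", "g++", "nvcc", "ninja")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    # Print one line per tool so a missing compiler is obvious at a glance.
    for tool, path in find_build_tools().items():
        print(f"{tool:6s} {path or 'NOT FOUND'}")
```

A tool reported as `NOT FOUND` here would be consistent with a build that never produces the `.o` files the linker later asks for, although it does not prove that this is the cause.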

oldgithubman commented 6 months ago

> It's very odd. It's clearly attempting to compile even though it's not finding the compiler. But instead of stopping with an error message, it continues and tries to link the objects that were never compiled. What do you get from `gcc --version` and `nvcc --version`?

$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0
turboderp commented 6 months ago

Are you sure there isn't a build directory? According to the output it writes the Ninja build file to /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/build.ninja. If it is there, try clearing that whole directory, and if it still doesn't work, maybe post the build.ninja file?
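
Clearing the build directory can be scripted so that it behaves the same on Linux and Windows. The sketch below assumes the stale artifacts live in a `build` folder directly under the repository root, as the path in the log suggests; `clear_build_dir` is a hypothetical helper, not something exllamav2 provides.

```python
import shutil
from pathlib import Path


def clear_build_dir(repo_root):
    """Delete <repo_root>/build if it exists; return True if it was removed."""
    build_dir = Path(repo_root) / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        return True
    return False


if __name__ == "__main__":
    # Run from the repository root, then retry the install.
    removed = clear_build_dir(".")
    print("removed build/" if removed else "no build/ directory found")
```

A `False` return means there was nothing to clear, which is itself useful to report back in the thread.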

oldgithubman commented 6 months ago

> Are you sure there isn't a build directory? According to the output it writes the Ninja build file to /home/j/exllamav2/build/temp.linux-x86_64-cpython-310/build.ninja. If it is there, try clearing that whole directory, and if it still doesn't work, maybe post the build.ninja file?

Yes, I'm sure. It creates a build directory during the process, but deletes it when it's done. It just magically started working, though.

Katehuuh commented 1 month ago

I get the same error, `RuntimeError: Error compiling objects for extension`, when using "-allow-unsupported-compiler" after upgrading to the latest version of MSVC.


      C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.41.34120/include\yvals_core.h(888): error: static assertion failed with "error STL1002: Unexpected compiler version, expected CUDA 12.4 or newer."

        static_assert(false, "error " "STL1002" ": " "Unexpected compiler version, expected CUDA 12.4 or newer.");

        ^

Without `extra_cuda_cflags=` "-allow-unsupported-compiler" in setup.py, the build fails instead with "Only the versions between 2017 and 2022 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler'...". However, I already have VS 2022.

Logs without the extra arg "-allow-unsupported-compiler":
```cmd
(venv) C:\exllamav2>set DISTUTILS_USE_SDK=1
(venv) C:\exllamav2>pip install .
Processing c:\exllamav2
Preparing metadata (setup.py) ... done
Requirement already satisfied: pandas in c:\exllamav2\venv\lib\site-packages (from exllamav2==0.1.8) (2.2.2)
Requirement already satisfied: ninja in c:\exllamav2\venv\lib\site-packages (from exllamav2==0.1.8) (1.11.1.1)
Requirement already satisfied: fastparquet in c:\exllamav2\venv\lib\site-packages (from exllamav2==0.1.8) (2024.5.0)
Requirement already satisfied: torch>=2.2.0 in c:\exllamav2\venv\lib\site-packages (from exllamav2==0.1.8) (2.4.0+cu121)
Requirement already satisfied: safetensors>=0.3.2 in c:\exllamav2\venv\lib\site-packages (from exllamav2==0.1.8) (0.4.4)
Requirement already satisfied: sentencepiece>=0.1.97 in c:\exllamav2\venv\lib\site-packages (from exllamav2==0.1.8) (0.2.0)
Requirement already satisfied: pygments in c:\exllamav2\venv\lib\site-packages (from exllamav2==0.1.8) (2.18.0)
Requirement already satisfied: websockets in c:\exllamav2\venv\lib\site-packages (from exllamav2==0.1.8) (12.0)
Requirement already satisfied: regex in c:\exllamav2\venv\lib\site-packages (from exllamav2==0.1.8) (2024.7.24)
Requirement already satisfied: numpy in c:\exllamav2\venv\lib\site-packages (from exllamav2==0.1.8) (1.26.4)
Requirement already satisfied: rich in c:\exllamav2\venv\lib\site-packages (from exllamav2==0.1.8) (13.7.1)
Requirement already satisfied: filelock in c:\exllamav2\venv\lib\site-packages (from torch>=2.2.0->exllamav2==0.1.8) (3.13.1)
Requirement already satisfied: typing-extensions>=4.8.0 in c:\exllamav2\venv\lib\site-packages (from torch>=2.2.0->exllamav2==0.1.8) (4.9.0)
Requirement already satisfied: sympy in c:\exllamav2\venv\lib\site-packages (from torch>=2.2.0->exllamav2==0.1.8) (1.12)
Requirement already satisfied: networkx in c:\exllamav2\venv\lib\site-packages (from torch>=2.2.0->exllamav2==0.1.8) (3.2.1)
Requirement already satisfied: jinja2 in c:\exllamav2\venv\lib\site-packages (from torch>=2.2.0->exllamav2==0.1.8) (3.1.3)
Requirement already satisfied: fsspec in c:\exllamav2\venv\lib\site-packages (from torch>=2.2.0->exllamav2==0.1.8) (2024.2.0)
Requirement already satisfied: cramjam>=2.3 in c:\exllamav2\venv\lib\site-packages (from fastparquet->exllamav2==0.1.8) (2.8.3)
Requirement already satisfied: packaging in c:\exllamav2\venv\lib\site-packages (from fastparquet->exllamav2==0.1.8) (24.1)
Requirement already satisfied: python-dateutil>=2.8.2 in c:\exllamav2\venv\lib\site-packages (from pandas->exllamav2==0.1.8) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in c:\exllamav2\venv\lib\site-packages (from pandas->exllamav2==0.1.8) (2024.1)
Requirement already satisfied: tzdata>=2022.7 in c:\exllamav2\venv\lib\site-packages (from pandas->exllamav2==0.1.8) (2024.1)
Requirement already satisfied: markdown-it-py>=2.2.0 in c:\exllamav2\venv\lib\site-packages (from rich->exllamav2==0.1.8) (3.0.0)
Requirement already satisfied: mdurl~=0.1 in c:\exllamav2\venv\lib\site-packages (from markdown-it-py>=2.2.0->rich->exllamav2==0.1.8) (0.1.2)
Requirement already satisfied: six>=1.5 in c:\exllamav2\venv\lib\site-packages (from python-dateutil>=2.8.2->pandas->exllamav2==0.1.8) (1.16.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\exllamav2\venv\lib\site-packages (from jinja2->torch>=2.2.0->exllamav2==0.1.8) (2.1.5)
Requirement already satisfied: mpmath>=0.19 in c:\exllamav2\venv\lib\site-packages (from sympy->torch>=2.2.0->exllamav2==0.1.8) (1.3.0)
Building wheels for collected packages: exllamav2
Building wheel for exllamav2 (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [606 lines of output]
Version: 0.1.8
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching 'dni_*' found anywhere in distribution
C:\exllamav2\venv\lib\site-packages\setuptools\command\build_py.py:153: SetuptoolsDeprecationWarning: Installing 'exllamav2.exllamav2_ext' as data is deprecated, please list it in `packages`.
!!
############################
# Package would be ignored #
############################
Python recognizes 'exllamav2.exllamav2_ext' as an importable package, but it is not listed in the `packages` configuration of setuptools.
'exllamav2.exllamav2_ext' has been automatically added to the distribution only because it may contain data files, but this behavior is likely to change in future versions of setuptools (and therefore is considered deprecated).
Please make sure that 'exllamav2.exllamav2_ext' is included as a package by using the `packages` configuration field or the proper discovery methods (for example by using `find_namespace_packages(...)`/`find_namespace:` instead of `find_packages(...)`/`find:`).
You can read more about "package discovery" and "data files" on setuptools documentation page.
!!
check.warn(importable)
C:\exllamav2\venv\lib\site-packages\setuptools\command\build_py.py:153: SetuptoolsDeprecationWarning: Installing 'exllamav2.exllamav2_ext.cpp' as data is deprecated, please list it in `packages`.
!!
...(truncate)...
check.warn(importable)
C:\exllamav2\venv\lib\site-packages\setuptools\command\build_py.py:153: SetuptoolsDeprecationWarning: Installing 'exllamav2.tokenizer' as data is deprecated, please list it in `packages`.
!!
############################
# Package would be ignored #
############################
Python recognizes 'exllamav2.tokenizer' as an importable package, but it is not listed in the `packages` configuration of setuptools.
'exllamav2.tokenizer' has been automatically added to the distribution only because it may contain data files, but this behavior is likely to change in future versions of setuptools (and therefore is considered deprecated).
Please make sure that 'exllamav2.tokenizer' is included as a package by using the `packages` configuration field or the proper discovery methods (for example by using `find_namespace_packages(...)`/`find_namespace:` instead of `find_packages(...)`/`find:`).
You can read more about "package discovery" and "data files" on setuptools documentation page.
!!
check.warn(importable)
C:\exllamav2\venv\lib\site-packages\torch\utils\cpp_extension.py:1965: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Emitting ninja build file C:\exllamav2\build\temp.win-amd64-cpython-310\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/48] cl /showIncludes /nologo /O2 /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc -IC:\exllamav2\venv\lib\site-packages\torch\include -IC:\exllamav2\venv\lib\site-packages\torch\include\torch\csrc\api\include -IC:\exllamav2\venv\lib\site-packages\torch\include\TH -IC:\exllamav2\venv\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -IC:\exllamav2\venv\include -IC:\Python\Python310\include -IC:\Python\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c C:\exllamav2\exllamav2\exllamav2_ext\cpp\profiling.cpp /FoC:\exllamav2\build\temp.win-amd64-cpython-310\Release\exllamav2/exllamav2_ext/cpp/profiling.obj /Ox -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17
[2/48] cl /showIncludes /nologo /O2 /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc -IC:\exllamav2\venv\lib\site-packages\torch\include -IC:\exllamav2\venv\lib\site-packages\torch\include\torch\csrc\api\include -IC:\exllamav2\venv\lib\site-packages\torch\include\TH -IC:\exllamav2\venv\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -IC:\exllamav2\venv\include -IC:\Python\Python310\include -IC:\Python\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c C:\exllamav2\exllamav2\exllamav2_ext\cpp\sampling_avx2.cpp /FoC:\exllamav2\build\temp.win-amd64-cpython-310\Release\exllamav2/exllamav2_ext/cpp/sampling_avx2.obj /Ox -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(94): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(95): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(96): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(97): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(98): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(99): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(100): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(101): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(102): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(103): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(104): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(283): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(285): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(287): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(288): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(289): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(290): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(291): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(292): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(395): warning C4305: 'argument': truncation from 'double' to 'float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(396): warning C4305: 'argument': truncation from 'double' to 'float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(397): warning C4305: 'argument': truncation from 'double' to 'float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(398): warning C4305: 'argument': truncation from 'double' to 'float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(399): warning C4305: 'argument': truncation from 'double' to 'float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(400): warning C4305: 'argument': truncation from 'double' to 'float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(441): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(442): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(443): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(444): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(445): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(446): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(447): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\avx_mathfun.h(448): warning C4305: 'initializing': truncation from 'double' to 'const float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\sampling_avx2.cpp(31): warning C4305: 'initializing': truncation from 'double' to 'float'
[3/48] cl /showIncludes /nologo /O2 /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc -IC:\exllamav2\venv\lib\site-packages\torch\include -IC:\exllamav2\venv\lib\site-packages\torch\include\torch\csrc\api\include -IC:\exllamav2\venv\lib\site-packages\torch\include\TH -IC:\exllamav2\venv\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -IC:\exllamav2\venv\include -IC:\Python\Python310\include -IC:\Python\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c C:\exllamav2\exllamav2\exllamav2_ext\cpp\sampling.cpp /FoC:\exllamav2\build\temp.win-amd64-cpython-310\Release\exllamav2/exllamav2_ext/cpp/sampling.obj /Ox -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17
C:\exllamav2\exllamav2\exllamav2_ext\cpp\sampling.cpp(127): warning C4305: 'initializing': truncation from 'double' to 'float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\sampling.cpp(406): warning C4305: 'initializing': truncation from 'double' to 'float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\sampling.cpp(466): warning C4305: 'initializing': truncation from 'double' to 'float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\sampling.cpp(533): warning C4305: 'initializing': truncation from 'double' to 'float'
C:\exllamav2\exllamav2\exllamav2_ext\cpp\sampling.cpp(755): warning C4305: 'initializing': truncation from 'double' to 'float'
[4/48] cl /showIncludes /nologo /O2 /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc -IC:\exllamav2\venv\lib\site-packages\torch\include -IC:\exllamav2\venv\lib\site-packages\torch\include\torch\csrc\api\include -IC:\exllamav2\venv\lib\site-packages\torch\include\TH -IC:\exllamav2\venv\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -IC:\exllamav2\venv\include -IC:\Python\Python310\include -IC:\Python\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c C:\exllamav2\exllamav2\exllamav2_ext\cpp\generator.cpp /FoC:\exllamav2\build\temp.win-amd64-cpython-310\Release\exllamav2/exllamav2_ext/cpp/generator.obj /Ox -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17
h_add.cu
...(truncate)...
-D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -lineinfo -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\crt/host_config.h(153): fatal error C1189: #error: -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2022 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
...(truncate)...
/ext_bindings.obj /Ox -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=exllamav2_ext -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "C:\exllamav2\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2105, in _run_ninja_build
    subprocess.run(
  File "C:\Python\Python310\lib\subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "", line 2, in
  File "", line 34, in
  File "C:\exllamav2\setup.py", line 97, in
    setup(
  File "C:\exllamav2\venv\lib\site-packages\setuptools\__init__.py", line 87, in setup
    return distutils.core.setup(**attrs)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\core.py", line 177, in setup
    return run_commands(dist)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\core.py", line 193, in run_commands
    dist.run_commands()
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\dist.py", line 968, in run_commands
    self.run_command(cmd)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\dist.py", line 1217, in run_command
    super().run_command(command)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
    cmd_obj.run()
  File "C:\exllamav2\venv\lib\site-packages\wheel\_bdist_wheel.py", line 378, in run
    self.run_command("build")
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\cmd.py", line 317, in run_command
    self.distribution.run_command(command)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\dist.py", line 1217, in run_command
    super().run_command(command)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
    cmd_obj.run()
  File "C:\exllamav2\venv\lib\site-packages\setuptools\command\build.py", line 24, in run
    super().run()
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\command\build.py", line 131, in run
    self.run_command(cmd_name)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\cmd.py", line 317, in run_command
    self.distribution.run_command(command)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\dist.py", line 1217, in run_command
    super().run_command(command)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
    cmd_obj.run()
  File "C:\exllamav2\venv\lib\site-packages\setuptools\command\build_ext.py", line 79, in run
    _build_ext.run(self)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 339, in run
    self.build_extensions()
  File "C:\exllamav2\venv\lib\site-packages\torch\utils\cpp_extension.py", line 866, in build_extensions
    build_ext.build_extensions(self)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 459, in build_extensions
    self._build_extensions_serial()
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 485, in _build_extensions_serial
    self.build_extension(ext)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\command\build_ext.py", line 202, in build_extension
    _build_ext.build_extension(self, ext)
  File "C:\exllamav2\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 540, in build_extension
    objects = self.compiler.compile(
  File "C:\exllamav2\venv\lib\site-packages\torch\utils\cpp_extension.py", line 838, in win_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "C:\exllamav2\venv\lib\site-packages\torch\utils\cpp_extension.py", line 1785, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "C:\exllamav2\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2121, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for exllamav2
Running setup.py clean for exllamav2
Failed to build exllamav2
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (exllamav2)
```

System info:

turboderp commented 1 month ago

Microsoft broke compatibility at one point, and there's really nothing I can do about it from my end. You'll want to either update CUDA or downgrade Visual Studio.
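
To see which side of that mismatch a machine is on, the CUDA release can be read from the `nvcc --version` output and compared with the minimum named in the STL1002 message above (CUDA 12.4 for MSVC 14.41). This is a rough sketch; the function names are hypothetical, and the `(12, 4)` default is taken from that one error message rather than from an official compatibility table.

```python
import re


def cuda_release(nvcc_output):
    """Extract (major, minor) from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def meets_minimum(nvcc_output, minimum=(12, 4)):
    """True if the reported CUDA release is at least `minimum`."""
    release = cuda_release(nvcc_output)
    return release is not None and release >= minimum


if __name__ == "__main__":
    # Sample line in the format nvcc prints; replace with real output.
    sample = "Cuda compilation tools, release 12.4, V12.4.99"
    print(cuda_release(sample), meets_minimum(sample))
```

The logs in this thread show a build against the CUDA v12.1 include directory, which would fail this check and matches the advice to update CUDA or downgrade Visual Studio.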

JulienMaille commented 1 month ago

> Microsoft broke compatibility at one point, and there's really nothing I can do about it from my end. You'll want to either update CUDA or downgrade Visual Studio.

This is also my conclusion, but is it documented somewhere?