Closed linshuijin closed 1 year ago
@linshuijin What are the GPU model and CUDA version of your build environment? Try TORCH_CUDA_ARCH_LIST="8.0+PTX" pip install .
The GPU is an A100. NVIDIA-SMI 525.85.05, Driver Version: 525.85.05, CUDA Version: 12.0
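For readers who are unsure which value to use, here is a minimal sketch of deriving the TORCH_CUDA_ARCH_LIST entry from the GPU's compute capability. The helper names `arch_list_entry` and `detect_capability` are hypothetical and not part of EETQ. The A100 has compute capability 8.0, which is where "8.0+PTX" comes from. Detection relies on `torch.cuda.get_device_capability`, so it only works where PyTorch and a visible GPU are present; otherwise the sketch falls back to the A100 value as an assumption.

```python
from typing import Optional, Tuple


def arch_list_entry(capability: Tuple[int, int], with_ptx: bool = True) -> str:
    """Format a (major, minor) compute capability as a TORCH_CUDA_ARCH_LIST entry."""
    major, minor = capability
    entry = f"{major}.{minor}"
    return entry + "+PTX" if with_ptx else entry


def detect_capability() -> Optional[Tuple[int, int]]:
    """Return the compute capability of GPU 0, or None if it cannot be queried."""
    try:
        import torch  # optional: only needed for auto-detection
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    # Assumption: fall back to (8, 0), the A100 value, when no GPU is visible.
    capability = detect_capability() or (8, 0)
    print(f'TORCH_CUDA_ARCH_LIST="{arch_list_entry(capability)}"')
```

The printed line can be pasted in front of the install command.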
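@linshuijin Thanks for testing. The error was caused by compiling for multiple architectures; with the architecture pinned via TORCH_CUDA_ARCH_LIST="8.0+PTX" the build passes. I will add this to the README.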
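To make the "multiple architectures" point concrete, here is a simplified sketch of how a TORCH_CUDA_ARCH_LIST value turns into nvcc -gencode flags. It is an illustrative re-implementation, not PyTorch's actual code, and `gencode_flags` is a hypothetical name. Without the variable, the log in this issue shows targets from compute_52 up to sm_90. One plausible reading, not confirmed in this thread, is that the `__hadd2` error comes from the compute_52 target, since half-precision arithmetic intrinsics need a newer architecture than that.

```python
from typing import List


def gencode_flags(arch_list: str) -> List[str]:
    """Expand entries such as "8.0+PTX" into nvcc -gencode flags (simplified)."""
    flags = set()
    # Entries may be separated by spaces or semicolons.
    for entry in arch_list.replace(";", " ").split():
        ptx = entry.endswith("+PTX")
        version = entry[: -len("+PTX")] if ptx else entry
        major, minor = version.split(".")
        num = major + minor  # "8.0" -> "80"
        # Native machine code (SASS) for this exact architecture.
        flags.add(f"-gencode=arch=compute_{num},code=sm_{num}")
        if ptx:
            # PTX that newer GPUs can JIT-compile at load time.
            flags.add(f"-gencode=arch=compute_{num},code=compute_{num}")
    return sorted(flags)


if __name__ == "__main__":
    for flag in gencode_flags("8.0+PTX"):
        print(flag)
```

With "8.0+PTX" only two flags are emitted, both for compute_80, so nvcc never compiles the kernels for the older targets.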
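TORCH_CUDA_ARCH_LIST="8.0+PTX"
Yes, without specifying the architecture the build rarely succeeds. Here is the command that built successfully for me, for reference: MAX_JOBS=4 TORCH_CUDA_ARCH_LIST="8.0+PTX" python setup.py install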
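If you would rather drive the same build from a script, for example in CI, the command above can be sketched as follows. The function names are hypothetical; the two environment variables and `python setup.py install` are taken from this thread. The sketch defaults to a dry run so that nothing is built unless you ask for it.

```python
import os
import subprocess
import sys
from typing import Dict, List, Optional


def build_env(arch_list: str = "8.0+PTX", max_jobs: int = 4,
              base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and pin the CUDA architecture and ninja job count."""
    env = dict(os.environ if base is None else base)
    env["TORCH_CUDA_ARCH_LIST"] = arch_list
    env["MAX_JOBS"] = str(max_jobs)
    return env


def build_command() -> List[str]:
    """The install command from the thread, using the current interpreter."""
    return [sys.executable, "setup.py", "install"]


def run_build(source_dir: str, dry_run: bool = True) -> int:
    """Run the build in source_dir, or only print what would run when dry_run is set."""
    cmd = build_command()
    env = build_env()
    if dry_run:
        print("MAX_JOBS=%s TORCH_CUDA_ARCH_LIST=%s %s"
              % (env["MAX_JOBS"], env["TORCH_CUDA_ARCH_LIST"], " ".join(cmd)))
        return 0
    return subprocess.run(cmd, cwd=source_dir, env=env).returncode


if __name__ == "__main__":
    # Assumption: the current directory is the EETQ checkout.
    run_build(".", dry_run=True)
```

Lowering `max_jobs` limits how many compiler processes ninja starts at once, which can help on machines with limited memory.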
Installing under docker by following the documentation steps produces the following error:
creating /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/utils
creating /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/embedding_kernels
creating /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/layernorm_kernels
Emitting ninja build file /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/9] c++ -MMD -MF /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_heuristic.o.d -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_heuristic.cc -o
/home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_heuristic.o -g -std=c++17 -O3 -fopenmp -lgomp -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0 FAILED: /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_heuristic.o c++ -MMD -MF /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_heuristic.o.d -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_heuristic.cc -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_heuristic.o -g 
-std=c++17 -O3 -fopenmp -lgomp -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0
/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_heuristic.cc:18:10: fatal error: cutlass/gemm/gemm.h: No such file or directory
   18 | #include "cutlass/gemm/gemm.h"
      |          ^~~~
compilation terminated.
[2/9] c++ -MMD -MF /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_preprocessors.o.d -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_preprocessors.cc -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_preprocessors.o -g -std=c++17 -O3 -fopenmp -lgomp -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0
FAILED: /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_preprocessors.o
c++ -MMD -MF
/home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_preprocessors.o.d -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_preprocessors.cc -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_preprocessors.o -g -std=c++17 -O3 -fopenmp -lgomp -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0 In file included from /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/cutlass_preprocessors.cc:18: /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include/cutlass_extensions/gemm/kernel/mixed_gemm_B_layout.h:12:10: fatal error: cutlass/layout/matrix.h: No such file 
or directory 12 | #include "cutlass/layout/matrix.h" | ^~~~~~~~~ compilation terminated. [3/9] /usr/local/cuda/bin/nvcc -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm.cu -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm.o -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -UCUDA_NO_HALF_OPERATORS -UCUDA_NO_HALF_CONVERSIONS -UCUDA_NO_HALF2_OPERATORS -UCUDA_NO_HALF2_CONVERSIONS -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 
-gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 FAILED: /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm.o /usr/local/cuda/bin/nvcc -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm.cu -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm.o -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -UCUDA_NO_HALF_OPERATORS -UCUDA_NO_HALF_CONVERSIONS -UCUDA_NO_HALF2_OPERATORS -UCUDA_NO_HALF2_CONVERSIONS -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 
-gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 In file included from /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm.cu:1: /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm.h:7:10: fatal error: cutlass/numeric_types.h: No such file or directory 7 | #include "cutlass/numeric_types.h" | ^~~~~~~~~ compilation terminated. [4/9] c++ -MMD -MF /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/utils/cuda_utils.o.d -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/utils/cuda_utils.cc -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/utils/cuda_utils.o -g -std=c++17 -O3 -fopenmp -lgomp 
-DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0 [5/9] c++ -MMD -MF /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/utils/logger.o.d -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/utils/logger.cc -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/utils/logger.o -g -std=c++17 -O3 -fopenmp -lgomp -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0 [6/9] /usr/local/cuda/bin/nvcc -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', 
'"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm_wrapper.cu -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm_wrapper.o -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -UCUDA_NO_HALF_OPERATORS -UCUDA_NO_HALF_CONVERSIONS -UCUDA_NO_HALF2_OPERATORS -UCUDA_NO_HALF2_CONVERSIONS -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 FAILED: /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm_wrapper.o 
/usr/local/cuda/bin/nvcc -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm_wrapper.cu -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm_wrapper.o -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -UCUDA_NO_HALF_OPERATORS -UCUDA_NO_HALF_CONVERSIONS -UCUDA_NO_HALF2_OPERATORS -UCUDA_NO_HALF2_CONVERSIONS -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 In file included from 
/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm_wrapper.cu:7: /home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/fpA_intB_gemm.h:7:10: fatal error: cutlass/numeric_types.h: No such file or directory 7 | #include "cutlass/numeric_types.h" | ^~~~~~~~~ compilation terminated. [7/9] c++ -MMD -MF /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/eetpy.o.d -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/eetpy.cpp -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/eetpy.o -g -std=c++17 -O3 -fopenmp -lgomp -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0 [8/9] /usr/local/cuda/bin/nvcc -DVERSION_INFO=1.0.0-beta.0 
'-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/layernorm_kernels/layernorm.cu -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/layernorm_kernels/layernorm.o -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -UCUDA_NO_HALF_OPERATORS -UCUDA_NO_HALF_CONVERSIONS -UCUDA_NO_HALF2_OPERATORS -UCUDA_NO_HALF2_CONVERSIONS -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 FAILED: 
/home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/layernorm_kernels/layernorm.o /usr/local/cuda/bin/nvcc -DVERSION_INFO=1.0.0-beta.0 '-I['"'"'/usr/local/lib/python3.8/dist-packages/torch/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/TH'"'"', '"'"'/usr/local/lib/python3.8/dist-packages/torch/include/THC'"'"', '"'"'/usr/local/cuda/include'"'"']' -I/home/shuijin.lin/w8a16/EETQ/csrc -I/home/shuijin.lin/w8a16/EETQ/csrc/utils -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_kernels/include -I/home/shuijin.lin/w8a16/EETQ/csrc/cutlass_extensions/include -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.8 -c -c /home/shuijin.lin/w8a16/EETQ/csrc/layernorm_kernels/layernorm.cu -o /home/shuijin.lin/w8a16/EETQ/build/temp.linux-x86_64-3.8/home/shuijin.lin/w8a16/EETQ/csrc/layernorm_kernels/layernorm.o -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -UCUDA_NO_HALF_OPERATORS -UCUDA_NO_HALF_CONVERSIONS -UCUDA_NO_HALF2_OPERATORS -UCUDA_NO_HALF2_CONVERSIONS -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EETQ -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 
-gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90
/home/shuijin.lin/w8a16/EETQ/csrc/layernorm_kernels/reduction.cuh(30): error: identifier "__hadd2" is undefined
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for EETQ
Running setup.py clean for EETQ
Failed to build EETQ
ERROR: Could not build wheels for EETQ, which is required to install pyproject.toml-based projects
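Besides the `__hadd2` error, the log above also reports several missing `cutlass/...` headers even though `csrc/cutlass/include` is on the include path. A rough sketch for pulling those header names out of a saved log and checking whether any include directory actually contains them is shown below. The function names are hypothetical. An empty `csrc/cutlass` directory (for example a git submodule that was never checked out) is only one possible explanation and is not confirmed in this thread.

```python
import re
from pathlib import Path
from typing import Iterable, List

# \s+ between words tolerates logs whose lines were re-wrapped mid-message.
MISSING_HEADER = re.compile(
    r"fatal error:\s+(\S+):\s+No\s+such\s+file\s+or\s+directory")


def missing_headers(log_text: str) -> List[str]:
    """Return the distinct headers reported as missing, in first-seen order."""
    seen: List[str] = []
    for name in MISSING_HEADER.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def locate(header: str, include_dirs: Iterable[str]) -> List[str]:
    """Return the include directories that actually contain the header."""
    return [d for d in include_dirs if (Path(d) / header).is_file()]


if __name__ == "__main__":
    sample = (
        "cutlass_heuristic.cc:18:10: fatal error: cutlass/gemm/gemm.h: "
        "No such file or directory\n"
        "fpA_intB_gemm.h:7:10: fatal error: cutlass/numeric_types.h: "
        "No such file or directory\n"
    )
    for header in missing_headers(sample):
        # Assumption: run from the EETQ checkout, so this relative path matches the log.
        found = locate(header, ["csrc/cutlass/include"])
        print(header, "->", found or "not found")
```

If every header comes back as not found, the include directory itself is worth inspecting before retrying the build.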