ROCm / xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.
https://facebookresearch.github.io/xformers/

Allow PYTORCH_ROCM_ARCH to select GPU targets. #1

Closed sfantao closed 1 month ago

sfantao commented 7 months ago

Enable GPU target selection the same way PyTorch does it, through PYTORCH_ROCM_ARCH, and also add the option FORCE_ROCM. This allows the package to be built on machines without GPUs.
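As a rough illustration of the idea (not the code in this PR), a build script could turn PYTORCH_ROCM_ARCH into hipcc `--offload-arch` flags along these lines. The helper names `rocm_arch_flags` and `should_build_rocm`, the accepted separators, and the exact meaning of FORCE_ROCM=1 are assumptions made for this sketch; the `native` fallback mirrors the `--offload-arch=native` flag visible in the build log further down.

```python
import os
import re


def rocm_arch_flags(env=None, default="native"):
    """Hypothetical helper: map PYTORCH_ROCM_ARCH to --offload-arch flags."""
    env = os.environ if env is None else env
    raw = env.get("PYTORCH_ROCM_ARCH", "")
    # Assumption: targets may be separated by semicolons, commas, or spaces.
    archs = [a for a in re.split(r"[;,\s]+", raw.strip()) if a]
    if not archs:
        # Nothing requested: let the compiler detect the local GPU.
        archs = [default]
    return [f"--offload-arch={a}" for a in archs]


def should_build_rocm(env=None, gpu_detected=False):
    """Hypothetical helper: build HIP sources if a GPU is present
    or if FORCE_ROCM=1 is set (assumed semantics)."""
    env = os.environ if env is None else env
    return gpu_detected or env.get("FORCE_ROCM", "0") == "1"


if __name__ == "__main__":
    print(rocm_arch_flags({"PYTORCH_ROCM_ARCH": "gfx90a;gfx942"}))
    print(should_build_rocm({"FORCE_ROCM": "1"}))
```

With explicit targets there is no need for `native` detection, which is what would make a build on a GPU-less machine possible.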

sfantao commented 7 months ago

I see some linter errors, but I believe they do not originate from my changes. Please advise if I should adjust anything in my PR.

Looong01 commented 5 months ago

@tenpercent

I have a server with an AMD Radeon RX 7900 XTX and a 6700 XT, running Ubuntu 22.04 and ROCm 6.0.2.

I want to install xformers on this server, so I did the following steps:

1.  git clone https://github.com/ROCm/xformers
2.  cd xformers
3.  conda create -n xformers python=3.9
4.  conda activate xformers
5.  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.7
6.  pip install ./

Then it failed.

Could you please help with this?

I attached some screenshots that I think are important: [7 images]

Looong01 commented 5 months ago

The last part of the error message is:

      fatal error: too many errors emitted, stopping now [-ferror-limit=]
      20 errors generated when compiling for gfx1100.
      [21/156] /opt/rocm-6.0.2/bin/hipcc  -I/home/loong/Downloads/xformers/xformers/csrc -I/home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/example/91_tile_program/xformers_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/TH -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THC -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THH -I/opt/rocm-6.0.2/include -I/home/loong/miniconda3/envs/py39/include/python3.9 -c -c /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_64.hip -o /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_64.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -std=c++17 --offload-arch=native -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -DCK_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -Werror -Woverloaded-virtual -DBUILD_PYTHON_PACKAGE -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
      FAILED: /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_64.o
      /opt/rocm-6.0.2/bin/hipcc  -I/home/loong/Downloads/xformers/xformers/csrc -I/home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/example/91_tile_program/xformers_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/TH -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THC -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THH -I/opt/rocm-6.0.2/include -I/home/loong/miniconda3/envs/py39/include/python3.9 -c -c /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_64.hip -o /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_64.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -std=c++17 --offload-arch=native -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -DCK_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -Werror -Woverloaded-virtual -DBUILD_PYTHON_PACKAGE -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
      In file included from /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_64.hip:10:
      In file included from /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/ck_tiled_fmha_batched_forward_hip.h:23:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/block_tile_pipeline/block_fmha_pipeline_qr_ks_vs_hip.hpp:18:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_hip.hpp:10:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_attribute_mfma_hip.hpp:13:
      /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_attribute_mfma_impl_hip.hpp:116:17: error: '__builtin_amdgcn_mfma_f32_32x32x8bf16_1k' needs target feature mai-insts
              c_vec = __builtin_amdgcn_mfma_f32_32x32x8bf16_1k(a_vec, b_vec, c_vec, 0, 0, 0);
                      ^
      1 error generated when compiling for gfx1100.
      [22/156] /opt/rocm-6.0.2/bin/hipcc  -I/home/loong/Downloads/xformers/xformers/csrc -I/home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/example/91_tile_program/xformers_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/TH -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THC -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THH -I/opt/rocm-6.0.2/include -I/home/loong/miniconda3/envs/py39/include/python3.9 -c -c /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_32.hip -o /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_32.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -std=c++17 --offload-arch=native -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -DCK_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -Werror -Woverloaded-virtual -DBUILD_PYTHON_PACKAGE -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
      FAILED: /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_32.o
      /opt/rocm-6.0.2/bin/hipcc  -I/home/loong/Downloads/xformers/xformers/csrc -I/home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/example/91_tile_program/xformers_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/TH -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THC -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THH -I/opt/rocm-6.0.2/include -I/home/loong/miniconda3/envs/py39/include/python3.9 -c -c /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_32.hip -o /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_32.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -std=c++17 --offload-arch=native -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -DCK_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -Werror -Woverloaded-virtual -DBUILD_PYTHON_PACKAGE -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
      In file included from /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_32.hip:10:
      In file included from /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/ck_tiled_fmha_batched_forward_hip.h:23:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/block_tile_pipeline/block_fmha_pipeline_qr_ks_vs_hip.hpp:18:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_hip.hpp:10:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_attribute_mfma_hip.hpp:13:
      /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_attribute_mfma_impl_hip.hpp:116:17: error: '__builtin_amdgcn_mfma_f32_32x32x8bf16_1k' needs target feature mai-insts
              c_vec = __builtin_amdgcn_mfma_f32_32x32x8bf16_1k(a_vec, b_vec, c_vec, 0, 0, 0);
                      ^
      1 error generated when compiling for gfx1100.
      [23/156] /opt/rocm-6.0.2/bin/hipcc  -I/home/loong/Downloads/xformers/xformers/csrc -I/home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/example/91_tile_program/xformers_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/TH -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THC -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THH -I/opt/rocm-6.0.2/include -I/home/loong/miniconda3/envs/py39/include/python3.9 -c -c /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_128.hip -o /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_128.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -std=c++17 --offload-arch=native -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -DCK_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -Werror -Woverloaded-virtual -DBUILD_PYTHON_PACKAGE -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
      FAILED: /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_128.o
      /opt/rocm-6.0.2/bin/hipcc  -I/home/loong/Downloads/xformers/xformers/csrc -I/home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/example/91_tile_program/xformers_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/TH -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THC -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THH -I/opt/rocm-6.0.2/include -I/home/loong/miniconda3/envs/py39/include/python3.9 -c -c /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_128.hip -o /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_128.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -std=c++17 --offload-arch=native -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -DCK_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -Werror -Woverloaded-virtual -DBUILD_PYTHON_PACKAGE -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
      In file included from /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_no_attnbias_maxk_128.hip:10:
      In file included from /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/ck_tiled_fmha_batched_forward_hip.h:23:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/block_tile_pipeline/block_fmha_pipeline_qr_ks_vs_hip.hpp:18:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_hip.hpp:10:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_attribute_mfma_hip.hpp:13:
      /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_attribute_mfma_impl_hip.hpp:116:17: error: '__builtin_amdgcn_mfma_f32_32x32x8bf16_1k' needs target feature mai-insts
              c_vec = __builtin_amdgcn_mfma_f32_32x32x8bf16_1k(a_vec, b_vec, c_vec, 0, 0, 0);
                      ^
      1 error generated when compiling for gfx1100.
      [24/156] /opt/rocm-6.0.2/bin/hipcc  -I/home/loong/Downloads/xformers/xformers/csrc -I/home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/example/91_tile_program/xformers_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/TH -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THC -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THH -I/opt/rocm-6.0.2/include -I/home/loong/miniconda3/envs/py39/include/python3.9 -c -c /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_with_attnbias_maxk_128.hip -o /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_with_attnbias_maxk_128.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -std=c++17 --offload-arch=native -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -DCK_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -Werror -Woverloaded-virtual -DBUILD_PYTHON_PACKAGE -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
      FAILED: /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_with_attnbias_maxk_128.o
      /opt/rocm-6.0.2/bin/hipcc  -I/home/loong/Downloads/xformers/xformers/csrc -I/home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/example/91_tile_program/xformers_fmha -I/home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/TH -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THC -I/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/include/THH -I/opt/rocm-6.0.2/include -I/home/loong/miniconda3/envs/py39/include/python3.9 -c -c /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_with_attnbias_maxk_128.hip -o /home/loong/Downloads/xformers/build/temp.linux-x86_64-cpython-39/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_with_attnbias_maxk_128.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -std=c++17 --offload-arch=native -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -DCK_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -Werror -Woverloaded-virtual -DBUILD_PYTHON_PACKAGE -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
      In file included from /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/instances/ck_tiled_fmha_batched_forward_bp16_no_causalmask_with_attnbias_maxk_128.hip:10:
      In file included from /home/loong/Downloads/xformers/xformers/csrc/attention/hip_fmha/ck_tiled_fmha_batched_forward_hip.h:23:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/block_tile_pipeline/block_fmha_pipeline_qr_ks_vs_hip.hpp:18:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_hip.hpp:10:
      In file included from /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_attribute_mfma_hip.hpp:13:
      /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_attribute_mfma_impl_hip.hpp:116:17: error: '__builtin_amdgcn_mfma_f32_32x32x8bf16_1k' needs target feature mai-insts
              c_vec = __builtin_amdgcn_mfma_f32_32x32x8bf16_1k(a_vec, b_vec, c_vec, 0, 0, 0);
                      ^
      1 error generated when compiling for gfx1100.
      ninja: build stopped: subcommand failed.
      Traceback (most recent call last):
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 2096, in _run_ninja_build
          subprocess.run(
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/subprocess.py", line 528, in run
          raise CalledProcessError(retcode, process.args,
      subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

      The above exception was the direct cause of the following exception:

      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/home/loong/Downloads/xformers/setup.py", line 485, in <module>
          setuptools.setup(
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/__init__.py", line 103, in setup
          return distutils.core.setup(**attrs)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 185, in setup
          return run_commands(dist)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
          dist.run_commands()
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/wheel/bdist_wheel.py", line 364, in run
          self.run_command("build")
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/command/build.py", line 131, in run
          self.run_command(cmd_name)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 88, in run
          _build_ext.run(self)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
          self.build_extensions()
        File "/home/loong/Downloads/xformers/setup.py", line 442, in build_extensions
          super().build_extensions()
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 871, in build_extensions
          build_ext.build_extensions(self)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
          self._build_extensions_serial()
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
          self.build_extension(ext)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 249, in build_extension
          _build_ext.build_extension(self, ext)
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
          objects = self.compiler.compile(
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 684, in unix_wrap_ninja_compile
          _write_ninja_file_and_compile_objects(
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
          _run_ninja_build(
        File "/home/loong/miniconda3/envs/py39/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 2112, in _run_ninja_build
          raise RuntimeError(message) from e
      RuntimeError: Error compiling objects for extension
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for xformers
  Running setup.py clean for xformers
Failed to build xformers
ERROR: Could not build wheels for xformers, which is required to install pyproject.toml-based projects

Looong01 commented 5 months ago

[7 images] @tenpercent @qianfengz

tenpercent commented 5 months ago

      /home/loong/Downloads/xformers/third_party/composable_kernel_tiled/include/ck/tile_program/warp_tile/warp_gemm_attribute_mfma_impl_hip.hpp:116:17: error: '__builtin_amdgcn_mfma_f32_32x32x8bf16_1k' needs target feature mai-insts
              c_vec = __builtin_amdgcn_mfma_f32_32x32x8bf16_1k(a_vec, b_vec, c_vec, 0, 0, 0);
                      ^
      1 error generated when compiling for gfx1100.
      ninja: build stopped: subcommand failed.

Hi @Looong01! Thanks for reporting this! For the time being, you need an MI-series GPU for the C++ extension to compile, because it uses compiler intrinsics for matrix multiplication that are not valid on Navi-series GPUs. cc @zjing14 @carlushuang
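To make the constraint concrete, here is a minimal pre-flight check one could run before starting a long build. It is only a sketch: the set `MFMA_ARCHS` is an assumed, non-exhaustive list of gfx targets that provide MFMA instructions (the `mai-insts` feature named in the error), and `unsupported_archs` is a hypothetical helper, not part of xformers or PyTorch.

```python
# Assumed, non-exhaustive set of MI-series targets with MFMA ("mai-insts").
MFMA_ARCHS = {"gfx908", "gfx90a", "gfx940", "gfx941", "gfx942"}


def unsupported_archs(archs):
    """Return the requested targets that are not in MFMA_ARCHS."""
    result = []
    for arch in archs:
        # Drop feature suffixes such as ":xnack-" before comparing.
        base = arch.split(":")[0]
        if base not in MFMA_ARCHS:
            result.append(arch)
    return result


if __name__ == "__main__":
    # gfx1100 is the target named in the error above.
    bad = unsupported_archs(["gfx90a", "gfx1100"])
    if bad:
        print("These targets lack MFMA and would fail to compile:", bad)
```

Failing early like this is cheaper than letting hipcc get through part of the 156 compile steps before hitting the intrinsic error.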

tenpercent commented 1 month ago

We now have a way to override the architecture list in setup.py, so let's close this.