llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[X86] VNNI intrinsics argument types don't match the actual computation #97271

Open RKSimon opened 3 months ago

RKSimon commented 3 months ago

For example: `__m128i _mm_dpbusd_avx_epi32 (__m128i src, __m128i a, __m128i b)`

This takes one <4 x i32> accumulator ("src") and two <16 x i8> multiplication inputs ("a * b"), but the clang/llvm intrinsics declare all three operands as <4 x i32>:

```
TARGET_BUILTIN(__builtin_ia32_vpdpbusd128, "V4iV4iV4iV4i", "ncV:128:", "avx512vl,avx512vnni|avxvnni")

def int_x86_avx512_vpdpbusd_128 :
    ClangBuiltin<"__builtin_ia32_vpdpbusd128">,
    DefaultAttrsIntrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, llvm_v4i32_ty,
                           llvm_v4i32_ty], [IntrNoMem]>;
```

which means we require hardcoded mappings of the src/dst types for any combines that involve them.
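To make the mismatch concrete, here is a rough scalar model of the 128-bit, non-saturating operation in plain C. It is only a sketch: the struct types and the function names `dpbusd_ref` and `dpbusd_as_declared` are hypothetical and do not exist in clang or LLVM. It assumes the usual VPDPBUSD semantics, where the bytes of `a` are unsigned, the bytes of `b` are signed, and each group of four products is summed into the matching 32-bit lane of `src` with wraparound.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper types, for illustration only. */
typedef struct { int32_t lane[4];  } v4i32;  /* how the builtin types every operand */
typedef struct { uint8_t byte[16]; } v16u8;  /* how "a" is actually consumed */
typedef struct { int8_t  byte[16]; } v16i8;  /* how "b" is actually consumed */

/* Scalar model using the types the computation really works on. */
static v4i32 dpbusd_ref(v4i32 src, v16u8 a, v16i8 b) {
    v4i32 dst;
    for (int i = 0; i < 4; ++i) {
        int32_t sum = 0;
        /* Four unsigned-by-signed byte products per 32-bit lane. */
        for (int j = 0; j < 4; ++j)
            sum += (int32_t)a.byte[4 * i + j] * (int32_t)b.byte[4 * i + j];
        /* Accumulate with wraparound (no saturation in this variant). */
        dst.lane[i] = (int32_t)((uint32_t)src.lane[i] + (uint32_t)sum);
    }
    return dst;
}

/* Same operation with the operand types the builtin currently declares:
   "a" and "b" arrive as <4 x i32> and have to be reinterpreted as bytes. */
static v4i32 dpbusd_as_declared(v4i32 src, v4i32 a, v4i32 b) {
    v16u8 ua;
    v16i8 sb;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&sb, &b, sizeof sb);
    return dpbusd_ref(src, ua, sb);
}
```

The second function is the shape any combine has to assume today: the <4 x i32> operand types say nothing about the byte-wise multiply, so the reinterpretation has to be known out of band.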

llvmbot commented 3 months ago

@llvm/issue-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)
