aarch64 (SVE): svdup_s8 followed by svinsr of the same value could be optimized

llvm / llvm-project

While looking into some GCC auto-vectorization code generation (https://gcc.gnu.org/PR116075), I noticed that neither GCC nor LLVM optimizes this testcase: https://godbolt.org/z/7KTbhMxcs

#include <arm_sve.h>

svint8_t f(void)
{
  svint8_t tt;
  tt = svdup_s8 (0);
  tt = svinsr (tt, 0);
  return tt;
}

The fix I have in mind for the GCC auto-vectorization issue will also handle the above testcase, so I thought I would file it here as well. Note that the value 0 does not need to be a constant, but the values passed to svdup and svinsr need to be the same. In that case the svinsr is redundant and the whole sequence could be reduced to the svdup alone.
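To illustrate why the fold should be valid, here is a minimal scalar model in portable C. It is only a sketch, not how either compiler would implement the optimization. The names `model_vec`, `model_dup`, `model_insr`, `insr_of_dup_is_dup` and `self_check` are hypothetical, and the fixed count of 16 lanes is an assumption (real SVE vector lengths are implementation-defined). The model treats svinsr as shifting every lane up by one and writing the scalar into lane 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 16 lanes models a 128-bit vector of int8_t elements.
   Real SVE vector lengths are implementation-defined. */
#define LANES 16

/* Hypothetical stand-in for svint8_t. */
typedef struct { int8_t lane[LANES]; } model_vec;

/* Scalar model of svdup_s8: every lane holds x. */
model_vec model_dup(int8_t x)
{
  model_vec v;
  for (int i = 0; i < LANES; i++)
    v.lane[i] = x;
  return v;
}

/* Scalar model of svinsr: shift lanes up by one, then write x to lane 0.
   The old top lane is dropped. */
model_vec model_insr(model_vec v, int8_t x)
{
  for (int i = LANES - 1; i > 0; i--)
    v.lane[i] = v.lane[i - 1];
  v.lane[0] = x;
  return v;
}

/* Returns 1 when insr(dup(a), b) equals dup(a), and 0 otherwise. */
int insr_of_dup_is_dup(int8_t a, int8_t b)
{
  model_vec lhs = model_insr(model_dup(a), b);
  model_vec rhs = model_dup(a);
  return memcmp(lhs.lane, rhs.lane, sizeof lhs.lane) == 0;
}

/* Checks the claim for a few values, including non-zero ones. */
void self_check(void)
{
  assert(insr_of_dup_is_dup(0, 0));    /* the testcase above */
  assert(insr_of_dup_is_dup(-7, -7));  /* same value, not zero */
  assert(!insr_of_dup_is_dup(1, 2));   /* different values: no fold */
}
```

In this model the shift only moves copies of the same value, and the inserted lane matches them, so the result is identical to the original splat whenever both arguments are equal. When the two values differ, lane 0 differs from the rest and the fold does not apply.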

aarch64 (SVE): svdup_s8 followed by svinsr of the same value could be optimized #100497