llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[AArch64] The option -alu-lsl-fast doesn't split the x * 3 into a shift for vectors #94572

Open vfdff opened 3 months ago

vfdff commented 3 months ago
llvmbot commented 3 months ago

@llvm/issue-subscribers-backend-aarch64

Author: Allen (vfdff)

* test: https://gcc.godbolt.org/z/jW43j3bfn
```
void foo (int * __restrict a, int * b, int N) {
  for (int i = 0; i < N; ++i) {
    a[4*i + 0] = b[4*i + 0] * 3;
    a[4*i + 1] = b[4*i + 1] + 3;
    a[4*i + 2] = (b[4*i + 2] * 3 + 3);
    a[4*i + 3] = b[4*i + 3] * 3;
  }
}
```
* gcc uses **shl** to replace the `x * 3` (the kernel body of the gcc output follows), while clang doesn't, even with **-Xclang -target-feature -Xclang +alu-lsl-fast**
```
.L4:
        ld4     {v28.4s - v31.4s}, [x3], 64
        add     v0.4s, v24.4s, v30.4s
        shl     v26.4s, v28.4s, 1
        add     v27.4s, v25.4s, v29.4s
        shl     v29.4s, v31.4s, 1
        add     v26.4s, v26.4s, v28.4s
        shl     v28.4s, v0.4s, 1
        add     v29.4s, v29.4s, v31.4s
        add     v28.4s, v28.4s, v0.4s
        st4     {v26.4s - v29.4s}, [x4], 64
        cmp     x5, x3
        bne     .L4
        and     w7, w2, -4
        cmp     w2, w7
        beq     .L1
```
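
For reference, the per-lane rewrite visible in the gcc output can be sketched in portable C. This is only an illustration, not code from either compiler: the names `mul3_shift_add`, `foo_mul`, `foo_shift_add` and `check_loops_agree` are hypothetical, and the lanes are `uint32_t` rather than the `int` of the test case (an assumption made so that the left shift is well defined for every input). It also assumes that the `add v0.4s, v24.4s, v30.4s` in the gcc loop adds a splat of 1, so that lane 2 is computed as `(x + 1) * 3`.

```c
#include <assert.h>
#include <stdint.h>

/* x * 3 rewritten as (x << 1) + x: the shl + add pair gcc emits per lane.
   Unsigned arithmetic wraps modulo 2^32, so this matches x * 3u exactly. */
static uint32_t mul3_shift_add(uint32_t x) {
    return (x << 1) + x;
}

/* The loop from the test case, on unsigned lanes, using a plain multiply. */
static void foo_mul(uint32_t *__restrict a, const uint32_t *b, int n) {
    for (int i = 0; i < n; ++i) {
        a[4*i + 0] = b[4*i + 0] * 3u;
        a[4*i + 1] = b[4*i + 1] + 3u;
        a[4*i + 2] = b[4*i + 2] * 3u + 3u;
        a[4*i + 3] = b[4*i + 3] * 3u;
    }
}

/* The same loop using shift + add. Lane 2 uses x * 3 + 3 == (x + 1) * 3,
   i.e. an add followed by the shl + add pair. */
static void foo_shift_add(uint32_t *__restrict a, const uint32_t *b, int n) {
    for (int i = 0; i < n; ++i) {
        a[4*i + 0] = mul3_shift_add(b[4*i + 0]);
        a[4*i + 1] = b[4*i + 1] + 3u;
        a[4*i + 2] = mul3_shift_add(b[4*i + 2] + 1u);
        a[4*i + 3] = mul3_shift_add(b[4*i + 3]);
    }
}

/* Runs both loops on a small sample and asserts lane-by-lane equality.
   Returns the number of lanes compared. */
static int check_loops_agree(void) {
    enum { N = 4, LANES = 4 * N };
    uint32_t b[LANES], a_mul[LANES], a_shl[LANES];
    for (int i = 0; i < LANES; ++i)
        b[i] = (uint32_t)i * 0x10000003u;  /* includes values whose product wraps */
    foo_mul(a_mul, b, N);
    foo_shift_add(a_shl, b, N);
    for (int i = 0; i < LANES; ++i)
        assert(a_mul[i] == a_shl[i]);
    return LANES;
}
```

Calling `check_loops_agree()` from a `main` exercises both variants on the same input. Whether the shift form is actually faster than a vector `mul` depends on the target core, which is the question the `alu-lsl-fast` feature is meant to answer.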
vfdff commented 3 months ago

It works fine with scalar types: https://gcc.godbolt.org/z/M1x5MeP5b