Open sjoerdmeijer opened 12 months ago
@llvm/issue-subscribers-backend-aarch64
Author: Sjoerd Meijer (sjoerdmeijer)
This can now be closed after #95819. Current codegen (https://godbolt.org/z/o6a8zM3E4):
.LBB0_2: // Parent Loop BB0_1 Depth=1
ldp q0, q1, [x10]
ldp q2, q3, [x8, #-32]
subs x9, x9, #16
fadd v0.4s, v2.4s, v0.4s
str q0, [x10, #64000]
fadd v0.4s, v3.4s, v1.4s
str q0, [x10, #64016]
ldp q0, q1, [x10, #32]
ldp q2, q3, [x8], #64
fadd v1.4s, v3.4s, v1.4s
fadd v0.4s, v2.4s, v0.4s
str q1, [x10, #64048]
str q0, [x10, #64032]
add x10, x10, #64
b.ne .LBB0_2
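For anyone reading the listing without the original input to hand, here is a rough scalar C sketch of what this inner loop appears to compute. It is inferred only from the assembly above: the `#64000` store offset is 16000 floats, and each trip through the loop handles 16 floats (`subs x9, x9, #16`). The names `add_shifted` and `OFFSET` are made up for illustration and are not taken from the issue's input.

```c
#include <stddef.h>

/* Assumption: 64000-byte store offset in the listing = 16000 floats. */
enum { OFFSET = 16000 };

/* Scalar equivalent of the vector loop as read from the assembly:
 * load a[i] and b[i], add them, store the sum OFFSET elements further
 * into a. The caller must provide at least n + OFFSET floats in a. */
void add_shifted(float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i + OFFSET] = a[i] + b[i];
}
```

With `n <= OFFSET` the loads and stores never overlap, which is consistent with the loop above being vectorised without runtime checks in the body.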
We are a long way behind GCC here. Compile this input with
-O3 -mcpu=neoverse-v2 -ffast-math:
Clang's codegen:
vs. GCC's codegen:
See also: https://godbolt.org/z/9zs65h3aq
Might be caused by the same underlying issue as: https://github.com/llvm/llvm-project/issues/71524