Open sjoerdmeijer opened 8 months ago
@llvm/issue-subscribers-backend-aarch64
Author: Sjoerd Meijer (sjoerdmeijer)
Fixing this would require interchanging the loops so that the memory accesses in the inner loop are consecutive.
(This is also independent of AArch64.)
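To make the loop-interchange point concrete, here is a minimal sketch. The kernel source did not survive in this thread, so the loop body below is an assumption modeled on the TSVC s2275 loop nest; the function names and the size N are hypothetical. The first function walks a row-major 2D array column-wise in its inner loop (stride of N floats), the second shows the interchanged form with unit-stride inner accesses. The interchange is only valid here because the 1D statement can be split into its own loop, which assumes the arrays do not alias.

```c
#include <stddef.h>

#define N 64 /* hypothetical size; TSVC defines its own 2D length */

/* Assumed shape of the kernel: the inner loop varies j, the ROW index,
 * so consecutive iterations touch aa[j][i] with a stride of N floats. */
void s2275_like(float aa[N][N], float bb[N][N], float cc[N][N],
                float a[N], const float b[N], const float c[N],
                const float d[N])
{
    for (size_t i = 0; i < N; i++) {
        for (size_t j = 0; j < N; j++)
            aa[j][i] = aa[j][i] + bb[j][i] * cc[j][i];
        a[i] = b[i] + c[i] * d[i];
    }
}

/* Interchanged by hand: the inner loop now varies i, the COLUMN index,
 * so accesses are consecutive in memory. The 1D statement is moved into
 * a separate loop (loop distribution), assuming no aliasing. */
void s2275_like_interchanged(float aa[N][N], float bb[N][N], float cc[N][N],
                             float a[N], const float b[N], const float c[N],
                             const float d[N])
{
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            aa[j][i] = aa[j][i] + bb[j][i] * cc[j][i];

    for (size_t i = 0; i < N; i++)
        a[i] = b[i] + c[i] * d[i];
}
```

Each element of aa is updated exactly once in both versions, so the two functions should produce identical results; only the traversal order differs.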
Clang is far behind GCC 12 (by about 300%) on kernel s2275 in TSVC.
Compile this input with -O3 -mcpu=neoverse-v2 -ffast-math:
Clang's codegen:
vs. GCC's codegen:
See also: https://godbolt.org/z/8E3fexn5o
TODO: Root cause analysis.