Clang generates a lot of code for a loop that contains an if-then statement. The result uses predicated instructions, which do not seem to be necessary judging by GCC's codegen. For this kernel, s124 in TSVC, Clang is about 60% behind GCC.
Compile this input with -O3 -mcpu=neoverse-v2 -ffast-math:
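For reference, the s124 kernel in TSVC has roughly the shape sketched below. This is an approximation and may differ from the exact input behind the Godbolt link: the `real_t` typedef, the `LEN_1D` value, and the global array declarations are assumptions made so the snippet compiles on its own.

```c
/* Sketch of the TSVC s124 kernel; declarations are assumed, not taken
   from the original reproducer. */
#define LEN_1D 32000          /* assumed array length */
typedef float real_t;         /* assumed element type */

real_t a[LEN_1D], b[LEN_1D], c[LEN_1D], d[LEN_1D], e[LEN_1D];

void s124(void) {
    int j = -1;
    for (int i = 0; i < LEN_1D; i++) {
        /* Both branches increment j and store to a[j]; only the first
           addend (b[i] or c[i]) depends on the condition. */
        if (b[i] > (real_t)0.) {
            j++;
            a[j] = b[i] + d[i] * e[i];
        } else {
            j++;
            a[j] = c[i] + d[i] * e[i];
        }
    }
}
```

Since `j` advances by one on both paths, the store is effectively unconditional and the branch reduces to a select between `b[i]` and `c[i]`, which is presumably why predicated instructions look avoidable here.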
Clang's codegen:
vs. GCC's codegen:
See also: https://godbolt.org/z/nb6xYxxKo