[Isel Aarch64] extra instruction (i256) or 2 instructions (i320) when chaining icmp and select based on underflow #103855
Open
mratsim opened 2 months ago
Same IR as https://github.com/llvm/llvm-project/issues/103841 but applied to Aarch64 as an alternative to https://github.com/llvm/llvm-project/issues/103717
Unlike on x86, there is always an extra instruction, even for i256, and there are 2 unnecessary instructions for i320 or i384.
https://alive2.llvm.org/ce/z/-bGiUs
Full code
Original IR
After opt -O3
Assembly
Analysis
With i256, the `cmp` is useless in this sequence, as demonstrated by https://github.com/llvm/llvm-project/issues/103717
With i320, similar to x86 https://github.com/llvm/llvm-project/issues/103841, there is an additional `asr` instruction.
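
For readers without the Alive2 link at hand, here is a rough value-level model in Python of what "chaining icmp and select based on underflow" can look like. It is only a sketch and not the IR from this issue: the function name, the modular-subtraction use case, and the parameter `m` are assumptions made for illustration.

```python
def sub_select_on_underflow(a: int, b: int, m: int, bits: int = 256) -> int:
    """Hypothetical model of a wide-integer sub + icmp + select chain.

    All names here are illustrative assumptions, not taken from the issue.
    """
    mask = (1 << bits) - 1
    # Wrapping subtraction, comparable to `sub iN %a, %b`.
    diff = (a - b) & mask
    # Underflow (borrow) test, comparable to `icmp ult iN %a, %b`.
    underflow = a < b
    # Candidate correction, comparable to a wrapping `add iN %diff, %m`.
    corrected = (diff + m) & mask
    # Pick one of the two values, comparable to `select i1 %underflow, ...`.
    return corrected if underflow else diff
```

On a 64-bit target such a chain is lowered limb by limb, and the borrow from the final subtract-with-borrow already encodes the underflow condition, which may be why a separate `cmp` looks redundant in the i256 case. Passing `bits=320` or `bits=384` models the wider types mentioned above.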