Open davidbolvansky opened 3 years ago
Is this a report of a missed opportunity or is this an unexpected behavior (what's not supposed to happen)?
Why do we want to avoid __lshrti3 for lshr if it's cold?
For ashr, the code without the libcall is actually smaller.
    d():                                    # @d()
            push    rax
            movsxd  rdi, dword ptr [rip + a]
            mov     rsi, rdi
            sar     rsi, 63
            movzx   edx, byte ptr [rip + b]
            call    __ashlti3
            mov     dword ptr [rip + c], eax
            pop     rax
            ret

    d():                                    # @d()
            movsxd  rax, dword ptr [rip + a]
            mov     cl, byte ptr [rip + b]
            shl     rax, cl
            xor     edx, edx
            test    cl, 64
            cmove   rdx, rax
            mov     dword ptr [rip + c], edx
            ret
Extended Description
    int a, b, c;
    __attribute__((cold)) void d() { c = (unsigned __int128)a >> b; }
Clang -O3:

    d():                                    # @d()
            push    rax
            movsxd  rdi, dword ptr [rip + a]
            mov     rsi, rdi
            sar     rsi, 63
            movzx   edx, byte ptr [rip + b]
            call    __lshrti3
            mov     dword ptr [rip + c], eax
            pop     rax
            ret
    a:
            .long   0                       # 0x0
    b:
            .long   0                       # 0x0
    c:
            .long   0                       # 0x0
While Clang 11/GCC generates:

    d():                                    # @d()
            movsxd  rax, dword ptr [rip + a]
            mov     rdx, rax
            sar     rdx, 63
            mov     cl, byte ptr [rip + b]
            shrd    rax, rdx, cl
            shr     rdx, cl
            test    cl, 64
            cmove   rdx, rax
            mov     dword ptr [rip + c], edx
            ret
    a:
            .long   0                       # 0x0
    b:
            .long   0                       # 0x0
    c:
            .long   0                       # 0x0
https://godbolt.org/z/Kqf3EW
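
For anyone reading the inline sequence, here is a rough C sketch of what a 128-bit logical shift right computes when the value is split into two 64-bit halves. It is only an illustration of the shrd / shr / test-bit-6 pattern, not the code LLVM or compiler-rt actually uses. The names `u128_parts` and `lshr128` are made up for this example, and shift counts are assumed to be in the range 0..127.

```c
#include <stdint.h>

/* Hypothetical type for illustration: a 128-bit value as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_parts;

/* Hypothetical helper (not a compiler-rt function): logical shift right of a
   128-bit value by n, assuming 0 <= n <= 127. */
static u128_parts lshr128(u128_parts v, unsigned n) {
    /* 64-bit shrd/shr only look at the low 6 bits of the count. */
    unsigned k = n & 63;
    u128_parts r;

    /* Like shrd: shift the low half right and fill the vacated top bits from
       the high half. k == 0 is handled separately because shifting a 64-bit
       value by 64 is undefined in C. */
    r.lo = k ? (v.lo >> k) | (v.hi << (64 - k)) : v.lo;

    /* Like shr on the high half. */
    r.hi = v.hi >> k;

    /* Like test cl, 64 followed by a conditional move: if bit 6 of the count
       is set, the shift is by 64 or more, so the shifted high half becomes the
       low half and the high half is zero. */
    if (n & 64) {
        r.lo = r.hi;
        r.hi = 0;
    }
    return r;
}
```

In the function from the report only the low 32 bits of the result are stored to `c`, which is presumably why the inline version above gets away with a single cmov and never materializes the high half.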