llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Avoid generation of __lshrti3 libcall for cold code #47241

Open davidbolvansky opened 3 years ago

davidbolvansky commented 3 years ago
Bugzilla Link 47897
Version trunk
OS Linux
CC @topperc,@RKSimon,@phoebewang,@rotateright,@hjyamauchi

Extended Description

int a, b, c;
__attribute__((cold)) void d() { c = (unsigned __int128)a >> b; }

Clang -O3:

d():                                    # @d()
        push    rax
        movsxd  rdi, dword ptr [rip + a]
        mov     rsi, rdi
        sar     rsi, 63
        movzx   edx, byte ptr [rip + b]
        call    __lshrti3
        mov     dword ptr [rip + c], eax
        pop     rax
        ret
a:
        .long   0                       # 0x0
b:
        .long   0                       # 0x0
c:
        .long   0                       # 0x0

While Clang 11/GCC generates:

d():                                    # @d()
        movsxd  rax, dword ptr [rip + a]
        mov     rdx, rax
        sar     rdx, 63
        mov     cl, byte ptr [rip + b]
        shrd    rax, rdx, cl
        shr     rdx, cl
        test    cl, 64
        cmove   rdx, rax
        mov     dword ptr [rip + c], edx
        ret
a:
        .long   0                       # 0x0
b:
        .long   0                       # 0x0
c:
        .long   0                       # 0x0

https://godbolt.org/z/Kqf3EW
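For readers less familiar with the inline sequence, here is a minimal C sketch of what the shrd / shr / test / cmove instructions compute: the low 64 bits of a 128-bit logical right shift held in two 64-bit halves. The helper names `lshr128_lo` and `lshr128_ref` are hypothetical and used only for illustration; the sketch assumes a compiler with `__int128` support (GCC or Clang on x86-64) and a shift count below 128.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: low 64 bits of ((hi:lo) >> n), written to mirror
   the inline x86-64 sequence rather than any LLVM internals. */
static uint64_t lshr128_lo(uint64_t lo, uint64_t hi, unsigned n) {
    assert(n < 128);          /* larger counts are undefined for __int128 */
    unsigned k = n & 63;      /* x86 masks 64-bit shift counts to 6 bits */

    /* shrd rax, rdx, cl: shift lo right by k, filling from hi.
       k == 0 is handled separately because hi << 64 is undefined in C. */
    uint64_t shrd = k ? (lo >> k) | (hi << (64 - k)) : lo;

    /* shr rdx, cl: shift the high half on its own. */
    uint64_t shr = hi >> k;

    /* test cl, 64 / cmove: for counts of 64 or more, the low result
       comes from the shifted high half instead. */
    return (n & 64) ? shr : shrd;
}

/* Reference using the compiler's own 128-bit shift. */
static uint64_t lshr128_ref(uint64_t lo, uint64_t hi, unsigned n) {
    unsigned __int128 v = ((unsigned __int128)hi << 64) | lo;
    return (uint64_t)(v >> n);
}
```

In the reproducer, `hi` is the sign extension of `a` (the `sar ..., 63` result), and only the low 32 bits of the result are stored to `c`.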

hjyamauchi commented 3 years ago

Is this a report of a missed opportunity, or of unexpected behavior (something that is not supposed to happen)?

topperc commented 3 years ago

Why do we want to avoid __lshrti3 for lshr if it's cold?

davidbolvansky commented 3 years ago

For shl, the code without the libcall is actually smaller:

d():                                    # @d()
        push    rax
        movsxd  rdi, dword ptr [rip + a]
        mov     rsi, rdi
        sar     rsi, 63
        movzx   edx, byte ptr [rip + b]
        call    __ashlti3
        mov     dword ptr [rip + c], eax
        pop     rax
        ret

d():                                    # @d()
        movsxd  rax, dword ptr [rip + a]
        mov     cl, byte ptr [rip + b]
        shl     rax, cl
        xor     edx, edx
        test    cl, 64
        cmove   rdx, rax
        mov     dword ptr [rip + c], edx
        ret
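As a rough illustration of why the inline left-shift version is so short, the sketch below mirrors the shl / xor / test / cmove sequence in C. Only the low half of the 128-bit result is needed, because just the low 32 bits are stored to `c`, so the high half never has to be computed. The names `shl128_lo` and `shl128_ref` are hypothetical, and the sketch assumes `__int128` support and a shift count below 128.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: low 64 bits of a 128-bit left shift. The high
   input half cannot influence the low result, so it is not a parameter. */
static uint64_t shl128_lo(uint64_t lo, unsigned n) {
    assert(n < 128);                    /* larger counts are undefined */

    /* shl rax, cl: x86 masks the count to 6 bits. */
    uint64_t shifted = lo << (n & 63);

    /* xor edx, edx / test cl, 64 / cmove rdx, rax: counts of 64 or more
       push every low bit into the high half, leaving zero. */
    return (n & 64) ? 0 : shifted;
}

/* Reference using the compiler's own 128-bit shift. */
static uint64_t shl128_ref(uint64_t lo, unsigned n) {
    return (uint64_t)((unsigned __int128)lo << n);
}
```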