Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Missed store elimination on conditional change #47084

Open Quuxplusone opened 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR48115
Status NEW
Importance P enhancement
Reported by Daan Sprenkels (daan@dsprenkels.com)
Reported on 2020-11-09 01:49:35 -0800
Last modified on 2020-11-09 05:48:08 -0800
Version trunk
Hardware PC All
CC craig.topper@gmail.com, florian_hahn@apple.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, pengfei.wang@intel.com, spatel+llvm@rotateright.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also PR47887, PR41145

The following IR:

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @cmov(i64* nocapture align 8 dereferenceable(8) %a, i64 %b, i1 zeroext %cond) unnamed_addr #0 {
  %_9 = load i64, i64* %a, align 8
  %1 = select i1 %cond, i64 %b, i64 %_9
  store i64 %1, i64* %a, align 8
  ret void
}

is compiled to the following X86:

cmov:                                   # @cmov
        test    edx, edx
        jne     .LBB0_2
        mov     rsi, qword ptr [rdi]
.LBB0_2:
        mov     qword ptr [rdi], rsi    # (*) could be eliminated
        ret

This assembly snippet has a store that could be eliminated.

Godbolt: https://godbolt.org/z/Yz64eh

Quuxplusone commented 3 years ago
(In reply to Daan Sprenkels from comment #0)
> This assembly snippet has a store that could be eliminated.

To be clear: The store operation is only needed when %cond is 1. When %cond is
0, it can be moved to one line earlier, before the .LBB0_2 label.
Quuxplusone commented 3 years ago

Note that the branches are re-intoruced in the backend by https://github.com/llvm/llvm-project/blob/master/llvm/lib/Target/X86/X86CmovConversion.cpp which determines this to be more profitable than the conditional move.

At this point, DSE is not run again and I do not think we have a version for the backend.

For AArch64, csel is generated (https://godbolt.org/z/sbGEKo), which is probably preferred in practice over an additional conditional branch.