llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.13k stars 12.01k forks source link

[AArch64] Missed CCMP opportunity #26458

Open llvmbot opened 8 years ago

llvmbot commented 8 years ago
Bugzilla Link 26084
Version trunk
OS Windows NT
Reporter LLVM Bugzilla Contributor
CC @DMG862,@efriedma-quic,@fhahn,@MatzeB,@sjoerdmeijer

Extended Description

Test case: int test(int A, int B) { return !(A > 0 && B > 0); }

When targeting AArch64 at -O3 we currently generate the following assembly: test: cmp w0, #​1 cset w8, lt cmp w1, #​1 cset w9, lt orr w0, w8, w9 ret

We should be able to generate something like this: test: cmp w0, wzr ccmp w1, wzr, 4, gt cset w0, le ret

which removes 1 cset and 1 orr.

efriedma-quic commented 4 years ago

The patch in question was reverted, and apparently nobody has looked into this since.

I briefly tried rebasing the patch. It basically works, but the "custom" markings have unintended consequences, and as noted in the revert, the DAGCombine changes are suspect. (One of those consequences is suppressing add->or conversion after legalization; filed bug 46151 and bug 46152.)

I don't think marking integer OR for custom lowering is an appropriate solution; there are too many potential weird consequences in terms of suppressing target-independent DAGCombines. Probably should be a DAGCombine itself. (Not sure if it would run before or after legalization. Before, checking profitability is hard. After, we have to write some new code to dig into target-specific compare nodes.)

llvmbot commented 8 years ago

fixed in r259387.

MatzeB commented 8 years ago

Looking at the ir I see: %notlhs = icmp slt i32 %A, 1 %notrhs = icmp slt i32 %B, 1 %lnot = or i1 %notlhs, %notrhs %lnot.ext = zext i1 %lnot to i32 ret i32 %lnot.ext

and:

t0: ch = EntryToken t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0 t7: i1 = setcc t2, Constant:i32<1>, setlt:ch t4: i32,ch = CopyFromReg t0, Register:i32 %vreg1 t8: i1 = setcc t4, Constant:i32<1>, setlt:ch t9: i1 = or t7, t8 t10: i32 = zero_extend t9 t12: ch,glue = CopyToReg t0, Register:i32 %W0, t10 t13: ch = AArch64ISD::RET_FLAG t12, Register:i32 %W0, t12:1

in SelectionDAG. I didn't check in detail, but from what I remember the patterns I added needed to start on a cset, or select zero_extend/or is probably not matched.

llvmbot commented 8 years ago

Matthias, if you happen to know why this isn't covered by r242436 I'd be interested to know.