[PowerPC] icmp ne 0 can be implemented using the carry bit #31492

Open Quuxplusone opened 7 years ago

Quuxplusone commented 7 years ago
Bugzilla Link PR32520
Status NEW
Importance P enhancement
Reported by Sanjay Patel (spatel+llvm@rotateright.com)
Reported on 2017-04-04 06:47:51 -0700
Last modified on 2017-05-23 19:51:12 -0700
Version trunk
Hardware PC All
CC hfinkel@anl.gov, kit.barton@gmail.com, llvm-bugs@lists.llvm.org, nemanja.i.ibm@gmail.com
I noticed this opportunity in https://reviews.llvm.org/D31483.

This is from "Optimal Code Sequences" (p. 200) of the PowerPC Compiler Writer's Guide:

ne: not equal to r = v0 != v1;
subf R5,R3,R4
addic R6,R5,-1
subfe R7,R6,R5
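
For readers who want to see why the carry bit gives the answer, here is a minimal C sketch that models the sequence above. It assumes 32-bit registers; the names `carry_result`, `addic`, `subfe`, `ne0_carry` and `ne_carry` are hypothetical helpers written for this illustration, not anything from LLVM or the guide. `addic` with -1 produces a carry out exactly when its input is nonzero, and `subfe` then computes ~(x-1) + x + CA, which collapses to CA.

```c
#include <stdint.h>

/* Value plus the carry bit (XER[CA]) produced by a carrying instruction. */
typedef struct {
    uint32_t value;
    uint32_t carry; /* 0 or 1 */
} carry_result;

/* Model of "addic RT,RA,SI": RT = RA + SI, CA = carry out of the 32-bit add.
   Assumption: 32-bit registers, as in the guide's example. */
static carry_result addic(uint32_t ra, int32_t si)
{
    uint64_t wide = (uint64_t)ra + (uint64_t)(uint32_t)si;
    carry_result r = { (uint32_t)wide, (uint32_t)(wide >> 32) };
    return r;
}

/* Model of "subfe RT,RA,RB": RT = ~RA + RB + CA.
   The real instruction also updates CA; that is ignored here. */
static uint32_t subfe(uint32_t ra, uint32_t rb, uint32_t ca)
{
    return ~ra + rb + ca;
}

/* x != 0 via the carry bit: addic t,x,-1 ; subfe r,t,x */
uint32_t ne0_carry(uint32_t x)
{
    carry_result t = addic(x, -1);      /* CA = 1 iff x != 0 */
    return subfe(t.value, x, t.carry);  /* ~(x-1) + x + CA == CA */
}

/* v0 != v1: "subf R5,R3,R4" computes v1 - v0, then the same two steps. */
uint32_t ne_carry(uint32_t v0, uint32_t v1)
{
    return ne0_carry(v1 - v0);
}
```

For example, `ne0_carry(0)` yields 0 because 0 + 0xFFFFFFFF does not carry, while any nonzero input carries and yields 1.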

define zeroext i1 @ne0(i32 %x) {
  %cmp = icmp ne i32 %x, 0
  ret i1 %cmp
}

Currently (r299396), the PPC backend seems to choose a cntlzw variant for all
compares against zero:

$ ./llc ne0.ll -o - -mtriple=powerpc64
    cntlzw 3, 3
    nor 3, 3, 3  <-- extended mnemonic for "not" would be nicer!
    rlwinm 3, 3, 27, 31, 31
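
As a rough illustration of what this three-instruction sequence computes, here is a C sketch under the assumption of 32-bit word semantics. The helper names `cntlzw` and `ne0_cntlzw` are hypothetical, chosen to mirror the mnemonics. The key fact is that the leading-zero count is 32 (bit 5 set) only when the input is zero, so inverting the count and extracting bit 5 gives x != 0.

```c
#include <stdint.h>

/* Model of cntlzw: number of leading zero bits in a 32-bit word.
   Returns 32 when x == 0, otherwise a value in 0..31. */
static uint32_t cntlzw(uint32_t x)
{
    uint32_t n = 0;
    while (n < 32 && (x & (UINT32_C(1) << (31 - n))) == 0)
        n++;
    return n;
}

/* x != 0 via the count-leading-zeros sequence shown above. */
uint32_t ne0_cntlzw(uint32_t x)
{
    uint32_t count = cntlzw(x);    /* cntlzw 3, 3: 32 iff x == 0        */
    uint32_t inverted = ~count;    /* nor 3, 3, 3: bit 5 clear iff x == 0 */
    /* rlwinm 3, 3, 27, 31, 31: rotate left by 27 (same as right by 5)
       and keep only the least significant bit, i.e. bit 5 of the input. */
    return (inverted >> 5) & 1;
}
```

The plain right shift stands in for the rotate because only the lowest bit of the rotated value survives the mask.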

I don't know anything about recent PPC uarch, so this may not be faster, but it
is smaller (two instructions instead of three):

addic 4, 3, -1
subfe 3, 4, 3

Quuxplusone commented 7 years ago
This is in the pipeline:
https://reviews.llvm.org/D31240

Quuxplusone commented 7 years ago
(In reply to Nemanja Ivanovic from comment #1)
> This is in the pipeline:
> https://reviews.llvm.org/D31240

Nice! And wow...that's a big patch. :)