[PowerPC] icmp ne 0 can be implemented using the carry bit

Quuxplusone commented 7 years ago


Bugzilla Link	PR32520
Status	NEW
Importance	P enhancement
Reported by	Sanjay Patel (spatel+llvm@rotateright.com)
Reported on	2017-04-04 06:47:51 -0700
Last modified on	2017-05-23 19:51:12 -0700
Version	trunk
Hardware	PC All
CC	hfinkel@anl.gov, kit.barton@gmail.com, llvm-bugs@lists.llvm.org, nemanja.i.ibm@gmail.com

I noticed this opportunity in https://reviews.llvm.org/D31483.

This is from "Optimal Code Sequences", p. 200 of the Compiler Writer's Guide:

ne: not equal to r = v0 != v1;
subf R5,R3,R4
addic R6,R5,-1
subfe R7,R6,R5

define zeroext i1 @ne0(i32 %x) {
  %cmp = icmp ne i32 %x, 0
  ret i1 %cmp
}

Currently (r299396), the PPC backend seems to choose a cntlzw variant for all
compares against zero:

$ ./llc ne0.ll -o - -mtriple=powerpc64
    cntlzw 3, 3
    nor 3, 3, 3  <-- extended mnemonic for "not" would be nicer!
    rlwinm 3, 3, 27, 31, 31
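
For readers who want to see why the three instructions above yield `x != 0`, here is a rough Python model of them. It is only a sketch: it assumes 32-bit register values, the helper names (`cntlzw`, `nor`, `rlwinm`, `ne0_cntlzw`) are made up to mirror the mnemonics, and the `rlwinm` helper only handles the non-wrapping mask case (mb <= me) that this sequence needs.

```python
MASK32 = 0xFFFFFFFF

def cntlzw(x):
    """Count leading zeros of a 32-bit word; returns 32 when x == 0."""
    return 32 - (x & MASK32).bit_length()

def nor(a, b):
    """Bitwise NOR, truncated to 32 bits. nor r,r,r acts as 'not'."""
    return ~(a | b) & MASK32

def rlwinm(rs, sh, mb, me):
    """Rotate left by sh, then AND with a mask covering bits mb..me.

    Bits use big-endian numbering (bit 0 is the most significant bit).
    Only the non-wrapping case mb <= me is modelled here.
    """
    assert 0 <= sh < 32 and 0 <= mb <= me <= 31
    rs &= MASK32
    rotated = ((rs << sh) | (rs >> (32 - sh))) & MASK32
    mask = ((1 << (me - mb + 1)) - 1) << (31 - me)
    return rotated & mask

def ne0_cntlzw(x):
    """Model of: cntlzw 3,3 ; nor 3,3,3 ; rlwinm 3,3,27,31,31."""
    r3 = cntlzw(x)              # 32 only if x == 0, otherwise 0..31
    r3 = nor(r3, r3)            # flips the "value 32" bit: it is clear only if x == 0
    return rlwinm(r3, 27, 31, 31)  # rotate left 27 == shift right 5, keep the low bit

# Spot check against the IR semantics of "icmp ne i32 %x, 0".
for x in (0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
    assert ne0_cntlzw(x) == int(x != 0)
```

The key observation is that cntlzw produces 32 only for a zero input, so the bit with weight 32 of the count distinguishes zero from non-zero; the nor inverts it and the rlwinm moves it into the low bit.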

I don't know anything about recent PPC uarch, so the carry-based sequence may
not be faster, but it is smaller (two instructions instead of three):

addic 4, 3, -1
subfe 3, 4, 3
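
The reason this works: addic computes x + 0xFFFFFFFF, which carries out exactly when x is non-zero, and subfe then computes ~r4 + r3 + CA = r3 - (r3 - 1) - 1 + CA = CA. A small Python model can sanity-check that. This is a sketch under the assumption of 32-bit registers (in 64-bit mode the carry comes out of the full 64-bit add); the helper names `addic`, `subfe`, and `ne0_carry` are invented here to mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF

def addic(ra, imm):
    """addic RT,RA,SI: returns (RA + SI mod 2^32, carry out)."""
    total = (ra & MASK32) + (imm & MASK32)
    return total & MASK32, total >> 32

def subfe(ra, rb, ca):
    """subfe RT,RA,RB: returns ~RA + RB + CA (mod 2^32)."""
    total = (~ra & MASK32) + (rb & MASK32) + ca
    return total & MASK32

def ne0_carry(x):
    """Model of: addic 4,3,-1 ; subfe 3,4,3."""
    r4, ca = addic(x, -1)    # CA = 1 exactly when x != 0
    return subfe(r4, x, ca)  # ~(x - 1) + x + CA reduces to CA

# Spot check against the IR semantics of "icmp ne i32 %x, 0".
for x in (0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
    assert ne0_carry(x) == int(x != 0)
```

Note that the sequence writes the carry bit (CA), which the cntlzw variant does not touch.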

Quuxplusone commented 7 years ago

This is in the pipeline:
https://reviews.llvm.org/D31240

Quuxplusone commented 7 years ago

(In reply to Nemanja Ivanovic from comment #1)
> This is in the pipeline:
> https://reviews.llvm.org/D31240

Nice! And wow...that's a big patch. :)

Quuxplusone / LLVMBugzillaTest

[PowerPC] icmp ne 0 can be implemented using the carry bit #31492