llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.09k stars 12k forks source link

Suboptimal code for branch on ctpop(x) == 1 #47250

Open topperc opened 4 years ago

topperc commented 4 years ago
Bugzilla Link 47906
Version trunk
OS All
CC @topperc,@RKSimon,@phoebewang,@rotateright

Extended Description

This code produces suboptimal code with clang

void bar();

void foo(int x) {
  if (x && (x & (x - 1)) == 0)
    bar();
}

clang:

foo(int):                                # @foo(int)
        leal    -1(%rdi), %eax
        testl   %eax, %edi
        setne   %al
        testl   %edi, %edi
        sete    %cl
        orb     %al, %cl
        jne     .LBB0_1
        jmp     bar()                         # TAILCALL
.LBB0_1:
        retq

gcc:

foo(int):
        test    edi, edi
        je      .L1
        lea     eax, [rdi-1]
        test    eax, edi
        je      .L7
.L1:
        ret
.L7:
        jmp     bar()

The condition is canonicalized to ctpop(x) == 1 in InstCombine. Then expanded back into two conditionals during SelectionDAG. SelectionDAG doesn't have any optimizations for changing a branch on 2 setccs ored together into separate branches. I'm not sure if we could add that or if we should expand the ctpop in CodeGenPrepare.

RKSimon commented 2 years ago

Current Codegen: https://godbolt.org/z/9Gvdh5v1s

rotateright commented 2 years ago

We need to decide what is optimal. If we prefer the gcc form when the condition is used to branch, do we also want to do that without a branch? Because gcc doesn't do that currently: https://godbolt.org/z/xarhz6f4a