llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.77k stars 11.89k forks source link

bit-scan-reverse / count-leading-zeros loop not recognized #17503

Open llvmbot opened 11 years ago

llvmbot commented 11 years ago
Bugzilla Link 17130
Version trunk
OS All
Depends On llvm/llvm-bugzilla-archive#17128
Blocks llvm/llvm-project#17461
Reporter LLVM Bugzilla Contributor
CC @hfinkel,@preames,@rotateright,@yuanfang-chen

Extended Description

This is very similar to bug 17128 (count trailing zeros), but I'm filing a separate bug just in case the solution involves some differences in implementation in the compiler.

On the surface, this one maps to existing hardware instructions (Power = ctlzw/d, ARM = clz) more easily than count-trailing-zeros. x86 has bsr/lzcnt as well as bsf/tzcnt.

Here's a test file with a couple of implementations of count-leading-zeros. Currently, llvm recognizes that both functions have bit-testing loops ('bt' instructions is generated), but doesn't know that they can be simplified to count-leading-zeros:

$ ./clang -v
clang version 3.4 (trunk 189776)
Target: x86_64-apple-darwin11.4.2
Thread model: posix

$ cat lzcnt.c 
int lzcnt(int x) {
    int count = 0;
    int i = 31;
    while (i>=0 && (((x >> i) & 0x1) == 0)) {
        i--;
        count++;
    }
    return count;
}

int lzcnt2(int x) {
    int count = 0;
    int i;
    for (i=0; i<32; i++) {
        if ((x & (1 << (31-i))) != 0) break;
        count++;
    }
    return count;
}

$ ./clang -S -O3 -fomit-frame-pointer -march=core-avx2 -o /dev/stdout lzcnt.c 
    .section    __TEXT,__text,regular,pure_instructions
    .globl  _lzcnt
    .align  4, 0x90
_lzcnt:                                 ## @lzcnt
    .cfi_startproc
## BB#0:                                ## %entry
    xorl    %eax, %eax
    movl    $32, %ecx
    .align  4, 0x90
LBB0_1:                                 ## %land.rhs
                                        ## =>This Inner Loop Header: Depth=1
    decl    %ecx
    btl %ecx, %edi
    jb  LBB0_3
## BB#2:                                ## %while.body
                                        ##   in Loop: Header=BB0_1 Depth=1
    incl    %eax
    testl   %ecx, %ecx
    jg  LBB0_1
LBB0_3:                                 ## %while.end
    ret
    .cfi_endproc

    .globl  _lzcnt2
    .align  4, 0x90
_lzcnt2:                                ## @lzcnt2
    .cfi_startproc
## BB#0:                                ## %entry
    xorl    %eax, %eax
    movl    $31, %ecx
    .align  4, 0x90
LBB1_1:                                 ## %for.body
                                        ## =>This Inner Loop Header: Depth=1
    btl %ecx, %edi
    jb  LBB1_3
## BB#2:                                ## %if.end
                                        ##   in Loop: Header=BB1_1 Depth=1
    decl    %ecx
    incl    %eax
    cmpl    $32, %eax
    jl  LBB1_1
LBB1_3:                                 ## %for.end
    ret
    .cfi_endproc
llvmbot commented 1 year ago

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

1) Assign the issue to you. 2) Fix the issue locally. 3) Run the test suite locally. 3.1) Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests. 4) Create a git commit 5) Run git clang-format HEAD~1 to format your changes. 6) Submit the patch to Phabricator. 6.1) Detailed instructions can be found here

For more instructions on how to submit a patch to LLVM, see our documentation.

If you have any further questions about this issue, don't hesitate to ask via a comment on this Github issue.

@llvm/issue-subscribers-good-first-issue

xgupta commented 5 months ago

The codegen is changed on the trunk - https://godbolt.org/z/4Tn7xsdb5, it now uses series of conditional jump and comparison.