llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Suboptimal codegen with __builtin_ctz (regressed in 12) #50931

Open davidbolvansky opened 3 years ago

davidbolvansky commented 3 years ago
Bugzilla Link 51589
Version trunk
OS Linux
CC @efriedma-quic,@LebedevRI,@RKSimon,@nikic,@rotateright

Extended Description

unsigned long f(unsigned long long value)
{
    unsigned int result;

    if ((value & 0xFFFFFFFF) == 0)
    {
        result = __builtin_ctz(value >> 32) + 32;
    }
    else
    {
        if ((unsigned int)value != 0)
            result = __builtin_ctz((unsigned int)value);
    }

    return result;
}

LLVM 11 -O2:

f(unsigned long long):                                  # @f(unsigned long long)
        mov     rax, rdi
        shr     rax, 32
        bsf     ecx, eax
        or      ecx, 32
        bsf     eax, edi
        cmove   eax, ecx
        ret

LLVM 12+ -O2:

f(unsigned long long):                                  # @f(unsigned long long)
        test    edi, edi
        je      .LBB0_1
        test    edi, edi
        je      .LBB0_3
        bsf     eax, edi
        mov     eax, eax
        ret
.LBB0_1:
        shr     rdi, 32
        bsf     eax, edi
        or      eax, 32
        mov     eax, eax
        ret
.LBB0_3:
        mov     eax, eax
        ret

https://godbolt.org/z/M9q4d13nq

efriedma-quic commented 3 years ago

Somehow we aren't canonicalizing the compare operations, which produces the duplicate "test edi, edi".

Besides that, we're just not flattening the CFG as aggressively... which seems reasonable?
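
For readers following along, the duplicate "test edi, edi" corresponds to the two source-level compares in the reproducer, `(value & 0xFFFFFFFF) == 0` and `(unsigned int)value != 0`, which test the same 32 bits. The sketch below is not taken from the report or from LLVM; the function names are hypothetical, and it assumes a GCC/Clang-style compiler that provides `__builtin_ctz` and `__builtin_ctzll`. It spot-checks that the two compares agree and that a variant of `f` with the redundant inner check dropped matches a 64-bit count-trailing-zeros on nonzero inputs.

```c
#include <assert.h>
#include <stdint.h>

/* First compare from the reproducer: mask off the low 32 bits and test for zero. */
static int low_half_is_zero_masked(uint64_t value) {
    return (value & 0xFFFFFFFFu) == 0;
}

/* Second compare from the reproducer, negated: truncate to 32 bits and test for zero. */
static int low_half_is_zero_truncated(uint64_t value) {
    return (uint32_t)value == 0;
}

/* Hypothetical variant of f() with the redundant inner check removed.
   Like the original, the result is undefined when the selected half is zero,
   because __builtin_ctz(0) is undefined. */
static unsigned f_single_compare(uint64_t value) {
    if ((uint32_t)value == 0)
        return (unsigned)__builtin_ctz((uint32_t)(value >> 32)) + 32;
    return (unsigned)__builtin_ctz((uint32_t)value);
}

/* Spot-check a few nonzero inputs; this is a sanity check, not a proof. */
static void check_samples(void) {
    const uint64_t samples[] = {
        1ull, 8ull, 0x100000000ull, 0x123456789ull,
        0xFFFFFFFF00000000ull, 0x8000000000000000ull
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        uint64_t v = samples[i];
        /* Both forms of the compare give the same answer. */
        assert(low_half_is_zero_masked(v) == low_half_is_zero_truncated(v));
        /* The single-compare variant agrees with a 64-bit ctz. */
        assert(f_single_compare(v) == (unsigned)__builtin_ctzll(v));
    }
}
```

Calling `check_samples()` from a test harness should pass silently. Whether the optimizer folds the two compares into one is exactly the canonicalization question raised above; this sketch only illustrates why they are equivalent at the source level.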