llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.93k stars 11.53k forks source link

Reuse flags from shift instruction -> avoid test instruction #42942

Open davidbolvansky opened 4 years ago

davidbolvansky commented 4 years ago
Bugzilla Link 43597
Version trunk
OS Linux
CC @topperc,@RKSimon,@MatzeB,@rotateright

Extended Description

int foo (int a)
{
  int b = 0;
  while (a >>= 1)
    b++;

  return b;
}
foo:                                  
        mov     eax, -1
.LBB0_1:                               
        sar     edi
        add     eax, 1
        test    edi, edi
        jne     .LBB0_1
        ret

https://c9x.me/x86/html/file_module_x86_id_285.html

"The SF, ZF, and PF flags are set according to the result. If the count is 0, the flags are not affected. For a non-zero count, the AF flag is undefined."

So 'test' could be eliminated here.

foo:                                  
        mov     eax, -1
.LBB0_1:                               
        add     eax, 1
        sar     edi
        jne     .LBB0_1
        ret
topperc commented 4 years ago

We currently only try to use flags from shifts if the code layout happens to put the shift immediately before the branch. This is handled by X86InstrInfo::optimizeCompareInstr.

We do match flags from ADD/SUB/AND/OR/XOR during isel and never isel the TEST. But if there is another constraint that prevents the the AND/SUB/AND/OR/XOR from being the last flag op before the branch, we can end up with two AND/SUB/AND/OR/XOR instructions. One for data and one for flags with copies of GPRs to make that work. We could maybe do the same for shifts at the risk of potentially creating 2 shifts sometimes. Shifts are port restricted on Intel CPUs and have historically had bad partial flag stall behavior so this could be worse in some cases.

For this specific example we should probably just remove the loop entirely and use bsr like icc does.