llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.4k stars 12.15k forks source link

Optimization: Failure to recognize right shift will result in either 0 or 1. #16181

Open silvasean opened 11 years ago

silvasean commented 11 years ago
Bugzilla Link 15809
Version trunk
OS Linux
Attachments test_bitmap.c
CC @rotateright

Extended Description

(all of this is with clang -O2 at r179948)

The code for test_bitmap_ternary is 5 bytes smaller than test_bitmap_shift, 2 instructions shorter, and avoids 2 non-immediate shifts (possibly just 1 on architectures that don't have something like x86 BT). I think the issue is a failure to recognize that there is only one set bit, so shifting back by the same amount gives exactly either 0 or 1.

int test_bitmap_ternary(int shift, int bitmap) { return ((1 << shift) & bitmap) ? 1 : 0; }

int test_bitmap_shift(int shift, int bitmap) { return ((1 << shift) & bitmap) >> shift; }

The problem is exacerbated with some more decoration (adding !! doesn't affect the codegen for test_bitmap_ternary):

int test_bitmap_shift_notnot(int shift, int bitmap) { return !!(((1 << shift) & bitmap) >> shift); }

define i32 @​test_bitmap_ternary(i32 %shift, i32 %bitmap) #​0 { %1 = shl i32 1, %shift %2 = and i32 %1, %bitmap %3 = icmp ne i32 %2, 0 %4 = zext i1 %3 to i32 ret i32 %4 }

define i32 @​test_bitmap_shift(i32 %shift, i32 %bitmap) #​0 { %1 = shl i32 1, %shift %2 = and i32 %1, %bitmap %3 = ashr i32 %2, %shift ret i32 %3 }

define i32 @​test_bitmap_shift_notnot(i32 %shift, i32 %bitmap) #​0 { %1 = shl i32 1, %shift %2 = and i32 %1, %bitmap %3 = ashr i32 %2, %shift %4 = icmp ne i32 %3, 0 %5 = zext i1 %4 to i32 ret i32 %5 }

0000000000000050 : 50: 0f a3 fe bt %edi,%esi 53: 19 c0 sbb %eax,%eax 55: 83 e0 01 and $0x1,%eax 58: c3 retq

0000000000000060 : 60: 89 f9 mov %edi,%ecx 62: b8 01 00 00 00 mov $0x1,%eax 67: d3 e0 shl %cl,%eax 69: 21 f0 and %esi,%eax 6b: d3 f8 sar %cl,%eax 6d: c3 retq

0000000000000060 : 60: 89 f9 mov %edi,%ecx 62: b8 01 00 00 00 mov $0x1,%eax 67: d3 e0 shl %cl,%eax 69: 21 f0 and %esi,%eax 6b: d3 f8 sar %cl,%eax 6d: 85 c0 test %eax,%eax 6f: 0f 95 c0 setne %al 72: 0f b6 c0 movzbl %al,%eax 75: c3 retq

rotateright commented 3 years ago

define i32 @​test_bitmap_shift(i32 %shift, i32 %bitmap) #​0 { %1 = shl i32 1, %shift %2 = and i32 %1, %bitmap %3 = ashr i32 %2, %shift ret i32 %3 }

Note that this example is not logically equivalent to the 1st (test_bitmap_ternary) because it has an "ashr" - if we're masking off the sign-bit, the result could be -1 instead of 1.

But if that was changed to use "lshr", we still don't have this optimization: https://alive2.llvm.org/ce/z/9pJFiH