Improve bitfield arithmetic

RKSimon commented 7 years ago


Bugzilla Link	34526
Version	trunk
OS	Windows NT
CC	@alexey-bataev,@legrosbuffle,@rotateright

Extended Description

We could improve math ops on irregular bitfields with the relevant SWAR style patterns:

struct Counters {
  unsigned a:7;
  unsigned b:16;
  unsigned c:9;
};

Counters IncCounters(Counters x) {
  x.a++;
  x.b++;
  x.c++;
  return x;
}

Counters AddCounters(Counters x, Counters y) {
  Counters c;
  c.a = x.a + y.a;
  c.b = x.b + y.b;
  c.c = x.c + y.c;
  return c;
}

define i32 @IncCounters(i32) {
  %2 = add i32 %0, 1
  %3 = and i32 %2, 127
  %4 = add i32 %0, 128
  %5 = and i32 %4, 8388480
  %6 = or i32 %5, %3
  %7 = add i32 %0, 8388608
  %8 = and i32 %7, -8388608
  %9 = or i32 %6, %8
  ret i32 %9
}

define i32 @AddCounters(i32, i32) {
  %3 = add i32 %1, %0
  %4 = and i32 %3, 127
  %5 = and i32 %0, 8388480
  %6 = add i32 %5, %1
  %7 = and i32 %6, 8388480
  %8 = or i32 %4, %7
  %9 = and i32 %0, -8388608
  %10 = add i32 %9, %1
  %11 = and i32 %10, -8388608
  %12 = or i32 %8, %11
  ret i32 %12
}

llc -mcpu=btver2

IncCounters(Counters): # @IncCounters(Counters)
  movl %edi, %ecx
  leal 1(%rdi), %eax
  addl $8388608, %edi # imm = 0x800000
  subl $-128, %ecx
  andl $127, %eax
  andl $-8388608, %edi # imm = 0xFF800000
  andl $8388480, %ecx # imm = 0x7FFF80
  orl %ecx, %eax
  orl %edi, %eax
  retq

AddCounters(Counters, Counters): # @AddCounters(Counters, Counters)
  movl %edi, %ecx
  leal (%rsi,%rdi), %eax
  andl $-8388608, %edi # imm = 0xFF800000
  andl $8388480, %ecx # imm = 0x7FFF80
  addl %esi, %edi
  andl $127, %eax
  addl %esi, %ecx
  andl $-8388608, %edi # imm = 0xFF800000
  andl $8388480, %ecx # imm = 0x7FFF80
  orl %ecx, %eax
  orl %edi, %eax
  retq

RKSimon commented 1 year ago

Current Codegen: https://godbolt.org/z/jKrcnxEvd

llvmbot commented 1 year ago

@llvm/issue-subscribers-good-first-issue

RKSimon commented 1 year ago

I've created an updated version to help compare SWAR patterns against current codegen: https://godbolt.org/z/x91fbTd7h

For the IncCounters case (using i8 types to help alive2):

----------------------------------------
define i8 @src(i8 %x.coerce) {
%entry:
  %inc = add i8 %x.coerce, 1
  %bf.value = and i8 %inc, 7
  %bf.value7 = add i8 %x.coerce, 8
  %bf.shl = and i8 %bf.value7, 24
  %bf.set9 = or i8 %bf.value, %bf.shl
  %0 = and i8 %x.coerce, 224
  %bf.shl15 = add i8 %0, 32
  %bf.set17 = or i8 %bf.set9, %bf.shl15
  ret i8 %bf.set17
}
=>
define i8 @tgt(i8 %x.coerce) {
%entry:
  %and = and i8 %x.coerce, 107
  %add = add i8 %and, 41
  %and6 = and i8 %x.coerce, 148
  %xor7 = xor i8 %add, %and6
  ret i8 %xor7
}
Transformation seems to be correct!

So we might be able to get some of this with some suitable InstCombine patterns

ParkHanbum commented 11 months ago

hello!! I think this issue very challenge to me. but I've study about LLVM Middlen-end and I want to study front and back at next step. so, if you tell me some other patch to help to solving this issue, I'll try and make it possible.

would you mind?

RKSimon commented 11 months ago

Sure, I'd recommend starting with a simpler bitfield:

struct Counters {
  unsigned a:7;
  unsigned b:25;
};
static_assert(sizeof(Counters) == sizeof(unsigned));

Counters IncCounters(Counters x) {
  x.a++;
  x.b++;
  return x;
}

define i32 @IncCounters(i32 %x.coerce) {
entry:
  %inc = add i32 %x.coerce, 1
  %bf.value = and i32 %inc, 127
  %0 = and i32 %x.coerce, -128
  %bf.shl = add i32 %0, 128
  %bf.set8 = or disjoint i32 %bf.value, %bf.shl
  ret i32 %bf.set8
}

You should be able to start with a simpler InstCombine fold to:

define i32 @IncCounters_Opt(i32 %x.coerce) {
entry:
  %and = and i32 %x.coerce, 2147483583
  %add = add nuw i32 %and, 129
  %and6 = and i32 %x.coerce, -2147483584
  %xor7 = xor i32 %add, %and6
  ret i32 %xor7
}

ParkHanbum commented 11 months ago

@RKSimon thanks a lot! I'll figure it out

ParkHanbum commented 11 months ago

@RKSimon I think I done with your first exam.

my code uploaded here: https://github.com/ParkHanbum/mystudy/commit/3b737a90ce334e0cec3572d4df9224ac20f9115d

its result

define i32 @src(i32 %x.coerce) {
entry:
  ...
  %and = and i32 %x.coerce, 2147483583
  %add = add nuw i32 %and, 129
  %and2 = and i32 %x.coerce, -2147483584
  %1 = xor i32 %add, %and2
  ret i32 %1
 ...
}

I'll move on next step. if you have any advice for me that's very graceful.

RKSimon commented 11 months ago

Thanks @ParkHanbum - as well as converting your test code to a PR for InstCombine, please can you create suitable alive2 links that cover the folds you are adding?

ParkHanbum commented 10 months ago

@RKSimon sure. I can send the PR around tomorrow. I'm just worried about whether I did it right.

ParkHanbum commented 10 months ago

I think I'm done. before I send PR, I found a alive2 alert of your optimizing algorithm when struct has 2 fields. but I'm solved it.

you can check it here: https://alive2.llvm.org/ce/z/dDKFdV

llvm / llvm-project

Improve bitfield arithmetic #33874

Extended Description