Open RKSimon opened 7 years ago
Current Codegen: https://godbolt.org/z/jKrcnxEvd
@llvm/issue-subscribers-good-first-issue
I've created an updated version to help compare SWAR patterns against current codegen: https://godbolt.org/z/x91fbTd7h
For the IncCounters case (using i8 types to help alive2):
----------------------------------------
define i8 @src(i8 %x.coerce) {
%entry:
%inc = add i8 %x.coerce, 1
%bf.value = and i8 %inc, 7
%bf.value7 = add i8 %x.coerce, 8
%bf.shl = and i8 %bf.value7, 24
%bf.set9 = or i8 %bf.value, %bf.shl
%0 = and i8 %x.coerce, 224
%bf.shl15 = add i8 %0, 32
%bf.set17 = or i8 %bf.set9, %bf.shl15
ret i8 %bf.set17
}
=>
define i8 @tgt(i8 %x.coerce) {
%entry:
%and = and i8 %x.coerce, 107
%add = add i8 %and, 41
%and6 = and i8 %x.coerce, 148
%xor7 = xor i8 %add, %and6
ret i8 %xor7
}
Transformation seems to be correct!
So we might be able to get some of this with some suitable InstCombine patterns
hello!! I think this issue very challenge to me. but I've study about LLVM Middlen-end and I want to study front and back at next step. so, if you tell me some other patch to help to solving this issue, I'll try and make it possible.
would you mind?
Sure, I'd recommend starting with a simpler bitfield:
struct Counters {
unsigned a:7;
unsigned b:25;
};
static_assert(sizeof(Counters) == sizeof(unsigned));
Counters IncCounters(Counters x) {
x.a++;
x.b++;
return x;
}
define i32 @IncCounters(i32 %x.coerce) {
entry:
%inc = add i32 %x.coerce, 1
%bf.value = and i32 %inc, 127
%0 = and i32 %x.coerce, -128
%bf.shl = add i32 %0, 128
%bf.set8 = or disjoint i32 %bf.value, %bf.shl
ret i32 %bf.set8
}
You should be able to start with a simpler InstCombine fold to:
define i32 @IncCounters_Opt(i32 %x.coerce) {
entry:
%and = and i32 %x.coerce, 2147483583
%add = add nuw i32 %and, 129
%and6 = and i32 %x.coerce, -2147483584
%xor7 = xor i32 %add, %and6
ret i32 %xor7
}
@RKSimon thanks a lot! I'll figure it out
@RKSimon I think I done with your first exam.
my code uploaded here: https://github.com/ParkHanbum/mystudy/commit/3b737a90ce334e0cec3572d4df9224ac20f9115d
its result
define i32 @src(i32 %x.coerce) {
entry:
...
%and = and i32 %x.coerce, 2147483583
%add = add nuw i32 %and, 129
%and2 = and i32 %x.coerce, -2147483584
%1 = xor i32 %add, %and2
ret i32 %1
...
}
I'll move on next step. if you have any advice for me that's very graceful.
Thanks @ParkHanbum - as well as converting your test code to a PR for InstCombine, please can you create suitable alive2 links that cover the folds you are adding?
@RKSimon sure. I can send the PR around tomorrow. I'm just worried about whether I did it right.
I think I'm done. before I send PR, I found a alive2 alert of your optimizing algorithm when struct has 2 fields. but I'm solved it.
you can check it here: https://alive2.llvm.org/ce/z/dDKFdV
Extended Description
We could improve math ops on irregular bitfields with the relevant SWAR style patterns:
llc -mcpu=btver2