Open Quuxplusone opened 9 years ago
Attached 0001-Add-enhanced-version-of-reundant-load-optimization.patch
(8025 bytes, text/plain): Patch which implements the transformation in InstCombine.
The optimization eliminates the redundant load created for the following two
statements:
agg[i+1]->field // 1st if statement
and
++i
agg[i]->field // 2nd if statement
The basic idea is to create a class similar to PHITransAddr but for selects rather than PHI nodes. This way we hopefully can teach GVN to "see through" select instructions when eliminating redundant loads in processLoads().
(In reply to comment #3)
> The basic idea is to create a class similar to PHITransAddr but for selects
> rather than PHI nodes. This way we hopefully can teach GVN to "see through"
> select instructions when eliminating redundant loads in processLoads().
Hi David,
IIRC, Tilmann was telling me that you make the original suggestion to implement
a class similar to PHITransAddr for select instructions. Would you mind
providing a few more details? I was hoping to revisit this issue.
Test case:
struct s {
unsigned field;
};
struct s **agg;
void bar(unsigned e, unsigned j, unsigned max) {
for (unsigned i = 0; i < e; ++i) {
if (agg[i]->field > agg[i+1]->field)
++i;
if (agg[i]->field > max)
break;
}
}
Assembly (AArch64):
bar:
cbz w0, .LBB0_5
adrp x8, agg
ldr x8, [x8, :lo12:agg]
mov w9, wzr
.LBB0_2:
add w10, w9, #1 // w9 = i; w10 = i+1;
ldr x11, [x8, w9, uxtw #3] // agg[i]
ldr x12, [x8, w10, uxtw #3] // agg[i+1]
ldr w11, [x11] // agg[i]->field
ldr w12, [x12] // agg[i+1]->field
cmp w11, w12 //
b.hi .LBB0_4
mov w10, w9
.LBB0_4:
ldr x9, [x8, w10, uxtw #3] // ** redundant load - agg[?]
ldr w11, [x9] // ** redundant load - agg[?]->field
add w9, w10, #1
cmp w11, w2
ccmp w9, w0, #2, ls
b.lo .LBB0_2
ret
The first two loads in .LBB0_4 are redundant.
Nick's suggestion is to teach local memdep to return an Other(SelectInst*)
result. In turn, GVN can recurse on the operands of the select.
There's an API problem here, however. We only expect single results to local
queries (i.e., we can't return multiple results due to selects). Multiple
results are only expected in non-local queries, which is the only case we go
through PHIAddrTrans translation. Therefore, extending PHITransAddr to handle
selects isn't sufficient.
0001-Add-enhanced-version-of-reundant-load-optimization.patch
(8025 bytes, text/plain)