Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Attempt to eliminate redundant loads whose addresses are dependent on the result of a select instruction. #21051

Open Quuxplusone opened 9 years ago

Quuxplusone commented 9 years ago
Bugzilla Link PR21052
Status CONFIRMED
Importance P normal
Reported by Tilmann Scheller (t.scheller@samsung.com)
Reported on 2014-09-24 05:09:33 -0700
Last modified on 2015-08-20 07:01:13 -0700
Version trunk
Hardware PC All
CC david.majnemer@gmail.com, llvm-bugs@lists.llvm.org, nlewycky@google.com, t.scheller@samsung.com
Fixed by commit(s)
Attachments 0001-Add-enhanced-version-of-reundant-load-optimization.patch (8025 bytes, text/plain)
Blocks
Blocked by
See also
while (i < e) {
    if (agg[i+1]->field < agg[i]->field)
      ++i;
    if (agg[i]->field > agg[j]->field)
      break;
    ...
    ...
  }
Quuxplusone commented 9 years ago

Attached 0001-Add-enhanced-version-of-reundant-load-optimization.patch (8025 bytes, text/plain): Patch which implements the transformation in InstCombine.

Quuxplusone commented 9 years ago
The optimization eliminates the redundant load created for the following two
statements:

agg[i+1]->field   // 1st if statement

and

++i
agg[i]->field     // 2nd if statement
Quuxplusone commented 9 years ago

The basic idea is to create a class similar to PHITransAddr but for selects rather than PHI nodes. This way we hopefully can teach GVN to "see through" select instructions when eliminating redundant loads in processLoads().

Quuxplusone commented 9 years ago
(In reply to comment #3)
> The basic idea is to create a class similar to PHITransAddr but for selects
> rather than PHI nodes. This way we hopefully can teach GVN to "see through"
> select instructions when eliminating redundant loads in processLoads().

Hi David,
IIRC, Tilmann was telling me that you make the original suggestion to implement
a class similar to PHITransAddr for select instructions.  Would you mind
providing a few more details?  I was hoping to revisit this issue.
Quuxplusone commented 9 years ago
Test case:

struct s {
  unsigned field;
};
struct s **agg;

void bar(unsigned e, unsigned j, unsigned max) {
  for (unsigned i = 0; i < e; ++i) {
    if (agg[i]->field > agg[i+1]->field)
      ++i;
    if (agg[i]->field > max)
      break;
  }
}

Assembly (AArch64):

bar:
        cbz     w0, .LBB0_5
        adrp    x8, agg
        ldr     x8, [x8, :lo12:agg]
        mov      w9, wzr
.LBB0_2:
        add     w10, w9, #1                // w9 = i; w10 = i+1;
        ldr     x11, [x8, w9, uxtw #3]     // agg[i]
        ldr     x12, [x8, w10, uxtw #3]    // agg[i+1]
        ldr      w11, [x11]                // agg[i]->field
        ldr      w12, [x12]                // agg[i+1]->field
        cmp      w11, w12                  //
        b.hi    .LBB0_4
        mov      w10, w9
.LBB0_4:
        ldr     x9, [x8, w10, uxtw #3]     // ** redundant load - agg[?]
        ldr      w11, [x9]                 // ** redundant load - agg[?]->field
        add     w9, w10, #1
        cmp      w11, w2
        ccmp    w9, w0, #2, ls
        b.lo    .LBB0_2
        ret

The first two loads in .LBB0_4 are redundant.
Quuxplusone commented 9 years ago
Nick's suggestion is to teach local memdep to return an Other(SelectInst*)
result.  In turn, GVN can recurse on the operands of the select.

There's an API problem here, however.  We only expect single results to local
queries (i.e., we can't return multiple results due to selects).  Multiple
results are only expected in non-local queries, which is the only case we go
through PHIAddrTrans translation.  Therefore, extending PHITransAddr to handle
selects isn't sufficient.
Quuxplusone commented 9 years ago

WIP: http://reviews.llvm.org/D8120