Quuxplusone / LLVMBugzillaTest

Non-affine loops not code generated correctly #32237

Open Quuxplusone opened 7 years ago

Quuxplusone commented 7 years ago
Bugzilla Link PR33265
Status NEW
Importance P enhancement
Reported by Tobias Grosser (tobias@grosser.es)
Reported on 2017-06-01 03:29:05 -0700
Last modified on 2021-09-12 14:38:25 -0700
Version unspecified
Hardware PC Linux
CC EmmaJaneBonestell@gmail.com, llvm-bugs@lists.llvm.org
Fixed by commit(s)
Attachments non-affine-loop-not-working.ll (1681 bytes, text/plain)
Blocks
Blocked by
See also
Created attachment 18556
Test case

The attached test case breaks as follows:

po /tmp/non-affine-loop-not-working.ll -polly-allow-nonaffine-loops -polly-process-unprofitable -polly-codegen

Instruction does not dominate all uses!
  %indvar = phi i64 [ %indvar.next, %bb9 ], [ 0, %bb2 ]
  %p_indvar.next = add i64 %indvar, 1
Instruction does not dominate all uses!
  %indvar = phi i64 [ %indvar.next, %bb9 ], [ 0, %bb2 ]
  %0 = add i64 %x, %indvar
LLVM ERROR: Broken function found, compilation aborted!

It seems non-affine loops are broken in this case. Unfortunately, it seems we
have had no code-generation test coverage for them since they were
introduced. :(

As they are disabled by default, this does not break anything in the default
configuration, but we should still fix this and add test cases so that we do
not regress further.
Quuxplusone commented 7 years ago

Attached non-affine-loop-not-working.ll (1681 bytes, text/plain): Test case

Quuxplusone commented 7 years ago
This seems to be broken since:

commit c397ad2f2a4badc0f028e1e60d991f9ddc09bec1
Author: Tobias Grosser <tobias@grosser.es>
Date:   Thu Jan 19 14:12:45 2017 +0000

    BlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts

    Before this change we created an additional reload in the copy of the incoming
    block of a PHI node to reload the incoming value, even though the necessary
    value has already been made available by the normally generated scalar loads.
    In this change, we drop the code that generates this redundant reload and
    instead just reuse the scalar value already available.

    Besides making the generated code slightly cleaner, this change also makes sure
    that scalar loads go through the normal logic, which means they can be remapped
    (e.g. to array slots) and corresponding code is generated to load from the
    remapped location. Without this change, the original scalar load at the
    beginning of the non-affine region would have been remapped, but the redundant
    scalar load would continue to load from the old PHI slot location.

    It might be possible to further simplify the code in addOperandToPHI,
    but this would not only mean to pull out getNewValue, but to also change the
    insertion point update logic. As this did not work when trying it the first
    time, this change is likely not trivial. To not introduce bugs last minute, we
    postpone further simplications to a subsequent commit.

    We also document the current behavior a little bit better.

    Reviewed By: Meinersbur

    Differential Revision: https://reviews.llvm.org/D28892

    git-svn-id: https://llvm.org/svn/llvm-project/polly/trunk@292486

This bug can be worked around for this one test case with the following patch:

-    // Get the reloaded value.
-    OpCopy = getNewValue(Stmt, PHI, BBCopyMap, LTS, getLoopForStmt(Stmt));
+    MemoryAccess *Access = Stmt.getPHIAccessOrNULLFor(PHI);
+    if (Access) {
+      // Get the reloaded value.
+      OpCopy = getNewValue(Stmt, PHI, BBCopyMap, LTS, getLoopForStmt(Stmt));
+    } else {
+      // Get some global variables.
+      Value *Op = PHI->getIncomingValueForBlock(IncomingBB);
+      OpCopy = getNewValue(Stmt, Op, BBCopyMap, LTS, getLoopForStmt(Stmt));
+    }

but this patch does not work in general. The issue is that we expect the
incoming values to be available at the index PHI in the BBMap, which is
generally ensured through a PHI-READ. However, if 'Op' is synthesizable, no
PHI-READ access is added to the model; instead we try to synthesize PHI, which
does not work because the synthesizable expressions still reference the old
IV. Ideas to address this:

1) Make the scalar evolution expression not synthesizable, by checking if any
of its loops is part of a non-affine region.
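
A rough sketch of the check that idea 1 describes, written against toy types
rather than Polly's or LLVM's classes. Every name here (ToyLoop, ToyExpr,
ToyNonAffineRegion, isSynthesizableOutside) is hypothetical and only
illustrates the rule "an expression that refers to a loop inside the
non-affine region must not be treated as synthesizable"; it is not the actual
fix.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are NOT Polly or LLVM types.
struct ToyLoop {
  std::string Name;
};

// A scalar-evolution-like expression, reduced to the set of loops whose
// induction variables it refers to.
struct ToyExpr {
  std::vector<const ToyLoop *> UsedLoops;
};

// A non-affine subregion, reduced to the set of loops it contains.
struct ToyNonAffineRegion {
  std::set<const ToyLoop *> ContainedLoops;
};

// Returns false if the expression depends on any loop inside the region.
// In that case the value would have to be modeled through explicit scalar
// accesses (e.g. a PHI read) instead of being re-synthesized.
bool isSynthesizableOutside(const ToyExpr &E, const ToyNonAffineRegion &R) {
  for (const ToyLoop *L : E.UsedLoops)
    if (R.ContainedLoops.count(L))
      return false;
  return true;
}
```

With such a check, an expression like the `%x + %indvar` from the error output
above would be rejected as synthesizable whenever the loop that owns `%indvar`
lies inside the non-affine region, so the value would be carried through the
scalar access machinery instead.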