serializeTrivialDetachedBlock() doesn't check that the block it is serializing is really detached by the preceding detaches.

VictorYing commented 6 years ago

I just diagnosed a Swarm compiler crash and am wondering if there are any invariants that would prevent a similar crash from happening in Tapir, or you would want a change that would fix my problem to also be merged into Tapir.

Inside of SimplifyCFG, if a block is empty except for phi nodes and is terminated by a reattach, then we check the following condition to determine whether to serialize the block into its predecessors: https://github.com/wsmoses/Tapir-LLVM/blob/ba2f0eda34108ab64e581f2740b4d06eecc50709/lib/Transforms/Utils/SimplifyCFG.cpp#L6012-L6016

(Ugh, there's a tab causing alignment issues there.)

If we get through this loop without returning false, we precede to attempt to replace all the preceding detaches with branches in order to serialize them.

The problem is, this check is not carefully enough. I think It should have been

// Scan predecessors to verify that all of them detach BB.
for (BasicBlock *PredBB : predecessors(BB)) {
  auto *DI = dyn_cast<DetachInst>(PredBB->getTerminator());
  if (!DI || DI->getDetached() != BB)
    return false;
}

I want to make sure to check whether DI->getDetached() == BB inside the loop, in order to deal with cases like the following. Consider a CFG that starts out like this:

A:[detach] --detach-edge--> B:[detach] --detach-edge--> C:[some computation; reattach] -->reattach-edge-->
       \                         \_______________continue_edge___________________________________________> D:[reattach] --reattach-edge-->
        \__________________________________continue_edge_________________________________________________________________________________> E:[whatever continuation]

So A detaches B which detaches C, which does some work and reattaches to D, which immediately reattaches to E.

Now consider what happens if, during some optimization, the compiler discovers that C causes undefined behavior/will always crash, so it is impossible to reach C's terminator, the reattach to D. We're left with:

A:[detach] --detach-edge--> B:[detach] --detach-edge--> C:[crash; unreachable]
       \                         \_______________continue_edge________________> D:[reattach] --reattach-edge-->
        \__________________________________continue_edge______________________________________________________> E:[whatever continuation]

So now consider what happens if serializeTrivialDetachedBlock() runs on block D, before serializeDetachOfUnreachable() runs on block C. D contains nothing but a reattach, and its single predecessor, block B, terminates in a detach. In this case serializeTrivialDetachedBlock() decides it should try to serialize the detach in block B, but this makes no sense. It will end up crashing the compiler when it reaches this assertion: https://github.com/wsmoses/Tapir-LLVM/blob/ba2f0eda34108ab64e581f2740b4d06eecc50709/lib/Transforms/Utils/SimplifyCFG.cpp#L6022-L6023 I'm thankful that this assertion was here to help me quickly narrow down my bug.

VictorYing commented 6 years ago

For reference, the commit I made that fixed this in our Swarm compiler is here: https://github.mit.edu/swarm/Swarm-IR/pull/414/commits/3bed595b89c95b3d26ffb4b9d2e3a93978193c94

wsmoses commented 6 years ago

Will take a look shortly (apologies was away recently)

wsmoses commented 6 years ago

Yes I agree this should be done (checking the detach / reattach are of the same scope). There probably should be another CFG simplification done to allow splitting of unrelated detaches, but regardless your change would be welcomed.

Would you like to make a PR or should I add it?

VictorYing commented 6 years ago

I'm a bit busy with non-compiler work today. Could you do it?

wsmoses / Tapir-LLVM

serializeTrivialDetachedBlock() doesn't check that the block it is serializing is really detached by the preceding detaches. #72