implement single return invariant (diamond shaped procedures)

ailrst commented 7 months ago

Adds return statement and ensures a procedure always has exactly one return statement (which may not always be reachable).

To simplify this we also make the entryBlock and returnBlock of a procedure always defined. The returnBlock must contain a return, and the entryBlock is assumed to have no blocks that jump to it.

If a procedure is constructed with no blocks then it consists of an empty entryBlock that jumps directly to the returnBlock.
Adds internalBlocks containing the mutable block list
Now Procedure.blocks returns entryBlock ++ internalBlocks ++ returnBlock.
Removes all the cruft maintaining the entry and return blocks and creating them from the added blocks in case they are removed: they are now not allowed to be removed.
Adds Procedure.hasImplementation method to define whether the procedure is a stub.
the addBlock and removeBlock methods on procedure ensure that the only return statement in the procedure is the one in the return block by:
1. When a block is added its return call is replaced with a GoToReturn(returnBlock) (where gotoReturn is a subclass of GoTo that only allows one target that cannot be changed).
2. When a block is removed its return call is replaced with a Return again.

l-kent commented 6 months ago

What's the point of mandating empty entry blocks? Doesn't this just make Procedure completely redundant as a node in the CFG? It doesn't really make sense to have both?

ailrst commented 6 months ago

Rather than requiring it has no statements I think its more important to require it has no incoming (intraprocedural) jumps and only one outgoing jump.

The entry block cannot be added/removed/replaced like the rest of the blocks which makes the everything simpler because it can't be undefined, but means you can't just visit it and replace it, or add blocks before it, so its simpler and more consistent to have the entry and exit be "outside the implementation" and somewhat immutable. E.g. now setting the 'entry block' is linking the control flow rather than setting a field in procedure, which is consistent with any other control flow modification. The main thing I was thinking of was if the entry block was the head of the loop you might have to adapt the loop reducing algorithm or special case the entry block.
Yes it makes the procedure cfg node somewhat redundant from the perspective of analyses, I think its useful to have the entry/exit symmetry at the block level, as you've said ideally the iterator works at the block level not over the 3 levels of the heirarchy. I'm hesitant to remove the procedure node from the cfg though because it might make the code more convoluted but I haven't really looked at it.

l-kent commented 6 months ago

Looking at removing Procedure from the CFG sounds like a good idea to me

l-kent commented 6 months ago

We probably need to figure out more broadly how this should relate to non-returning procedures, and how they should be handled in the analyses (which generally do not account for their existence).

ailrst commented 3 months ago

We probably need to figure out more broadly how this should relate to non-returning procedures, and how they should be handled in the analyses (which generally do not account for their existence).

I think to address this we might add a "Halt" statement, and just have an analysis determine for each procedure if all paths reach a halt statement or non-returning call. I don't think it necessarily interacts with this PR, because we don't want a single halt per-procedure in the same way we do returns for backwards analysis.

Otherwise, I think this PR is overcomplicated. I like having a fixed entry and return block, but the automatic rewriting of Returns into jumps to the exit block is unneccessary. It would make more sense to just have an analysis find all the returns and rewrite the procedure into a diamond shape, and have an invariant checker make sure this stays the case through subsequent transforms.

l-kent commented 3 months ago

Is the idea that the 'halt' statement would follow a non-returning call? This could just be an 'assume false' then, but it could be made its own special type for ease of matching.

I agree that automatically rewriting returns like this is more complicated than necessary, and what you now suggest sounds like a better approach.

l-kent commented 3 months ago

I think that creating a new Return class is a good idea though. It's worth making a simpler PR that just does that part of it.

UQ-PAC / BASIL

implement single return invariant (diamond shaped procedures) #178