Marking Nodes vs SymRefs as Internal Pointers

PushkarBettadpur commented 4 years ago

I recently noticed that both Node and Symbol offer APIs to mark themselves as internal pointers. For reference:

https://github.com/eclipse/omr/blob/a67e3890d6d1c5538539aceaa34a3287d6efb434/compiler/il/OMRSymbol.hpp#L260

and

https://github.com/eclipse/omr/blob/a67e3890d6d1c5538539aceaa34a3287d6efb434/compiler/il/OMRNode.cpp#L6103-L6111

Conceptually, what are the differences between the two, or in other words, in what situations would we want to mark a node as an internal pointer as opposed to a symbol reference?

Looping in @0xdaryl @andrewcraik for expertise.

Leonardo2718 commented 4 years ago

Maybe @vijaysun-omr can also provide input.

vijaysun-omr commented 4 years ago

Internal pointer nodes can take basically three forms in our IL:

aladd or aiadd
aload
aRegLoad

aladd and aiadd do not have a symbol reference, so the node itself is marked as an internal pointer: it is a property of that specific node, and the pinning array symbol (i.e. a pointer to the start of the array object that forms the basis for an internal pointer) is a field on the node. A symbol created to hold an internal pointer is a special one anyway (it extends automatic symbol in the symbol hierarchy), since it adds the notion of a pinning array pointer, which is used during activities such as auto recycling and stack mapping. The internal pointer symbol's pinning array is an invariant property of the internal pointer symbol reference, regardless of how many nodes use that symbol reference. So, rather than find a field in the node (in the midst of the unions in the node data structure) for aload/aRegLoad to hold the pinning array pointer, it was decided to take that information directly from the symbol reference.
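To make the split concrete, here is a minimal sketch of where the pinning array lives for each of the three forms. It is a toy model, not the OMR classes: `AutoSymbol`, `InternalPointerSymbol`, `Node`, `isInternalPointer` and `pinningArrayOf` are hypothetical names chosen for illustration, and the real TR::Node and TR::Symbol layouts differ.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy model only: these types are assumptions for illustration,
// not the real TR::Node / TR::Symbol classes.
enum class Opcode { aladd, aiadd, aload, aRegLoad };

struct AutoSymbol {
    explicit AutoSymbol(std::string n) : name(std::move(n)) {}
    virtual ~AutoSymbol() = default;
    virtual bool isInternalPointer() const { return false; }
    std::string name;
};

// An internal pointer symbol extends the automatic symbol and carries
// its pinning array as an invariant property of the symbol.
struct InternalPointerSymbol : AutoSymbol {
    InternalPointerSymbol(std::string n, const AutoSymbol* pin)
        : AutoSymbol(std::move(n)), pinningArray(pin) {
        assert(pin != nullptr);
    }
    bool isInternalPointer() const override { return true; }
    const AutoSymbol* pinningArray;
};

struct Node {
    Opcode op;
    const AutoSymbol* symbol;        // set for aload/aRegLoad, null for aladd/aiadd
    bool markedInternalPointer;      // only meaningful for aladd/aiadd
    const AutoSymbol* pinningArray;  // only meaningful for aladd/aiadd
};

// aladd/aiadd: consult the flag on the node.
// aload/aRegLoad: consult the symbol the node refers to.
bool isInternalPointer(const Node& n) {
    if (n.op == Opcode::aladd || n.op == Opcode::aiadd)
        return n.markedInternalPointer;
    return n.symbol != nullptr && n.symbol->isInternalPointer();
}

// Returns the pinning array, or nullptr if the node is not an internal pointer.
const AutoSymbol* pinningArrayOf(const Node& n) {
    if (!isInternalPointer(n))
        return nullptr;
    if (n.op == Opcode::aladd || n.op == Opcode::aiadd)
        return n.pinningArray;
    const auto* ip = dynamic_cast<const InternalPointerSymbol*>(n.symbol);
    return ip != nullptr ? ip->pinningArray : nullptr;
}
```

In this sketch every aload of the same internal pointer symbol reports the same pinning array without any per-node storage, which is the invariant described above.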

fjeremic commented 4 years ago

So the only reason we have the TR::Node APIs for getting/setting internal pointers is that the aiadd/aladd nodes do not have symbol references, and we need a way to query whether the result of such a node holds an internal pointer. Rather than traversing the sub-tree to look for the node whose symbol is marked as an internal pointer, we effectively cache that fact in the TR::Node itself.

In other words, we could have implemented internal pointers with only the TR::Symbol APIs, but that would be more costly as every time we want to know if a node is holding an internal pointer we would have to traverse the sub-tree to look for any nodes with symbols which are marked as internal pointers. Is this correct?

vijaysun-omr commented 4 years ago

For an aiadd/aladd node, we may not have any symbols that are marked as internal pointer symbols in the subtree. It is those nodes themselves that are the internal pointers, and the first child (an address) may just be a load of an array object (i.e. not an internal pointer). So there may not be any internal pointer symbol in that aiadd/aladd tree at all, but we do need a place to store the pinning array pointer, hence the field on the aiadd/aladd node for that purpose.

With internal pointer support, aiadd/aladd nodes themselves can be commoned across a GC point as long as the pinning array is set up properly, i.e. there is no need to create an internal pointer symbol if such a transformation occurs due to local commoning. Other optimizations, such as loop strider and array copy transformations, need to hold an internal pointer value (i.e. the result of an aiadd/aladd) in an auto or global register across a basic block boundary, and it is in these cases that an internal pointer symbol is used.

Finally, the first child of an aiadd/aladd node that is marked as an internal pointer does not have to be a pinning array. The first child can be either a load of a pinning array OR a load of another internal pointer symbol that holds the result of some prior aladd/aiadd computation. Both can be tolerated as long as the pinning array information on the node is set up correctly by the transformation that marked the node as an internal pointer.
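As a rough sketch of that last point, a transformation marking an aiadd/aladd could resolve the pinning array from either kind of first child as below. This is a simplified model under stated assumptions: `Symbol`, `Node` and `markAsInternalPointer` are hypothetical names rather than OMR APIs, and the first child is assumed to be a direct load of a symbol.

```cpp
#include <cassert>

// Toy model only: hypothetical types, not the real OMR data structures.
struct Symbol {
    bool isInternalPointer;      // true for an internal pointer auto
    const Symbol* pinningArray;  // set only when isInternalPointer is true
};

struct Node {
    const Symbol* loadedSymbol;  // symbol read by a load; null for aladd/aiadd
    const Node* firstChild;      // address child of an aladd/aiadd; null for loads
    bool markedInternalPointer;
    const Symbol* pinningArray;
};

// Marks an aladd/aiadd node as an internal pointer and records its pinning array.
// Assumes the first child is a direct load of a symbol (a simplification).
void markAsInternalPointer(Node& add) {
    assert(add.firstChild != nullptr);
    const Symbol* base = add.firstChild->loadedSymbol;
    assert(base != nullptr);

    // Case 1: the first child loads the array itself, so that is the pinning array.
    // Case 2: the first child loads another internal pointer symbol, so the
    //         pinning array is inherited from that symbol.
    add.pinningArray = base->isInternalPointer ? base->pinningArray : base;
    add.markedInternalPointer = true;
}
```

Either way the node ends up with the same pinning array, so later consumers never need to walk the subtree to find it.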

fjeremic commented 4 years ago

Excellent. Thanks for the explanations. @PushkarBettadpur did that make sense? If so, could you please summarize the discussion here into a Doxygen comment on the APIs in question so that it is right there in front of us in the code?

PushkarBettadpur commented 4 years ago

Yes, this makes sense. Thanks for the explanations @vijaysun-omr. I'll add the comment as well @fjeremic.

eclipse / omr

Marking Nodes vs SymRefs as Internal Pointers #4871