Open 0xdaryl opened 6 years ago
Once each architecture uses a common OOL scheme, the logic that exists in each binary encoding phase to create exception table entries for the OOL sequences can be commoned. Search for addExceptionRangeForSnippet in the context of OOL instructions to see where this needs to be done.
Out-of-line (OOL) instructions (or "Outlined Instructions" on X86) are a means by which a code generator can inject some local control flow into a normal instruction sequence, typically to handle exceptional or unlikely events. Because the control flow is localized, it allows transfers of control outside of a basic block from within a basic block, and possibly a merge back into the instruction stream within the same basic block. Uses may include handling unresolved data references, handling arithmetic instruction overflows, and performing interpreted method dispatch prior to the method being compiled.
An OOL sequence is typically implemented by creating a new OOL object that encapsulates a particular sequence of instructions, switching to an alternate instruction stream pointer, generating start and end label instructions, emitting instructions to the new stream between the labels (or in some cases evaluating a Node directly), and switching the instruction stream back when finished. The register assigners can detect when a branch is either to or from an OOL sequence, and handle the situation specially because register state must be preserved exactly. Finally, the OOL sequences are iterated and emitted after the body of the method is encoded.
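The flow described above can be sketched with a toy model. This is only an illustration under stated assumptions: instructions are plain strings, and `ToyCodeGenerator`, `OutOfLineSection`, `swapToOutOfLine`, `swapToMainline`, and `finalize` are hypothetical names, not the API of any existing code generator. Register assignment is not modelled at all.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical, simplified model: an "instruction" is just a string.
using InstructionStream = std::vector<std::string>;

struct OutOfLineSection {
    std::string entryLabel;    // branch target reached from the mainline
    std::string restartLabel;  // merge point back in the mainline
    InstructionStream body;    // instructions generated out of line
};

class ToyCodeGenerator {
public:
    ToyCodeGenerator() : _current(&_mainline) {}

    // Append to whichever stream is currently active.
    void emit(const std::string &instr) { _current->push_back(instr); }

    OutOfLineSection &createSection(const std::string &entry,
                                    const std::string &restart) {
        // std::deque keeps references to earlier sections valid on push_back.
        _sections.push_back(OutOfLineSection{entry, restart, {}});
        return _sections.back();
    }

    // Redirect emit() into the section's private stream and emit its start label.
    void swapToOutOfLine(OutOfLineSection &ool) {
        assert(_current == &_mainline && "nested OOL is not modelled here");
        _current = &ool.body;
        emit(ool.entryLabel + ":");
    }

    // Close the section with a branch to the merge point and resume the mainline.
    void swapToMainline(OutOfLineSection &ool) {
        assert(_current == &ool.body);
        emit("b " + ool.restartLabel);
        _current = &_mainline;
    }

    // After the method body is complete, append every OOL section in order.
    InstructionStream finalize() const {
        InstructionStream out = _mainline;
        for (const OutOfLineSection &s : _sections)
            out.insert(out.end(), s.body.begin(), s.body.end());
        return out;
    }

private:
    InstructionStream _mainline;
    std::deque<OutOfLineSection> _sections;
    InstructionStream *_current;
};

// Example: an add whose overflow case is handled out of line.
InstructionStream buildExample() {
    ToyCodeGenerator cg;
    cg.emit("add r1, r2");
    cg.emit("bo overflowOOL");  // unlikely path: branch on overflow
    OutOfLineSection &ool = cg.createSection("overflowOOL", "restart");
    cg.swapToOutOfLine(ool);
    cg.emit("call overflowHelper");
    cg.swapToMainline(ool);
    cg.emit("restart:");
    cg.emit("ret");
    return cg.finalize();
}
```

In `buildExample()` the mainline stays contiguous (`add`, `bo`, `restart:`, `ret`) and the overflow handling lands after it, which is the layout property that makes OOL sequences attractive for unlikely paths.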
The implementation of OOL instructions differs somewhat in each code generator. There was an attempt at some point to introduce a common TR_OutOfLineCodeSection that serves as the base class for the implementation on P, Z, ARM, and AArch64, but not much actual functionality is shared. The X86 implementation uses similar concepts but also has an independent implementation. The independent implementations also mean that functionality or fixes that have occurred in one code generator may not have been applied to the others. With some re-design I think a lot more of the logic can be shared, which will improve maintenance and cross-codegen functionality.
Desired Features
I believe Z and X86 have the most "mature" implementations and should be harvested for functionality ideas, including:
Handle arbitrary sequences of instructions, mapped back to an owning TR::Node if applicable. Handling the evaluation of Nodes in the OOL sequence is handy (and desired in some contexts), but it must be done very carefully to ensure TR IL ordering semantics are maintained. At present, only Nodes that have been fabricated by the code generator (e.g., a call node) can be evaluated OOL.
Support the creation of metadata (such as GC maps) in the OOL path that covers the range of instructions.
Use RAII to hide the management of the instruction stream manipulations (similar to what's implemented in #2640).
It is desirable that the logic for taking and restoring the register state "snapshot" from Machine be shared as much as possible. Even if the exact implementation of these functions can't be shared (yet), a common, documented API should be established.
The logic for managing the OOL instruction list, iterating over each section, and swapping instruction streams should be shared as well.
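As a rough illustration of the RAII and snapshot points above, here is a minimal sketch. All names (`ToyCodeGen`, `InstructionStreamSwap`, `RegisterStateGuard`, `RegisterSnapshot`) are hypothetical, and the register state is reduced to a map from virtual to real register names. It does not reproduce what #2640 implements, and it does not model which compilation phase each guard would run in; the two guards are shown side by side only for brevity.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using InstructionStream = std::vector<std::string>;

// Hypothetical stand-in for the Machine's register state:
// virtual register name -> assigned real register name.
using RegisterSnapshot = std::map<std::string, std::string>;

struct ToyCodeGen {
    InstructionStream mainline;
    InstructionStream *current = &mainline;  // active append target
    RegisterSnapshot registerState;
    void emit(const std::string &instr) { current->push_back(instr); }
};

// RAII guard: redirects emission to another stream for the guard's lifetime.
class InstructionStreamSwap {
public:
    InstructionStreamSwap(ToyCodeGen &cg, InstructionStream &target)
        : _cg(cg), _saved(cg.current) {
        _cg.current = &target;
    }
    ~InstructionStreamSwap() { _cg.current = _saved; }
    InstructionStreamSwap(const InstructionStreamSwap &) = delete;
    InstructionStreamSwap &operator=(const InstructionStreamSwap &) = delete;

private:
    ToyCodeGen &_cg;
    InstructionStream *_saved;
};

// RAII guard: takes a register state snapshot and restores it exactly on exit.
class RegisterStateGuard {
public:
    explicit RegisterStateGuard(ToyCodeGen &cg)
        : _cg(cg), _snapshot(cg.registerState) {}
    ~RegisterStateGuard() { _cg.registerState = _snapshot; }
    RegisterStateGuard(const RegisterStateGuard &) = delete;
    RegisterStateGuard &operator=(const RegisterStateGuard &) = delete;

private:
    ToyCodeGen &_cg;
    RegisterSnapshot _snapshot;
};

struct DemoResult {
    InstructionStream mainline;
    InstructionStream ool;
    RegisterSnapshot registers;
};

DemoResult runDemo() {
    ToyCodeGen cg;
    InstructionStream ool;
    cg.registerState["v1"] = "r3";
    cg.emit("cmp r3, 0");
    {
        InstructionStreamSwap swap(cg, ool);  // emit() now targets `ool`
        RegisterStateGuard regs(cg);          // snapshot taken here
        cg.registerState["v1"] = "r9";        // OOL path disturbs the state
        cg.emit("call helper");
    }                                         // state, then stream, restored
    cg.emit("ret");
    return DemoResult{cg.mainline, ool, cg.registerState};
}
```

Because restoration happens in the destructors, an early return or an exception inside the scoped block cannot leave the code generator pointing at the wrong stream or holding a modified register state.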
The ultimate goal once a robust and general-purpose OOL framework is in place is to deprecate the use of hand-crafted Snippets for special-case processing tasks. There is no reason that OOL sequences can't be laid out in a very precise and controlled manner, if necessary. Such sequences will be easier to construct and maintain.
Challenges
I suspect that satisfying some of these requirements will have a "viral" effect on other aspects of the code generator. For instance, it is likely the register dependency mechanism may require some commoning as well. This is a welcome step in its evolution in my opinion, and some overhaul/re-architecting of this core piece of infrastructure is long overdue given our many years of experience working with its benefits and shortcomings (a new epic issue will be created to discuss this).
Some of the register-related logic is implemented independently in the different code generators because it depends on a particular code generator's real registers (and perhaps more specifically the real register enum). Providing a generic notion of real register identity would go a long way to enabling more code to be shared. The special purpose registers (e.g., SpilledReg, NoReg) are also part of this enum in each code generator; they have similar meaning on all code generators, yet they are duplicated (another epic issue will be created to discuss this).
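One possible shape for a generic real register identity is sketched below, purely as a discussion aid. `RealRegisterId`, `fromArchIndex`, `isSpilled`, and the `toy_x86` and `toy_power` namespaces are hypothetical; only the names SpilledReg and NoReg come from the existing enums, and their numeric values here are made up. The assumption is that special-purpose values occupy a shared low range and each code generator numbers its own registers above it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical common identity for a real register. Values below
// FirstArchRegister mean the same thing on every architecture.
class RealRegisterId {
public:
    enum Special : std::uint16_t {
        NoReg = 0,
        SpilledReg = 1,
        FirstArchRegister = 2
    };

    constexpr explicit RealRegisterId(std::uint16_t raw) : _raw(raw) {}

    // Map an architecture-local register index into the shared numbering.
    static constexpr RealRegisterId fromArchIndex(std::uint16_t index) {
        return RealRegisterId(
            static_cast<std::uint16_t>(FirstArchRegister + index));
    }

    constexpr bool isSpecial() const { return _raw < FirstArchRegister; }

    // Only meaningful when isSpecial() is false.
    constexpr std::uint16_t archIndex() const {
        return static_cast<std::uint16_t>(_raw - FirstArchRegister);
    }

    constexpr bool operator==(RealRegisterId other) const {
        return _raw == other._raw;
    }

private:
    std::uint16_t _raw;
};

// Architecture-specific names layered on top (illustrative only).
namespace toy_x86 {
constexpr RealRegisterId eax = RealRegisterId::fromArchIndex(0);
}
namespace toy_power {
constexpr RealRegisterId gr0 = RealRegisterId::fromArchIndex(0);
}

// Shared logic can now be written once, without naming any architecture.
constexpr bool isSpilled(RealRegisterId reg) {
    return reg == RealRegisterId(RealRegisterId::SpilledReg);
}
```

With something along these lines, code such as the register state snapshot could test for SpilledReg or NoReg without including any architecture-specific enum, while each code generator keeps its own register names.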
This issue will serve as the umbrella epic to track the work of a unified outlined instructions framework.