eclipse / omr

Eclipse OMR™ Cross platform components for building reliable, high performance language runtimes
http://www.eclipse.org/omr

Develop a specification for TR IL #2569

Open 0xdaryl opened 6 years ago

0xdaryl commented 6 years ago

As the intermediate representation consumed by the Eclipse OMR compiler technology (known as TR IL) is adopted by language runtimes beyond OpenJ9, it is important to ensure its semantics are clearly defined and well understood to prevent inconsistency and ambiguity. This matters not only for consistency between projects consuming OMR but also for consistency across the different architecture backends within a particular project. The goal of this effort is to produce stand-alone documentation (either in the GitHub repo or on the Eclipse OMR website) for TR IL, as well as to improve the documentation in the code.

Some ideas for content and tasks for this effort include:

0xdaryl commented 6 years ago

@vijaysun-omr @mstoodle @andrewcraik @jdmpapin @Leonardo2718 @ymanton @fjeremic @mpirvu @gita-omr @xliang6

A discussion topic for the May 23 Compiler Architecture meeting.

mstoodle commented 6 years ago

Unfortunately, I won't be able to attend this meeting. It looks like you've got more than enough to discuss in a single meeting, though :) .

Some quick observations:

  1. I think Block needs to be included as a part of the IL, since you need to be able to talk about how our trees (DAGs) can reach backwards through an extended basic block but not outside an extended basic block.

  2. It will be easy for this discussion on the specification to devolve into a discussion of the attributes of the current implementation, but our implementation has many areas of "looseness" (dodging many less flattering terms). I suggest we try to focus on essential elements that cover the bulk of the current implementation or that capture the aspirational "spirit" of the Testarossa IL and try to avoid getting bogged down in the plethora of exception cases. One example rathole, since we've gone down it multiple times before, will be the typed versus typeless opcode debate. My opinion (since I won't be there to express it at the meeting directly): TR IL opcodes should be typed except in a few very well documented cases where typed opcode explosion would be massive with very little value (e.g. vector to vector type conversion). Where typeless notions are valuable, we should implement typeless queries on top of a fundamentally typed opcode space. But the core principle should be that opcodes are typed (except where they're not :) ).

  3. (another potential rathole; maybe initially just focus on what the output will be from this meeting?) How do we intend to document, maintain, test?, validate? this specification?

  4. Some other advanced topics to consider: registers, linkages (calling conventions), because aspects of these appear in the IL, though my argument below suggests I don't think they're part of the IL :) .

  5. What is an IL? How about some polished variant of: everything needed to describe the operations that will be performed by a body of code that cannot exist independently of those operations. For example, a TR_ResolvedMethod is something that refers to a method somewhere. The compiler can operate with a TR_ResolvedMethod without needing to know its constituent operations. That TR_ResolvedMethod object does not need to be part of the method being compiled (although it usually will be :) ). On this basis, I think my starting position on whether TR_ResolvedMethod is part of the IL would be: it isn't. But I'm willing to be convinced otherwise.
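
To make point 2 concrete, here is a minimal sketch of what "typeless queries on top of a fundamentally typed opcode space" could look like. Everything in it is an assumption made for illustration: `OpCode`, `Operation`, `DataType`, `kProperties`, `isAdd`, `isSub` and `dataTypeOf` are hypothetical names for a simplified model, not the actual OMR classes or API, and only a handful of add/sub opcodes are modelled.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified model for illustration only; not the OMR API.

// Data types an opcode can operate on.
enum class DataType : std::uint8_t { Int32, Int64, Float, Double };

// Typed opcode space: the type is part of each opcode's identity.
enum class OpCode : std::uint8_t { iadd, ladd, fadd, dadd, isub, lsub, fsub, dsub };

// Typeless notion of "what the opcode does", independent of its type.
enum class Operation : std::uint8_t { Add, Sub };

struct OpCodeProperties {
    Operation operation;
    DataType  type;
};

// One entry per opcode, in the same order as the OpCode enumerators.
constexpr OpCodeProperties kProperties[] = {
    {Operation::Add, DataType::Int32},   // iadd
    {Operation::Add, DataType::Int64},   // ladd
    {Operation::Add, DataType::Float},   // fadd
    {Operation::Add, DataType::Double},  // dadd
    {Operation::Sub, DataType::Int32},   // isub
    {Operation::Sub, DataType::Int64},   // lsub
    {Operation::Sub, DataType::Float},   // fsub
    {Operation::Sub, DataType::Double},  // dsub
};

constexpr OpCodeProperties propertiesOf(OpCode op) {
    return kProperties[static_cast<std::size_t>(op)];
}

// Typeless queries layered on top of the typed opcodes.
constexpr bool isAdd(OpCode op) {
    return propertiesOf(op).operation == Operation::Add;
}

constexpr bool isSub(OpCode op) {
    return propertiesOf(op).operation == Operation::Sub;
}

// The type is always recoverable from the opcode alone.
constexpr DataType dataTypeOf(OpCode op) {
    return propertiesOf(op).type;
}
```

In this sketch an optimization that only cares about "is this an add?" can call `isAdd(op)` without enumerating every typed variant, while a code generator can still rely on `dataTypeOf(op)` because no opcode is ever untyped.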

0xdaryl commented 6 years ago

Thanks for your input, @mstoodle. Our initial meeting (#2570) is really just to introduce this topic and to get everyone thinking about it. The actual formulation of such a specification is likely to take time (and will be revisited in subsequent arch meetings). I would like to hear thoughts similar to yours on what people think should or should not be part of any specification.

dibyendumajumdar commented 6 years ago

Hi - is there an update on this? I understood that an initial version of the doc would be available soon?

0xdaryl commented 6 years ago

I'm expecting to have something describing the structure of the IL and basic concepts by the end of summer (July/August timeframe). It won't land in one blob either--you will see incremental contributions as I have time to add to it. A true specification covering each and every opcode in TR IL will take longer, but I will set a goal of having taken at least a first pass through all the opcodes by the end of the year.

Sorry for the delay (usual excuses about other work getting in the way apply). This is a priority for this project and is long overdue for this technology. Your experience integrating the compiler code with your projects and the questions you've asked are very valuable in helping us understand where the weak spots in the documentation are and how the onboarding to new projects can be improved (thank you!). Please keep the questions coming, and we hope to provide quick and meaningful answers to keep you productive!