
CRuby + OMR Technology Preview

What about running Java on top of OMR? #9

Open simonis opened 8 years ago

simonis commented 8 years ago

Hi,

this all sounds very exciting and interesting! I hope you'll open source the complete OMR soon. Until then I have several questions:

  1. Will it be easy to build a complete JDK on top of OMR (e.g. by using the class libraries from the OpenJDK project)?
  2. Which JITs and GCs from IBM J9 will be open sourced in OMR?
  3. Could you please give some more details about exactly which platforms (i.e. OS/CPU combinations) will be supported by OMR?
  4. What does OMR stand for?

Thanks, Volker

stefanmb commented 8 years ago

Edit: Actually, I'll defer to @charliegracie about this. Sorry for the confusion.

simonis commented 8 years ago

Thanks a lot for the quick answer, Stefan!

charliegracie commented 8 years ago

Hi,

  1. The short answer is no for the time being. IBM Java 8 was built using the components present in OMR at that time, and Java 9 development is consuming OMR changes as we make them. Java 9 development is one of our proof points showing that this technology can be consumed in a runtime. For a runtime to use OMR it has to provide its own implementation of the "glue layer", which is an API contract for communicating between the language and OMR (a hypothetical sketch of what such a contract might look like follows this list). The J9 glue layer will not be available as part of the OMR project. To build a complete JDK on OMR from OpenJDK you would need a different implementation of the glue layer.
  2. The initial release of OMR code will contain the GC components used to build the -Xgcpolicy:optthruput and -Xgcpolicy:optavgpause policies. We are still completing the work to decouple the technologies underpinning -Xgcpolicy:gencon and -Xgcpolicy:balanced from Java, and they will be contributed to the project as they are completed. Currently we plan to complete the work for the gencon technologies first and then the balanced technologies. The core pieces of the IBM JIT technology will be made available as part of the OMR project. I am the GC technical lead, so I am not the best person to answer about the finer-grained details of the JIT; I will leave that to @mstoodle.
  3. The current plan for supported platforms is: Linux (x86, ppc, arm, and 390), AIX, z/OS, and Windows.
  4. OMR does not stand for anything. It was an internal code name for the project, similar to J9, that just sort of stuck. It could be an abbreviation for many different names, but it is just a code name.
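
To make the "glue layer" in point 1 a little more concrete, here is a purely hypothetical C++ sketch of the kind of contract meant: a bundle of callbacks the language runtime hands to the shared components. Every name in it (`GlueLayer`, `GlueObjectModel`, `objectSize`, and so on) is invented for illustration and is not the actual OMR API.

```cpp
// Hypothetical illustration only: these names are invented for this sketch
// and are not the real OMR glue API.
#include <cstddef>
#include <cstdio>

// A language runtime supplies callbacks like these so that shared GC
// components can ask language-specific questions (object layout, root
// scanning) without knowing anything about the language itself.
struct GlueObjectModel {
    // Report an object's size so the collector can walk the heap.
    std::size_t (*objectSize)(void *object);
    // Visit every reference slot in an object during marking.
    void (*scanObjectSlots)(void *object, void (*visit)(void **slot));
};

struct GlueRootScanner {
    // Enumerate the runtime's roots (thread stacks, globals, handles).
    void (*scanRoots)(void (*visit)(void **slot));
};

// The "glue layer" is essentially this contract: the bundle of callbacks a
// runtime registers when it brings up the shared components.
struct GlueLayer {
    GlueObjectModel objectModel;
    GlueRootScanner rootScanner;
};

// A trivial stand-in "runtime" wiring up the contract.
static std::size_t myObjectSize(void *) { return 16; }
static void myScanObjectSlots(void *, void (*)(void **)) {}
static void myScanRoots(void (*)(void **)) {}

int main() {
    GlueLayer glue{{myObjectSize, myScanObjectSlots}, {myScanRoots}};
    std::printf("object size: %zu\n", glue.objectModel.objectSize(nullptr));
}
```

Because the callbacks are the only place language knowledge lives, J9 and CRuby can each supply their own implementation without the shared components changing, which is also why the J9 implementation can't simply be reused for another runtime.
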
mstoodle commented 8 years ago

Thanks, Charlie. For all components, including the JIT compiler, we're planning to bring anything to OMR that could be useful in a language-independent way.

You asked about platforms. The platforms in Charlie's list are simply the ones the IBM Runtimes team has traditionally focused on, so those are the ones where we have an implementation to contribute. We'll contribute those, though not all at once. Other platforms will be welcome in the OMR project.

Charlie's already covered the GC stuff.

On the JIT front, the tech we'll bring to OMR is what's at the heart of all IBM Testarossa-based compilers, which include the Ruby JIT you can try in the Ruby+OMR Technology Preview, Java, COBOL (a compiler and a reoptimizer), C/C++, a trace compiler that accelerates a binary emulator for IBM mainframe (Z) systems, and a few other things. In the fullness of time, we will be contributing:

There's also some other stuff that's useful for VMs that's not really "JIT" or "GC", like: diagnostic support for working with runtimes and languages, integrated support for things like RDMA, GPU, FPGA, etc., our cooperative suspend model for coordinating activities within the VM, asynchronous event handling, and more.
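
To give a rough sense of the cooperative suspend idea mentioned above (the general concept only; this is not OMR's implementation, and the class and names below are invented for illustration): mutator threads poll a flag at their safe points and park, and a coordinator waits until everyone has parked before doing its exclusive work.

```cpp
// Generic sketch of cooperative suspend, not OMR code: mutator threads poll a
// flag at safe points and park; a coordinator waits until everyone is parked,
// does its exclusive work (e.g. a stop-the-world GC phase), then releases them.
#include <atomic>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

class CooperativeSuspend {
public:
    explicit CooperativeSuspend(int mutatorCount) : mutators(mutatorCount) {}

    // Coordinator: ask every mutator to pause, then wait until they all have.
    void suspendAll() {
        std::unique_lock<std::mutex> lock(mtx);
        suspendRequested.store(true);
        allParked.wait(lock, [this] { return parked == mutators; });
    }

    // Coordinator: let the mutators run again.
    void resumeAll() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            suspendRequested.store(false);
        }
        resume.notify_all();
    }

    // Mutator: called at each safe point (e.g. allocation sites, loop back-edges).
    void checkSafePoint() {
        if (!suspendRequested.load()) {
            return; // fast path: nobody asked us to pause
        }
        std::unique_lock<std::mutex> lock(mtx);
        ++parked;
        allParked.notify_one();
        resume.wait(lock, [this] { return !suspendRequested.load(); });
        --parked;
    }

private:
    std::mutex mtx;
    std::condition_variable allParked;
    std::condition_variable resume;
    std::atomic<bool> suspendRequested{false};
    const int mutators;
    int parked = 0;
};

int main() {
    CooperativeSuspend cs(2);
    std::atomic<bool> done{false};

    // Two "mutator" threads that hit a safe point on every loop iteration.
    std::vector<std::thread> threads;
    for (int i = 0; i < 2; ++i) {
        threads.emplace_back([&] {
            while (!done.load()) {
                cs.checkSafePoint();
            }
        });
    }

    // The "coordinator" stops the world once, then lets it go.
    cs.suspendAll();
    std::puts("all mutators parked; doing exclusive work");
    cs.resumeAll();

    done.store(true);
    for (auto &t : threads) {
        t.join();
    }
    return 0;
}
```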

That's a lot of stuff. It will not all show up in the first code drop, I'm afraid. My "fullness of time" comment above is simply a reference to how much work it is to refactor and clean everything up for contribution to the OMR project. We're talking about hundreds of thousands of lines of code.

We're working on the initial code drop now. From there, we'll be able to talk about our roadmap to get more things into the open so that people can see what's coming from IBM. IBM will continue to use and work on the code in the OMR project, as we hope others will too. We don't intend to fork the project for internal-only use, except possibly while we are stabilizing for one of our own releases, and any fixes from that work will also go back to the community code base.

You will have noticed there are some things that would really help people build runtimes that aren't in my list above, such as infrastructure for writing bytecode- and AST-based interpreters, for translating source code into bytecode or ASTs, best practices around designing bytecode sets or ASTs, and lots of other things. Other people have worked on this kind of thing, and we probably won't have much code to contribute in these areas. But the grand purpose of OMR is not just for IBM to contribute its code for others to use. I would love for the OMR project to become a place where people who work on any kind of runtime can come to collaborate and share not just code but also their knowledge and wisdom around building language runtimes.
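
For readers less familiar with that kind of infrastructure, here is a deliberately tiny, self-contained sketch of what a bytecode interpreter loop looks like: a toy stack machine with a handful of opcodes. It is purely illustrative and has nothing to do with OMR or with CRuby's actual bytecode.

```cpp
// Toy stack-machine interpreter, purely to illustrate the kind of
// "write a bytecode interpreter" infrastructure mentioned above.
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

enum Opcode : std::uint8_t { PUSH, ADD, MUL, PRINT, HALT };

void interpret(const std::vector<std::uint8_t> &code) {
    std::vector<std::int64_t> stack;
    std::size_t pc = 0;
    for (;;) {
        switch (code[pc++]) {
        case PUSH:                        // next byte is an immediate operand
            stack.push_back(code[pc++]);
            break;
        case ADD: {                       // pop two values, push their sum
            std::int64_t b = stack.back(); stack.pop_back();
            stack.back() += b;
            break;
        }
        case MUL: {                       // pop two values, push their product
            std::int64_t b = stack.back(); stack.pop_back();
            stack.back() *= b;
            break;
        }
        case PRINT:                       // print the top of the stack
            std::cout << stack.back() << "\n";
            break;
        case HALT:
            return;
        }
    }
}

int main() {
    // (2 + 3) * 4 = 20
    interpret({PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, PRINT, HALT});
}
```

Real interpreter infrastructure layers a great deal on top of a loop like this (operand encoding, dispatch techniques, profiling hooks for a JIT), which is exactly the kind of shared knowledge a project like OMR could help collect.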

The more runtimes that get involved in this effort, the better, as that is what will make this technology relevant, more reliable, and as capable as possible for everyone. From my perspective, all runtimes are welcome, all platform implementers are welcome, all tool implementers are welcome, people who are just "interested" are welcome, and we hope everyone will get involved. The goal is to create the language runtime developer's toolkit. And just like you don't always need a screwdriver or a hammer for the job you have to do today, it's going to have stuff in it that not everybody needs. But hopefully, everybody will eventually be able to find something they can use to build even more awesome runtimes than we have today!

OK, I'm down from my soapbox now for a project that isn't yet in the open. :) I hope that helps fill in more gaps about our intentions and goals around the OMR technology. We'll get back to work on the initial drop now so we can make it a reality soon.

simonis commented 8 years ago

Hi @charliegracie, @mstoodle,

thanks a lot for your detailed answers. I hope the sources for OMR will appear soon and that I'll have some time to look at them. :)

One thing nevertheless seems strange to me. If the OMR components originate from IBM's J9 project, and if J9 is based on OMR and still supposed to consume new versions of it, I don't understand why you don't also release the "glue layer" that is required to run Java on top of OMR. It seems to me that Java would be the natural use case to demonstrate OMR's power and usefulness. Why come up with a new language like Ruby (which is of course interesting in itself) instead of using the one OMR was originally targeted at?

@mstoodle mentioned a "Java compiler". How is that supposed to work? Will it take Java bytecode and JIT it into OMR's intermediate representation? Will OMR contain enough functionality to be used as a virtual machine that can actually execute the Java code (with GC, runtime support, etc.), or is this "Java compiler" you mention more like LLVM, i.e. a pure compiler without runtime support?

mstoodle commented 8 years ago

The "glue" layer really only makes sense in the context of the runtime it's built for. In the case of J9, the J9 glue layer 1) references tons of data structures from J9, which is closed source and not something we're approved to release, and 2) is probably the most complicated example of a glue layer there is.

On top of that, it shouldn't surprise anyone that OMR components distilled from a JVM would work well in a JVM. But the real proof point is distilling them out of Java and then making them operational in a completely different language runtime while maintaining compatibility with that runtime's semantics. I think that is not something most people would naturally expect to "just" work well.

We decided to prove to ourselves that it is not only possible but that it works well by challenging ourselves to incorporate the technology into an existing, mature, and vibrant runtime technology project like CRuby. And we're doing the necessary refactoring work in the same active development branch we used for our own production Java runtime as well as for all our other compiler products; we aren't off on some relatively quiet side branch by ourselves. The fact that we call all of this work a "proof of concept" effort just tells you how seriously we're taking this project and its broader goals.

On your "Java compiler" question: I was a bit unclear in my last comment. When I included "Java" in my list of compilers, I was referring specifically to the Java JIT compiler that's inside the J9 JVM. Java JIT compilation was why we built the Testarossa compiler infrastructure in the first place. That JIT compiler takes Java bytecode as its input and generates native code into a managed code cache. By the way, it's also capable of generating persistent native code (sometimes called Ahead-of-Time, or AOT, compiled code) for Java methods by taking advantage of the IBM shared class cache technology.
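
For anyone unfamiliar with the idea of a code cache, the bare mechanism is just placing native instructions into runtime-managed executable memory and jumping to them. The sketch below shows that mechanism and nothing more; it is not how Testarossa's code cache works, and it assumes Linux on x86-64 (systems with strict W^X policies may refuse a writable-and-executable mapping).

```cpp
// Minimal illustration of "emit native code into managed memory and call it".
// Assumes a POSIX system on x86-64; not representative of a real JIT code cache.
#include <cstdint>
#include <cstring>
#include <iostream>
#include <sys/mman.h>

int main() {
    // x86-64 machine code for: mov eax, 42; ret
    const std::uint8_t code[] = {0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3};

    // The "code cache": an executable region the runtime manages itself.
    void *cache = mmap(nullptr, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (cache == MAP_FAILED) {
        return 1;
    }
    std::memcpy(cache, code, sizeof(code));

    // Jump into the freshly "compiled" code.
    auto fn = reinterpret_cast<int (*)()>(cache);
    std::cout << fn() << "\n"; // prints 42

    munmap(cache, 4096);
    return 0;
}
```

A production JIT's code cache adds allocation and reclamation policies, code patching, and metadata for unwinding and GC on top of this basic mechanism.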

The other languages and projects in the list are IBM products for which Testarossa has been adapted over the course of the last 17 or so years. I was trying to lend support to our belief that the OMR JIT will be beneficial to many language runtimes, because we have this internal data point that the technology is flexible and already fairly broadly used inside IBM. I just can't show any of it to you at the moment. :(