Open simonis opened 8 years ago
Edit: Actually, I'll defer to @charliegracie about this. Sorry for the confusion.
Thanks a lot for the fast answer Stefan!
Hi,
Thanks, Charlie. For all components, including the JIT compiler, we're planning to bring anything to OMR that could be useful in a language independent way.
You asked about platforms. The platforms in Charlie's list are simply the ones the IBM Runtimes team has traditionally focused on, and so those are the ones where we have an implementation to contribute. So we'll contribute those ones, but not all at once. Other platforms will be welcome in the OMR project.
Charlie's already covered the GC stuff.
On the JIT front, the tech we'll bring to OMR is what's at the heart of all IBM Testarossa-based compilers, which include the Ruby JIT you can try in the Ruby+OMR Technology Preview, Java, COBOL (a compiler and a reoptimizer), C/C++, a trace compiler that accelerates a binary emulator for IBM mainframe (Z) systems, and a few other things. In the fullness of time, we will be contributing:
There's also some other stuff that's useful for VMs that's not really "JIT" or "GC", like: diagnostic support for working with runtimes and languages, integrated support for things like RDMA, GPU, FPGA, etc., our cooperative suspend model for coordinating activities within the VM, asynchronous event handling, and more.
That's a lot of stuff. It will not all show up in the first code drop, I'm afraid. My "fullness of time" comment above is simply a reference to how much work it is to refactor and clean everything up for contribution to the OMR project. We're talking about hundreds of thousands of lines of code.
We're working on the initial code drop now. From there, we'll be able to talk about our roadmap to get more things into the open so that people can see what's coming from IBM. IBM will continue to use and work on the code in the OMR project, as we hope others will come to do. We don't intend to fork the project for internal use only, except possibly when we are stabilizing for one of our own releases, with any fixes also going back to the community code base.
You will have noticed there are some things that would really help people to build runtimes that aren't in my list above, such as infrastructure to write byte code and AST based interpreters, to translate source code into byte code or ASTs, best practices around designing byte code sets or ASTs, and lots of other things. Other people have worked on this kind of thing, and we probably won't have much code to contribute in these areas. But the grand purpose of OMR is not just for IBM to contribute its code for others to use. I would love for the OMR project to become a place where people who work on any kind of runtime can come to collaborate and share not just code but also their knowledge and wisdom around building language runtimes.
The more runtimes that get involved in this effort, the better, as that is what will make this technology relevant, more reliable, and as capable as it can be for everyone. From my perspective, all runtimes are welcome, all platform implementers are welcome, all tool implementers are welcome, people who are just "interested" are welcome, and we hope everyone will get involved. The goal is to create the language runtime developer's toolkit. And just like you don't always need a screwdriver or a hammer for the job you have to do today, it's going to have stuff in it that not everybody needs. But hopefully, everybody will eventually be able to find something they can use to build even more awesome runtimes than we have today!
Ok, I'm down from my soapbox now for a project that isn't yet in the open :) . I hope that helps fill more gaps about our intentions and goals around the OMR technology. We'll get back to work on the initial drop now so we can make it a reality soon.
Hi @charliegracie, @mstoodle,
thanks a lot for your detailed answers. I hope the sources for OMR will appear soon and I'll have some time to look at them:)
One thing seems strange to me nevertheless. If the OMR components originate from IBM's J9 project, and if J9 is based on OMR and still supposed to consume new versions of it, I don't understand why you don't also release the "glue layer" which is required to run Java on top of OMR. It seems to me that Java would be the natural use case to demonstrate OMR's power and usefulness. Why come up with a new language like Ruby (which of course is interesting by itself) instead of using the one OMR was originally targeted for?
@mstoodle mentioned a "Java compiler". How is that supposed to work? Will it take Java bytecode and JIT it into OMR's intermediate representation? Will OMR contain enough functionality to use it as a Virtual Machine which can actually execute the Java code (with GC, runtime support, etc.), or is this "Java compiler" you mention more like LLVM, which is a pure compiler without runtime support?
The "glue" layer really only makes sense in the context of the runtime it's built for. In the case of J9, the J9 glue layer 1) references tons of data structures from J9, which is closed source and not something we're approved to release, and 2) is probably the most complicated example of a glue layer there is.
On top of that, it shouldn't surprise anyone that OMR components distilled from a JVM would work well in a JVM. But the real proof point is distilling them out of Java and then making them operational in a completely different language runtime while maintaining compatibility with that runtime's semantics. I think that is not something most people would naturally expect to "just" work well.
We decided to prove to ourselves that it is not only possible but that it works well by challenging ourselves to incorporate the technology into an existing, mature, and vibrant runtime technology project like CRuby. And we're doing the necessary refactoring work in the same active development branch we used for our own production Java runtime as well as for all our other compiler products; we aren't off on some relatively quiet side branch by ourselves. The fact that we call all of this work a "proof of concept" effort just tells you how seriously we're taking this project and its broader goals.
On your "Java compiler" question: I was a bit unclear in my last comment. When I included "Java" in my list of compilers, I was referring specifically to the Java JIT compiler that's inside the J9 JVM. Java JIT compilation was why we built the Testarossa compiler infrastructure in the first place. That JIT compiler takes Java bytecode as its input and generates native code into a managed code cache. By the way, it's also capable of generating persistent native code (sometimes called Ahead of Time or AOT compiled code) for Java methods by taking advantage of the IBM shared class cache technology.
The other languages and projects in the list are IBM products for which Testarossa has been adapted over the course of the last 17 or so years. I was trying to give support for our belief that the OMR JIT will be beneficial to many language runtimes, because we have this internal data point that the technology is flexible and fairly broadly used inside IBM already. I just can't show any of it to you at the moment. :( .
Hi,
this all sounds very exciting and interesting! I hope you'll open source the complete OMR soon. Until then I have several questions:
Thanks, Volker