eclipse / omr

Eclipse OMR™ Cross platform components for building reliable, high performance language runtimes
http://www.eclipse.org/omr
Other
940 stars 394 forks source link

PROPOSAL: Provide record and replay capability for JitBuilder API #1818

Open mstoodle opened 6 years ago

mstoodle commented 6 years ago

The JitBuilder API is currently leveraged primarily during compilation (specifically during IL generation) to create the input to the JIT compiler. Right now, different language compilers (for Smalltalk, Lua, Rosie, Swift, etc.) exist and make calls to the JitBuilder API to direct compilation. But the only way to test those compilers is in their own specific environment. That makes it relatively difficult to reflect those compiler needs back into the Eclipse OMR project.

If we could record the JitBuilder API calls and replay them to an independent JitBuilder library, we could at least recreate the compilation effort from other language compilers directly in OMR and bring tests into OMR that protect those other compilers as the JitBuilder API and implementation continue to evolve. In principle, it should be possible to execute that code as well, but without mock interfaces to the environments in which that code is expected to run, I do not expect it to be easy to validate the generated code using this technique.

Nonetheless, I think having the ability to store JitBuilder IL (JBIL) to a file would be very useful for reflecting our consumers use of JitBuilder at OMR. There are some other interesting facilities we could build on top of record-and-replay, but I think the ability to test should be valuable enough.

As I open other issues to move this work forward, I will add comments here to treat this as an umbrella item. I confess, I have already been moving down this path with an implementation, just to see how it might pan out, but I am still interested in any feedback including support or negative responses.

fyi @charliegracie @jduimovich @Leonardo2718

mstoodle commented 6 years ago

This issue was mistakenly closed when the PR to remove the pre-existing replay support was merged.

Leonardo2718 commented 5 years ago

Pull requests #2915 and #2968 added a generator for the C++ JitBuilder client API with the intention of allowing language bindings to be generated from a high-level description of the JitBuilder API.

Pull requests #3019 and #3056 (unmerged at time of writing) add Record and Replay functionality to JitBuilder based on what is discussed at a high level in this issue.

In an attempt to minimize the amount of work required to maintain the Recorder mechanism when the JitBuilder API changes, @mstoodle and I have been discussing the possibility of generating parts of the Recorder mechanism in a similar way as the client API.

Given that the recording aspect of the mechanism can simply be a call to a recording function whenever a service is called on an IlBuilder object, the Observer (Listener) design pattern is a good option for recording service calls on instances of the IlBuilder hierarchy. Essentially, every IlBuilder class can provide a “Listener” interface that is realized by a corresponding “Recorder” class. An instance of a Recorder class can then be registered with a corresponding IlBuilder instance as a listener. Every service in the IlBuilder classes should then notify every registered listener of a method call by invoking a corresponding method on all listener objects, of which one or more may be instances of a Recorder.

Since the Listener interfaces would closely match the JitBuilder client API itself, they can be easily generated from the JitBuilder API description. The Recorder class definitions can also be generated and the generated code can take advantage of the JitBuilderRecorder infrastructure implemented in #3019. The client API generated can also be updated to generate calls to any registered listeners for every service that requires it.

I have created a very rough prototype of how this might work here: https://github.com/Leonardo2718/omr/commits/ludwig [1].

The prototype extends the C++ API generator to also generate a Listener interface for IlBuilder, MethodBuilder, and BytecodeBuilder. Each listener interface class has a method for every service in the corresponding IlBuilder class. The listener interfaces for MethodBuilder and BytecodeBuilder extend the listener interface for IlBuilder. All listener interfaces also provide the three virtual functions cloneInto(IlBuilder * b), cloneInto(MethodBuilder * b), and cloneInto(BytecodeBuilder * b) that must be overridden by concrete listeners (clone is a bad name). These methods are used to construct new listener objects when a new object is constructed from an existing one. For example, constructing a BytecodeBuilder instance by calling MethodBuilder::OrphanBytecodeBuilder() would cause cloneInto(BytecodeBuilder * b) to be called on every listener in the the method builder object, passing the newly created BytecodeBuilder as argument. Finally the listener interface classes also have a muteHint() method that can be used to signal to a listener that it may wish to ignore subsequent notifications until the correspondingunmuteHint() is invoked. Concreted listeners can check the “mute" state by calling the provided method isMuteHintSet(), though they are not required to obey the hint.

The prototype also generates three Recorder classes that extended the listeners corresponding to IlBuilder, MethodBuilder, and BytecodeBuilder. These classes just override all the listener's methods to print the current method's name to stdout using C++'s std::cout. The Recorder classes also provide correct implementations for all the cloneInto() overloads.

Finally the prototype modifies the C++ API generator to generate IlBuilder, MethodBuilder, and BytecodeBuilder classes that have a service for registering listeners. The API generated also now notifies registered listeners of every API service call. The listener methods are called just after the corresponding implementation-side function is called, so that the client-side arguments can be given to the listener method call as well as any generated/returned value.

Currently, the Replay aspect of the mechanism is not handled, though the work #3056 should be mostly applicable.

From my discussion with @mstoodle, further work on the Record and Replay functionality should be based a similar approach as the prototype.


[1] DISCLAIMER: THIS PROTOTYPE IS A COLLECTION OF HACKS. IT SHOULD NOT BE USED IN ANY PRODUCTION ENVIRONMENT. IT DOES NOT REFLECT THE AUTHOR'S VIEW OF PROPER SOFTWARE DESIGN, ARCHITECTURE, OR DEVELOPMENT PRACTICES. IT IS A TECHNICAL PROTOTYPE AND SHOULD ONLY BE USED FOR ITS INTENDED PURPOSE OF DEMONSTRATING A PARTICULAR WAY OF SOLVING A TECHNICAL PROBLEM. THE PROTOTYPE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY ARISING FROM, OUT OF, OR IN CONNECTION WITH THE PROTOTYPE OR THE USE OR OTHER DEALINGS IN THE PROTOTYPE.

:wink:

charliegracie commented 5 years ago

If you look into the PRs for record / replay you will notice that the record is not just a simple call. For complicated APIs like ForLoop, DoWhile, and many others the code is very complicated since you have to turn the recorder off and back on at different points throughout the API. Is this the point of the muteHint? If so I am not sure how to use it and I do not think it should be able to be ignored. If it is ignored you will generate IL which can not be parsed.

Leonardo2718 commented 5 years ago

I guess I didn't explain that very well

Yes muteHint() is supposed to tell a listener to ignore subsequent messages. This could be done by wrapping the implementation of every method in a concrete listener with a check that isMuteHintSet() returns false.

However, in the prototype, this is not needed even for complex services like ForLoop(). The listeners are only notified on the client API side. Because the client side forwards calls to the implementation side and doesn't actually call any other client-side services (that's all handled on the implementation side), listeners are not notified of internal events. So, with a concrete recorder only the call to ForLoop() would be signalled and none of the internal calls.

muteHint() is provided as a general mechanism usable outside of recorders. The reason I say that it may be ignored is that some other concrete listener (that's not a recorder) could choose to actually do something with those internal calls.

charliegracie commented 5 years ago

Hmm ok I am still not sure I understand how ForLoop as an example will be implemented. I feel like another one of the APIs was even more complicated and I am not sure the generator is going to handle it....

mstoodle commented 5 years ago

My understanding is that, with the client API, e.g. ForLoopUp() is passing the clients' arguments into the OMR::IlBuilder::ForLoopUp() "implementation" method. Because the (proposed) recording listener is on the client side, all it has to do is to record that the client called ForLoopUp with those arguments (and a ReplayMethodBuilder would simply re-invoke the client side ForLoopUp along with the "reconstituted" arguments.

The implementation (mostly) doesn't need to know that the record is there because it doesn't call back "out" to the client API.

@Leonardo2718 the whole "callback" mechanism you designed does represent a path where the implementation does call back out to the client API, so there is probably still some complexity to work through here. Or, we can make a choice to simply close the door on user defined subclasses and dodge that as well as the existing complexity of that particular callback scheme. I admit, the simplification does offer some attraction....

charliegracie commented 5 years ago

One of the APIs maybe ForLoop or Switch or something similar actually needs to pass some of the new builders created during the execution of the API along to the recorder and not just the values passed in due to the fact the APIs allow the consumer to pass in the & and read the result afterwards.

mstoodle commented 5 years ago

that's a good point @charliegracie....the recorder needs to be able to listen "on the return" path not "on the call" path. It needs to do that for any generated IlValues as well.

I think the current createBuilderIfNecessary() logic in the implementation should actually move to the client API rather than being done by the implementation. That 1) simplifies the implementation and 2) simplifies the listener logic (which, as a consequence, simplifies the recorder logic).

Leonardo2718 commented 5 years ago

the recorder needs to be able to listen "on the return" path not "on the call" path. It needs to do that for any generated IlValues as well.

That should easy enough to do (I think). All that's needed is to generate code that notifies listeners after the ARG_RETURN and ARRAY_ARG_RETURN macros are invoked.

Regarding the "callback" mechanism, my thinking is that recording should be enabled for their implementation. Since callbacks are, by definition, user implemented the recording mechanism needs to be made aware of what the callback is doing. Otherwise, replay functionality will not be able to correctly reproduce what a previously recorded compilation has done.