eclipse / omr

Eclipse OMR™ Cross platform components for building reliable, high performance language runtimes
http://www.eclipse.org/omr

C-API for JitBuilder #2049

Open dinfuehr opened 6 years ago

dinfuehr commented 6 years ago

It would be helpful to have an additional C-API for JitBuilder. This would make it much easier to use OMR in languages other than C++. Many languages like Rust or Swift provide good interoperability with C, but can't consume C++-APIs. LLVM for example also provides C-bindings for this reason.

mstoodle commented 6 years ago

Thanks for making the suggestion, @dinfuehr. There has been some discussion in the past to do something with either clang plugins or to use something like Swig to generate alternative language bindings for JitBuilder. I've also done a manual (incomplete) prototype of Java bindings for JitBuilder last year.

There's no technical reason it cannot be done, of course, since the C++ used in JitBuilder could largely be viewed as syntactic sugar on C. I think the more important challenge is how to do it in a way that can be fairly easily carried along as the JitBuilder API evolves forward (and that projects that extend the JitBuilder API can also leverage). Thinking required, suggestions welcome!

mstoodle commented 6 years ago

fwiw @dinfuehr , I'll be covering this topic a bit in my JitBuilder status and directions update webcast tomorrow at 3:30pm EST (see https://github.com/eclipse/omr/issues/2339).

dinfuehr commented 6 years ago

Thanks, might not be able to join but definitely gonna watch the recording.

mstoodle commented 6 years ago

There's a question related to how the C API might look pasted into #2397 but I'll reproduce here as it's the better place to discuss it.

The basic question is around naming JitBuilder services, since C doesn't have namespaces to hide them behind.

I was imagining that every JitBuilder function will be prefixed by a "short name" for the JitBuilder class.

So IlBuilder functions are called IB_Add, IB_Sub, etc. and take an explicit IlBuilder * "receiver" parameter.

BytecodeBuilder functions are called BB_Add, BB_Sub, etc. and take an explicit BytecodeBuilder * "receiver" parameter.

TypeDictionary functions are called TD_DefineStruct, etc. and take an explicit TypeDictionary * "receiver" parameter.
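
A minimal sketch of how that naming scheme could look in C. Everything here is a toy stand-in and an assumption for illustration, not the real JitBuilder API: the builders only count appended operations, and the arithmetic is evaluated eagerly on int32_t rather than producing IlValue nodes.

```c
/* Toy sketch of the prefix-plus-receiver naming scheme. All types and
 * functions are hypothetical stand-ins, not the real JitBuilder API. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the C++ classes: each builder only counts appended operations. */
typedef struct IlBuilder       { int opCount; } IlBuilder;
typedef struct BytecodeBuilder { int opCount; int32_t bcIndex; } BytecodeBuilder;

/* C++ IlBuilder::Add(l, r) becomes IB_Add(receiver, l, r). */
int32_t IB_Add(IlBuilder *b, int32_t l, int32_t r)
{
        assert(b != NULL);
        b->opCount++;
        return l + r;
}

int32_t IB_Sub(IlBuilder *b, int32_t l, int32_t r)
{
        assert(b != NULL);
        b->opCount++;
        return l - r;
}

/* BytecodeBuilder also has Add and Sub. The BB_ prefix keeps the symbols
 * distinct, since C has neither namespaces nor overloading. */
int32_t BB_Add(BytecodeBuilder *b, int32_t l, int32_t r)
{
        assert(b != NULL);
        b->opCount++;
        return l + r;
}

int32_t BB_Sub(BytecodeBuilder *b, int32_t l, int32_t r)
{
        assert(b != NULL);
        b->opCount++;
        return l - r;
}
```

The receiver always comes first, so a call such as IB_Add(&ib, 2, 3) reads much like ib.Add(2, 3) would in C++.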

Thoughts?

mstoodle commented 6 years ago

BTW, the early stages (i.e. there is some very wrong code in here) of a C API generator are implemented in a branch of my OMR fork here: https://github.com/mstoodle/omr/tree/jitbuilder_api. (Look in, e.g., jitbuilder/client/C_Binding.[ch]pp.)

At the point where I stopped working on that one to get the C++ client API generator to a functional state first, I had already transcribed the Simple.cpp code sample to a C version (and had it working at one point, which was kinda cool).

Any feedback on how this looks? Hopefully most of it is self-explanatory. Initialization and cleanup are required on the objects, which introduces Initialize and Destroy kinds of functions, and I tried to simplify the casts for things that, in the C++ world, are subclasses: e.g. IB(x) casts x to an IlBuilder, MB(x) would cast x to a MethodBuilder, etc.
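
One way such cast helpers could work in plain C is struct embedding: the "subclass" struct holds the "superclass" struct as its first member, so a pointer to one can be converted to a pointer to the other. The sketch below assumes that layout. The struct contents, IB_AppendOp, and the two-step Initialize/Destroy bodies are hypothetical and are not taken from the generated binding.

```c
/* Hypothetical sketch of Initialize/Destroy pairs plus IB()/MB() cast helpers.
 * The struct layouts are assumptions, not the real generated C binding. */
#include <assert.h>
#include <stddef.h>

typedef struct IlBuilder {
        int opCount;            /* toy state: number of appended operations */
} IlBuilder;

typedef struct MethodBuilder {
        IlBuilder   base;       /* must stay the first member for the casts below */
        const char *name;
} MethodBuilder;

/* Upcast and downcast helpers. MB(x) is only valid when x really points at
 * the base member of a MethodBuilder. */
#define IB(x) ((IlBuilder *)(x))
#define MB(x) ((MethodBuilder *)(x))

void IB_Initialize(IlBuilder *b) { assert(b != NULL); b->opCount = 0; }
void IB_Destroy(IlBuilder *b)    { assert(b != NULL); b->opCount = 0; }
void IB_AppendOp(IlBuilder *b)   { assert(b != NULL); b->opCount++; }

/* The "subclass" initializer runs the "superclass" initializer first. */
void MB_Initialize(MethodBuilder *m)
{
        assert(m != NULL);
        IB_Initialize(&m->base);
        m->name = NULL;
}

void MB_DefineName(MethodBuilder *m, const char *name)
{
        assert(m != NULL);
        m->name = name;
}

/* Destruction runs in the opposite order, as a C++ destructor chain would. */
void MB_Destroy(MethodBuilder *m)
{
        assert(m != NULL);
        m->name = NULL;
        IB_Destroy(&m->base);
}
```

With this layout, IB_AppendOp(IB(&m)) operates on the IlBuilder part of a MethodBuilder m without the caller spelling out the cast.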

Please ask questions and give feedback. I would much rather generate an API that people will find useful than successfully build "my" notion of a C API that nobody wants to use.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include "jitbuilder.h"

int8_t
myBuildIL(IlBuilder *b)
{
        IlValue *c3 = IB_ConstInt32(b, 3);
        IB_ReturnValue(b, c3);
        return 1;
}

int
main(int argc, char *argv[])
{
        int32_t rc=initializeJit();
        if (rc == 0) {
                printf("Abort: initializeJit() returned %d\n", rc);
                exit(-2);
        }

        TypeDictionary d;
        TD_Initialize(&d);

        MethodBuilder m;
        MB_Initialize(&m, &d, NULL);

        MB_DefineName(&m, "return3");
        MB_DefineReturnType(&m, d.Int32);
        IB_setCallback_buildIL(IB(&m), &myBuildIL);

        void *entryPoint = 0;
        rc = compileMethodBuilder(&m, &entryPoint);

        MB_Destroy(&m);
        TD_Destroy(&d);

        if (rc == 0) {
                typedef int32_t (return3Func)();
                return3Func *return3 = (return3Func *)entryPoint;
                int32_t v = return3();
                printf("return3() returned %d\n", v);
                if (v == 3) {
                        printf("PASS!\n");
                        shutdownJit();
                        exit(0);
                }
        }
        else
                printf("Compile failed\n");

        shutdownJit();
        printf("FAIL\n");
        exit(-1);
}

dibyendumajumdar commented 6 years ago

Hi @mstoodle

Some thoughts below.

I think that it would be beneficial to have all calls somehow linked to a Context - so that there is not an assumption of global state. I realize that currently OMR uses global state but this can be hidden from the client.

Example:

Context = create_context() // Creates global state if called first time, maintains ref count

TypeDictionary = create_type_dictionary(Context)

MethodBuilder = create_method_builder(Context, TypeDictionary)

destroy_context(Context) // Destroys global state if ref count === 0
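
The reference-counting idea in the example above can be sketched in C as follows. This is a toy under stated assumptions: the g_jitInitialized flag merely stands in for the compiler's global state, jit_is_initialized is a hypothetical helper added so the behaviour can be observed, and nothing here is thread-safe.

```c
/* Hypothetical sketch of a reference-counted Context that hides global state.
 * Not thread-safe; a real version would need to guard the counter. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Context {
        int serial;                     /* toy payload: creation order */
} Context;

static int g_refCount       = 0;        /* number of live contexts */
static int g_jitInitialized = 0;        /* stands in for the compiler's global state */
static int g_nextSerial     = 0;

Context *create_context(void)
{
        Context *ctx = malloc(sizeof *ctx);
        if (ctx == NULL)
                return NULL;
        if (g_refCount == 0)
                g_jitInitialized = 1;   /* first context: set up global state */
        g_refCount++;
        ctx->serial = g_nextSerial++;
        return ctx;
}

void destroy_context(Context *ctx)
{
        if (ctx == NULL)
                return;
        free(ctx);
        if (--g_refCount == 0)
                g_jitInitialized = 0;   /* last context gone: tear down global state */
}

/* Helper for observing the hidden state in this sketch. */
int jit_is_initialized(void)
{
        return g_jitInitialized;
}
```

The client never calls an explicit global initialize or shutdown function; the first create_context and the last destroy_context do that work implicitly.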

I think:

a) It is better to have opaque pointers rather than exposing some structure; this will also give you more flexibility in future as you can change implementation details

b) All memory should be managed inside the context

c) The opaque pointers can simply be reinterpret_cast<> of real JitBuilder objects (LLVM uses this approach)
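
Points a) and b) can be sketched in C like this. The JB_ names are hypothetical and chosen only for illustration. In a real library the two typedefs would live in the public header and the struct definitions in a private source file; they share one file here so the sketch stays self-contained. Since this is C, the handle is simply a pointer to an incomplete type rather than a reinterpret_cast of a C++ object.

```c
/* Hypothetical sketch: opaque handles whose memory is owned by the context. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* What a public header would expose: handle types only, no layout. */
typedef struct JB_Context        *JB_ContextRef;
typedef struct JB_TypeDictionary *JB_TypeDictionaryRef;

/* Private definitions, free to change without breaking clients. */
struct JB_TypeDictionary {
        struct JB_TypeDictionary *next;   /* intrusive list owned by the context */
};

struct JB_Context {
        struct JB_TypeDictionary *dictionaries;
        int                       liveObjects;
};

JB_ContextRef JB_CreateContext(void)
{
        return calloc(1, sizeof(struct JB_Context));
}

/* The context records every object it hands out. */
JB_TypeDictionaryRef JB_CreateTypeDictionary(JB_ContextRef ctx)
{
        assert(ctx != NULL);
        struct JB_TypeDictionary *d = calloc(1, sizeof *d);
        if (d == NULL)
                return NULL;
        d->next = ctx->dictionaries;
        ctx->dictionaries = d;
        ctx->liveObjects++;
        return d;
}

int JB_LiveObjectCount(JB_ContextRef ctx)
{
        assert(ctx != NULL);
        return ctx->liveObjects;
}

/* Frees everything the context owns and returns how many objects were freed. */
int JB_DestroyContext(JB_ContextRef ctx)
{
        int freed = 0;
        if (ctx == NULL)
                return 0;
        while (ctx->dictionaries != NULL) {
                struct JB_TypeDictionary *next = ctx->dictionaries->next;
                free(ctx->dictionaries);
                ctx->dictionaries = next;
                freed++;
        }
        free(ctx);
        return freed;
}
```

Because the client never frees a dictionary itself, a single JB_DestroyContext call is enough to release everything created through that context.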

I personally think a C interface should be created first as this will enable easy binding to other languages. I am not sure of the value of a C++ interface - perhaps C++ users should be allowed full access to the OMR classes.

Regards

mstoodle commented 6 years ago

Thanks for the feedback, @dibyendumajumdar!

For the most part, that Context you're introducing would only be used when creating a TypeDictionary and possibly a MethodBuilder, right? (Technically, the TypeDictionary could provide the context to create_method_builder, as you called it.) Everything else could derive the Context from either the TypeDictionary or the "overarching" MethodBuilder?
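
A minimal sketch of that derivation, assuming each object simply keeps a pointer to the thing it was created from. The struct fields and the TD_GetContext and MB_GetContext accessors are hypothetical names used only for this illustration.

```c
/* Hypothetical sketch: deriving the Context from objects created within it. */
#include <assert.h>
#include <stddef.h>

typedef struct Context        { int id; } Context;
typedef struct TypeDictionary { Context *ctx; } TypeDictionary;
typedef struct MethodBuilder  { TypeDictionary *types; } MethodBuilder;

/* Only the TypeDictionary is handed the Context explicitly. */
void TD_Initialize(TypeDictionary *d, Context *ctx)
{
        assert(d != NULL && ctx != NULL);
        d->ctx = ctx;
}

/* The MethodBuilder needs only the TypeDictionary. */
void MB_Initialize(MethodBuilder *m, TypeDictionary *d)
{
        assert(m != NULL && d != NULL);
        m->types = d;
}

Context *TD_GetContext(const TypeDictionary *d)
{
        assert(d != NULL);
        return d->ctx;
}

/* Everything below the TypeDictionary reaches the Context through it. */
Context *MB_GetContext(const MethodBuilder *m)
{
        assert(m != NULL);
        return TD_GetContext(m->types);
}
```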

The use of a struct was on purpose to allow things like the types (Int8, Int32, etc.) to be readily accessible to the thing you're likely to be already passing around: a builder object. I was trying not to change the "character" of the API across languages, though obviously compromises will need to be made on that point.
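
A minimal sketch of what a visible struct buys the client, assuming the dictionary exposes its primitive types as plain fields. The IlType layout, the two-argument MB_Initialize, and the returnType field are assumptions made for this illustration and may not match the generated binding.

```c
/* Hypothetical sketch: a visible TypeDictionary struct exposing primitive types. */
#include <assert.h>
#include <stddef.h>

typedef struct IlType {
        const char *name;
        size_t      size;               /* size in bytes */
} IlType;

typedef struct TypeDictionary {
        const IlType *Int8;             /* reachable as d.Int8 */
        const IlType *Int32;            /* reachable as d.Int32 */
} TypeDictionary;

static const IlType kInt8  = { "Int8",  1 };
static const IlType kInt32 = { "Int32", 4 };

void TD_Initialize(TypeDictionary *d)
{
        assert(d != NULL);
        d->Int8  = &kInt8;
        d->Int32 = &kInt32;
}

typedef struct MethodBuilder {
        TypeDictionary *types;          /* types stay reachable from the builder */
        const IlType   *returnType;
} MethodBuilder;

void MB_Initialize(MethodBuilder *m, TypeDictionary *d)
{
        assert(m != NULL && d != NULL);
        m->types = d;
        m->returnType = NULL;
}

void MB_DefineReturnType(MethodBuilder *m, const IlType *type)
{
        assert(m != NULL && type != NULL);
        m->returnType = type;
}
```

With fields like these, MB_DefineReturnType(&m, d.Int32) needs no accessor call. The cost is that the struct layout becomes part of the ABI, which is the trade-off against fully opaque pointers.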

The compiler does have some global state, as we've been discussing in various places, but actually a lot of the compiler's state, at least for the OMR compiler, is managed per-thread via the compilation object, which is stored in thread-local storage. Unfortunately, JitBuilder does not itself implement multiple compilation threads (we didn't bring this part into OMR from OpenJ9).
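
The per-thread pattern can be sketched with C11 thread-local storage. This is only an illustration of the idea: the Compilation struct and the comp_ functions are hypothetical and are not the OMR compiler's actual types or API.

```c
/* Hypothetical sketch: a per-thread "current compilation" kept in C11
 * thread-local storage. Not the OMR compiler's real data structures. */
#include <assert.h>
#include <stddef.h>

typedef struct Compilation {
        const char *methodName;
        int         nodesCreated;       /* toy per-compilation state */
} Compilation;

/* Each thread sees its own copy of this pointer. */
static _Thread_local Compilation *tlsCompilation = NULL;

void comp_begin(Compilation *comp, const char *methodName)
{
        assert(comp != NULL);
        assert(tlsCompilation == NULL); /* one active compilation per thread */
        comp->methodName = methodName;
        comp->nodesCreated = 0;
        tlsCompilation = comp;
}

Compilation *comp_current(void)
{
        return tlsCompilation;
}

/* Helpers deep in the compiler find their state without an extra parameter. */
int comp_create_node(void)
{
        assert(tlsCompilation != NULL);
        return ++tlsCompilation->nodesCreated;
}

void comp_end(void)
{
        tlsCompilation = NULL;
}
```

Because the pointer is thread-local, two threads that each call comp_begin would work on separate Compilation objects without locking, which is why this kind of state is less of an obstacle than true process-wide globals.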

I believe the SOM++ OMR port that @charliegracie wrote is the only project that uses JitBuilder as an asynchronous compiler. Not sure if that implementation can also initiate multiple compilation threads.

dibyendumajumdar commented 6 years ago

Hi @mstoodle ,

I am working on a C api in my fork of OMR. It is just at the starting stage so not much to see but this is an example of how I see the C api working.

#include "nj_api.h"

#include <stdio.h>

static bool build_il(JIT_ILInjectorRef ilinjector, void *userdata) {
  JIT_CreateBlocks(ilinjector, 1);
  JIT_SetCurrentBlock(ilinjector, 0);
  auto iconst = JIT_ConstInt32(42);
  auto node = JIT_CreateNode1C(OP_ireturn, iconst);
  JIT_GenerateTreeTop(ilinjector, node);
  JIT_CFGAddEdge(ilinjector, JIT_BlockAsCFGNode(JIT_GetCurrentBlock(ilinjector)), JIT_GetCFGEnd(ilinjector));
  return true;
}

int main(int argc, const char *argv[]) {

  JIT_ContextRef ctx = JIT_CreateContext();
  if (ctx) {
    JIT_FunctionBuilderRef function_builder = JIT_CreateFunctionBuilder(
        ctx, "ret1", JIT_Int32, 0, NULL, build_il, NULL);

    typedef int (*F)(void);
    F f = (F)JIT_Compile(function_builder);
    if (f)
      printf("Function call returned %d\n", f());

    JIT_DestroyFunctionBuilder(function_builder);
  }
  JIT_DestroyContext(ctx);
  return 0;
}

The api I am creating is at a lower level than JitBuilder simply because I want to understand the core compiler structures. But what I wanted to share was more the style of the api: