dibyendumajumdar / dmr_c

dmr_C is a C parser and JIT compiler with LLVM, Eclipse OMR and NanoJIT backends

dmr_C

The aim of the dmr_C project is to create a JIT compiler for C. dmr_C is based on the Linux Sparse library originally written by Linus Torvalds.

The name dmr_C is a homage to Dennis M Ritchie.

Overview

dmr_C is a fork of Sparse. The main changes are:

Current status

News

Build instructions

The build is a fairly standard CMake build. There are no external dependencies other than the chosen JIT backend. To build without a JIT backend, run:

mkdir build
cd build
cmake ..

This will generate appropriate build files that can then be used to build the project.

LLVM Backend

To build with LLVM support, additional arguments are needed. The following instructions are for LLVM 3.9 on Windows 10.

mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=/path/to/install -DLLVM_JIT=ON -DLLVM_DIR=$LLVM_INSTALL_DIR\lib\cmake\llvm -G "Visual Studio 15 2017 Win64" ..

Here $LLVM_INSTALL_DIR refers to the path where LLVM is installed.

Generation of build scripts follows a similar process on Linux and Mac OSX platforms. Note that on Ubuntu the standard LLVM package has broken CMake files, so the recommended approach is to download and build LLVM yourself before attempting to build dmr_C.

The following steps are how I build on Linux:

mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=/path/to/install -DLLVM_JIT=ON -DLLVM_DIR=$HOME/Software/llvm501/lib/cmake/llvm -G "Unix Makefiles" ..

In my setup LLVM is installed at $HOME/Software/llvm501.

Once the build files are generated you can use the normal build tools i.e. Visual Studio on Windows and make on UNIX or Mac OSX platforms.

Assuming you specified the CMAKE_INSTALL_PREFIX you can install the header files and the library using your build script. For example, on Linux, just do:

make install

OMRJIT Backend

On Windows with Visual Studio 2017:

mkdir build
cd build
cmake -DOMR_JIT=ON -G "Visual Studio 15 2017 Win64" ..

On Linux:

mkdir build
cd build
cmake -DOMR_JIT=ON ..

Assuming you specified the CMAKE_INSTALL_PREFIX you can install the header files and the library using your build script. On Linux just do:

make install

NanoJIT Backend

On Windows with Visual Studio 2017:

mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=/path/to/install -DNANO_JIT=ON -G "Visual Studio 15 2017 Win64" ..

On Linux:

mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=$HOME/ravi -DNANO_JIT=ON -G "Unix Makefiles" ..

Assuming you specified the CMAKE_INSTALL_PREFIX you can install the header files and the library using your build script. On Linux just do:

make install

Using dmr_C as a JIT

dmr_C has three alternative backend JIT engines: LLVM, OMRJIT and NanoJIT. The LLVM backend is better tested and evolved from the sparse-llvm tool that comes with Sparse. The NanoJIT and OMRJIT backends are entirely new, have had less testing, and support fewer features.

Using the LLVM backend

To use dmr_C as an LLVM-based JIT you only need to invoke the following function, declared in the header file dmr_c.h:

extern bool dmrC_llvmcompile(int argc, char **argv, LLVMModuleRef module,
                 const char *inputbuffer);

The call accepts the arguments passed to a main() function, an LLVMModuleRef, and an optional buffer to be compiled. It will preprocess files if needed, and compile each of the source files given in the argument list. It will finally compile the supplied input buffer. The results of the compilation will be in the supplied LLVMModuleRef for the calling program to use as it wishes.

Note that the LLVM backend only generates LLVM IR - compiling the IR to machine code is the client's responsibility.

A very simple use is below:

#include <dmr_c.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rc = 1;

    LLVMContextRef context = LLVMGetGlobalContext();
    LLVMModuleRef module =
        LLVMModuleCreateWithNameInContext("dmrC", context);
    if (module) {
        if (dmrC_llvmcompile(argc, argv, module, NULL))
            rc = 0;
        LLVMDisposeModule(module);
    }
    return rc;
}

This is basically what the sparse-llvm command (see below) does.
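
Because the LLVM backend stops at IR, a client that wants to run the compiled code has to bring its own execution engine. The sketch below shows one way this might look using MCJIT through the LLVM-C API. It is an illustration, not part of dmr_C: the macro DMRC_EXAMPLE_WITH_LLVM and the helpers call_entry, sample_entry and compile_and_run are hypothetical names, and passing only a program name in argv together with an in-memory source buffer is an assumption based on the description of dmrC_llvmcompile above. The portable part (call_entry) can be tried without LLVM; the rest is compiled only when the macro is defined.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef int (*entry_fn)(void);

/* Calls a compiled `int f(void)` through its raw address.
 * Returns 0 and stores the function's result, or -1 if the address is 0. */
static int call_entry(uint64_t address, int *result)
{
    assert(result != NULL);
    if (address == 0)
        return -1;
    entry_fn fn = (entry_fn)(uintptr_t)address;
    *result = fn();
    return 0;
}

/* Stand-in for JIT-compiled code, so call_entry can be tried without LLVM. */
static int sample_entry(void) { return 42; }

#ifdef DMRC_EXAMPLE_WITH_LLVM /* hypothetical macro, not defined by dmr_C */
#include <dmr_c.h>
#include <llvm-c/Core.h>
#include <llvm-c/ExecutionEngine.h>
#include <llvm-c/Target.h>

/* Compiles `source` with dmr_C, JIT-compiles the resulting module with MCJIT,
 * and calls the function `name`. Returns 0 on success, -1 on any failure. */
static int compile_and_run(const char *source, const char *name, int *result)
{
    char *args[] = { "example", NULL }; /* assumed: program name only, no files */
    char *error = NULL;
    LLVMExecutionEngineRef engine = NULL;

    LLVMLinkInMCJIT();
    LLVMInitializeNativeTarget();
    LLVMInitializeNativeAsmPrinter();

    LLVMModuleRef module =
        LLVMModuleCreateWithNameInContext("dmrC", LLVMGetGlobalContext());
    if (!module)
        return -1;
    if (!dmrC_llvmcompile(1, args, module, source)) {
        LLVMDisposeModule(module);
        return -1;
    }
    /* The module is handed over to the engine builder here, so it is not
     * disposed separately afterwards. */
    if (LLVMCreateExecutionEngineForModule(&engine, module, &error)) {
        fprintf(stderr, "cannot create JIT: %s\n", error ? error : "?");
        LLVMDisposeMessage(error);
        return -1;
    }
    int rc = call_entry(LLVMGetFunctionAddress(engine, name), result);
    LLVMDisposeExecutionEngine(engine); /* also releases the module */
    return rc;
}
#endif
```

With the macro defined and the program linked against dmr_C and LLVM, a call such as compile_and_run("int answer(void) { return 42; }", "answer", &value) would be the intended use; whether dmr_C accepts that exact argument vector has not been verified here.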

Limitations of LLVM backend

The code generator targeting LLVM has some limitations and is unable to correctly handle the following scenarios:

Using the OMRJIT backend

This is very similar to how the LLVM backend is used. There is again a single API call to compile C code; this is declared in header file dmr_c.h. Additional steps are needed to execute the compiled code.

bool dmrC_omrcompile(int argc, char **argv, JIT_ContextRef module,
              const char *inputbuffer);

Below is what the sparse-omrjit tool does:

#include <dmr_c.h>

#include <stdbool.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    JIT_ContextRef module = JIT_CreateContext();

    int rc = 0;
    if (!dmrC_omrcompile(argc, argv, module, NULL))
        rc = 1;

    int (*fp)(void) = NULL;
    if (rc == 0) {
        /* To help with testing check if the source defined a function
         * named TestNano() and if so, execute it
         */
        fp = JIT_GetFunction(module, "TestNano");
        if (fp) {
            int fprc = fp();
            if (fprc != 0) {
                printf("TestNano Failed (%d)\n", fprc);
                rc = 1;
            } else {
                printf("TestNano OK\n");
            }
        }
    }
    JIT_DestroyContext(module);

    return rc;
}

In the example above, we check whether the compiled code contains a function named 'TestNano'. If it does, we invoke it.

Note that you can supply the optimization option -O<n> to the compiler. A setting of 0 disables optimization, 1 enables some optimizations, and 2 is the highest level with additional optimizations enabled. Some programs may fail to compile with level 2 at present.
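
Since the compile call takes main()-style arguments, the -O<n> option can be placed in the argument vector by the calling program. The following is a minimal sketch of that idea, assuming the option is accepted alongside the file names as it would be on a command line. The helper names format_opt_flag and compile_at_level and the macro DMRC_EXAMPLE_WITH_OMR are hypothetical and not part of dmr_C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Writes "-O<level>" into buf for levels 0..2, the range described above.
 * Returns 0 on success, -1 if the level is out of range or buf is too small. */
static int format_opt_flag(int level, char *buf, size_t size)
{
    assert(buf != NULL);
    if (level < 0 || level > 2)
        return -1;
    int n = snprintf(buf, size, "-O%d", level);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}

#ifdef DMRC_EXAMPLE_WITH_OMR /* hypothetical macro, not defined by dmr_C */
#include <stdbool.h>
#include <dmr_c.h>

/* Compiles one source file at the requested optimization level. */
static bool compile_at_level(JIT_ContextRef context, char *file, int level)
{
    char opt[8];
    if (format_opt_flag(level, opt, sizeof opt) != 0)
        return false;
    /* assumed layout: program name, option, then the file to compile */
    char *args[] = { "example", opt, file, NULL };
    return dmrC_omrcompile(3, args, context, NULL);
}
#endif
```

Given that some programs may fail at level 2, a caller could retry compile_at_level with a lower level when the first attempt returns false.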

Limitations of OMR JIT backend

Using the NanoJIT backend

This is very similar to how the LLVM backend is used. There is again a single API call to compile C code; this is declared in header file dmr_c.h. Additional steps are needed to execute the compiled code.

bool dmrC_nanocompile(int argc, char **argv, NJXContextRef module,
              const char *inputbuffer);

Below is what the sparse-nanojit tool does:

#include <dmr_c.h>

#include <stdbool.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    NJXContextRef module = NJX_create_context(true);

    int rc = 0;
    if (!dmrC_nanocompile(argc, argv, module, NULL))
        rc = 1;

    int (*fp)(void) = NULL;
    if (rc == 0) {
        /* To help with testing check if the source defined a function
         * named TestNano() and if so, execute it
         */
        fp = NJX_get_function_by_name(module, "TestNano");
        if (fp) {
            int fprc = fp();
            if (fprc != 0) {
                printf("TestNano Failed (%d)\n", fprc);
                rc = 1;
            } else {
                printf("TestNano OK\n");
            }
        }
    }
    NJX_destroy_context(module);

    return rc;
}

In the example above, we check whether the compiled code contains a function named 'TestNano'. If it does, we invoke it.
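
The same lookup-and-invoke pattern extends naturally to more than one function. The sketch below runs a list of named `int f(void)` functions through a lookup callback and counts failures, treating a non-zero return as a failure in the same way as the TestNano check. All names in it (run_named_tests, lookup_fn, the sample functions, and the macro DMRC_EXAMPLE_WITH_NANOJIT) are hypothetical, and the NanoJIT adapter assumes that NJXContextRef is a pointer type and that the result of NJX_get_function_by_name can be cast to a function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef int (*test_fn)(void);

/* Looks up a compiled function by name; returns NULL if it does not exist. */
typedef test_fn (*lookup_fn)(void *context, const char *name);

/* Runs every named function that exists and returns the number of failures.
 * A non-zero return value counts as a failure; missing names are skipped. */
static int run_named_tests(lookup_fn lookup, void *context,
                           const char *const *names, size_t count)
{
    int failures = 0;
    assert(lookup != NULL && names != NULL);
    for (size_t i = 0; i < count; i++) {
        test_fn fp = lookup(context, names[i]);
        if (!fp)
            continue;
        int rc = fp();
        printf("%s %s (%d)\n", names[i], rc == 0 ? "OK" : "Failed", rc);
        if (rc != 0)
            failures++;
    }
    return failures;
}

/* Stand-ins for JIT-compiled functions, to try the runner without a JIT. */
static int sample_pass(void) { return 0; }
static int sample_fail(void) { return 7; }

static test_fn sample_lookup(void *context, const char *name)
{
    (void)context;
    if (strcmp(name, "TestPass") == 0)
        return sample_pass;
    if (strcmp(name, "TestFail") == 0)
        return sample_fail;
    return NULL;
}

#ifdef DMRC_EXAMPLE_WITH_NANOJIT /* hypothetical macro, not defined by dmr_C */
#include <dmr_c.h>

/* Adapter from the NanoJIT lookup call to the lookup_fn signature. */
static test_fn nanojit_lookup(void *context, const char *name)
{
    return (test_fn)NJX_get_function_by_name((NJXContextRef)context, name);
}
#endif
```

With NanoJIT available, the context returned by NJX_create_context would be passed as the context argument together with nanojit_lookup after a successful dmrC_nanocompile call.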

Limitations of NanoJIT backend

Using dmr_C command line tools

The following command line tools are built:

sparse-llvm

The sparse-llvm tool takes a C source file and generates an LLVM module, which it writes in LLVM bitcode format. Note that this command line tool is only built when the LLVM backend is enabled.

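To compile the bitcode to assembly with llc and link it with gcc:

sparse-llvm test.c -o test.bc
llc test.bc
gcc -o test test.s

To run the bitcode directly with lli:

sparse-llvm test.c -o test.bc
lli test.bc

To disassemble the bitcode into textual LLVM IR with llvm-dis:

sparse-llvm test.c -o test.bc
llvm-dis test.bc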

linearize

The linearize tool outputs the Sparse IR generated by the parser and compiler front-end.

showsymbols

The showsymbols tool shows the global symbols found in a source file. Output is generated in XML format.

showparsetree

This tool dumps the parse tree as built by the C parser - note that this tool is experimental and the output format is evolving.

sparse

The sparse tool checks C code and outputs warnings or error messages for certain conditions. For details, see the Linux Sparse man page.

Bugs

Many bugs have been fixed in the code generators and the tool is able to compile and run real programs. However, some bugs remain, and the generated code is sometimes incorrect. See the tests/bugs folder for examples of programs that fail to compile successfully. If you hit a problem, please submit a bug report with a minimal example of a program that fails.

Note that the NanoJIT backend is in an early phase of development and has had much less testing.

Using dmr_C as a library

dmr_C is also a library that can be linked into and used by application programs. As a library it is made up of several components:

The JIT backends

Links