LLVM-Backend Generated Interpreter C API Documentation

sskeirik commented 1 year ago

The LLVM backend, in conjunction with the tool llvm-kompile, can generate object files which contain the entire definition of an K semantics as machine object code.

These object files can then be used to link the LLVM-generated interpreters into custom C code, or any language that supports C FFIs.

However, currently, the interface that the LLVM-backend generated interpreters expose is a mixture of hard-coded and compile-time generated symbols, and there is no documentation of which sybmols are exported or best practices around, e.g., which symbols are considered part of a stable API and which symbols are considered internal and should not be explicitly linked against.

Such documentation would not only enable K developers to more easily productize K by adapting to specific client needs, it would also enable more possible uses of K in the broader community.

For those reasons, I think it is worth documenting these symbols (even if the documentation is marked experimental, as needed).

Baltoli commented 1 year ago

https://github.com/runtimeverification/llvm-backend/blob/master/bindings/c/include/kllvm-c/kllvm-c.h

There is a limited C API that we support (currently consumed by the new Haskell backend to implement concrete simplification). There are more features that one can get access to in these object files (i.e. actual concrete execution), but they're not tested or exposed via this header.

I agree that it needs documentation; there is an ongoing effort to improve the documentation of the backend in general: #621. We are also planning a blog post demonstrating how to use this part of K.

sskeirik commented 1 year ago

As part of this issue, I realized that I didn't understand the architecture of the LLVM backend --- and some care is needed in this matter because of the possibility of link time failures.

After reviewing all of the steps, it seems that we have the following situation:

the LLVM backend itself is compiled, including all of its various components:
- the Java/Scala decision tree generator that produces the dt_*.yaml files
- all of the LLVM-backend native tools are built including
  - llvm-kompile-codegen - takes kore file and dt_*.yaml files and produces object file containing semantics implementation (excluding hooked functions)
  - kprint - takes kore term and definition.kore file and pretty prints the kore term
- the LLVM-backend interpreter runtime libraries are compiled into static library archives
kompile is invoked which does the following:
- frontend takes K file, does analysis/error checking/etc. and produces definition.kore
- backend uses the above info to produce the decision tree directory
- backend invokes llvm-compile-codegen binary which takes definition.kore (for symbols) and decision tree directory (for rules) and compiles semantics implementation into an object file (minus hooked functions)
- backend invokes opt binary (from LLVM project) to optimize semantics object file
- backend invokes clang to link object file generated in previous step with runtime library archives (for hooked function definitions) compiled with the LLVM backend itself (and possibly with any user-provided C code/object files)

Is the picture I have painted above accurate? This would be helpful to iron out both for the blog post and for our internal documentation.

Baltoli commented 1 year ago

Is the picture I have painted above accurate?

Yes, that's a very good summary of the backend architecture - thanks @sskeirik!

Baltoli commented 1 year ago

Initial documentation here: https://github.com/runtimeverification/llvm-backend/pull/707

Still plenty more to write + clean up the runtime API header.

runtimeverification / llvm-backend

LLVM-Backend Generated Interpreter C API Documentation #692