Open nepp2 opened 4 years ago
I've only just started to support ORC. It's still very rough.
We do have a wrapper on LLVMGetFunctionAddress already, called get_function.
Thanks for all this great info. Hmm..
No problem, thanks for the great library!
I found get_function, but I think it returns a wrapper type with private members, and I couldn't get the pointer out to pass into add_global_mapping.
Also, is the ORC stuff available, or is it still hidden away on a development branch somewhere?
It's there, it's just hidden behind the "experimental" feature flag for the time being. There's not a ton yet, but you can find it here: https://github.com/TheDan64/inkwell/blob/master/src/execution_engine.rs#L518
I think maybe add_global_mapping should take a JitFunction as input? Or a variant of the function should.
Yes that would work, and would probably be easier for new inkwell users to figure out. But then would you need to have some comparable abstraction to handle globals, like JitGlobal?
I'm not sure. Could you explain what constitutes a global in this context?
These functions are generated by the EE, but we don't have globals generated by the EE do we?
I guess I don't understand what's valid to use in the global mapping.. Anything? Or does it have to exist in the EE?
Added a get_function_address method in ef5c3cd492dec823316e7700f183d82217b85d05
@TheDan64 Ah yeah, I guess some of this is not obvious.
Basically any given EE might be hosting a module which has some global variables. For example, variables created with this function.
LLVM seems to model functions as being a specialised kind of global. The API is a bit confusing though. You create functions and globals in two different ways, and you retrieve their addresses from the EE in different ways. However, functions and globals can be linked using the same function. That function is add_global_mapping.
So, suppose I have two modules compiled from some C-like language. The first one defines a global variable and a function:
// module 1
int b = 10;
void foo(int a) {
    print(a + b);
}
The second module imports these, and defines a function of its own that makes use of them:
// module 2
int b;
void foo(int a);
void bar() {
    foo(5); // prints 15
}
The global b and the function foo need to be linked for use in the second module. So I need to do something like this:
fn link_stuff(
    ee1: ExecutionEngine,
    ee2: ExecutionEngine,
    module2: Module)
{
    // Addresses of the definitions hosted by the first engine.
    let foo_address = ee1.get_function_address("foo");
    let b_address = ee1.get_global_address("b");
    // Declarations of the same symbols in the second module.
    let foo: FunctionValue = module2.get_function("foo").unwrap();
    let b: GlobalValue = module2.get_global("b").unwrap();
    // Point the declarations at the definitions.
    ee2.add_global_mapping(foo, foo_address);
    ee2.add_global_mapping(b, b_address);
}
I'm assuming one module per execution engine, as that's what the blog post I linked to earlier recommended for implementing a REPL using MCJIT. Also, in MCJIT this link step has to happen when ee1 is finalised and ee2 is not. As far as I can tell, the finalisation might be triggered as a side-effect of various different API calls (like get_function_address). That's unfortunate because
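The ordering constraint in the comment above can be modelled with a small toy in plain Rust. This is only a sketch: ToyEngine and its methods are hypothetical stand-ins, not inkwell or LLVM types. It assumes, following the observation above, that looking up an address finalises the engine as a side effect. Returning an error for a late mapping is a modelling choice of this sketch, not something the thread says MCJIT does.

```rust
use std::collections::HashMap;

/// Toy stand-in for an MCJIT-style execution engine (hypothetical, not the inkwell type).
#[derive(Default)]
pub struct ToyEngine {
    /// Symbols defined by the module this engine hosts.
    symbols: HashMap<String, usize>,
    /// External symbols mapped in from other engines.
    mappings: HashMap<String, usize>,
    /// Set once the engine has been finalised.
    finalized: bool,
}

impl ToyEngine {
    /// Register a symbol defined by the hosted module.
    pub fn define(&mut self, name: &str, address: usize) {
        self.symbols.insert(name.to_string(), address);
    }

    /// Look up an address. Assumption: the lookup finalises the engine as a side effect.
    pub fn get_address(&mut self, name: &str) -> Option<usize> {
        self.finalized = true;
        self.symbols.get(name).copied()
    }

    /// Map an external symbol. In this model, mapping after finalisation is an error.
    pub fn add_global_mapping(&mut self, name: &str, address: usize) -> Result<(), String> {
        if self.finalized {
            return Err(format!("cannot map `{}`: engine already finalised", name));
        }
        self.mappings.insert(name.to_string(), address);
        Ok(())
    }

    /// Report whether the engine has been finalised.
    pub fn is_finalized(&self) -> bool {
        self.finalized
    }
}
```

In this model, asking ee1 for the address of foo finalises ee1, and the mapping into ee2 succeeds only while ee2 has not yet had any address looked up.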
Is your feature request related to a problem? Please describe.
I can't figure out how to link symbols across different modules using Inkwell (with LLVM 8). There may be some way that I missed, but if not it would be a major limitation. The inkwell kaleidoscope example is excellent, but it avoids cross-module linking by doing a lot of redundant recompilation instead.
The inkwell kaleidoscope example's approach to linking functions is very simple. Each time you enter a new expression into the REPL, every previous expression is recompiled into one monolithic module along with the new expression. This is done to make sure that any function you have defined is part of the same module as the new expression, which means there's no need to do cross-module linking. That works fine for a small demo, but it is not a scalable solution for a larger project.
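To put a rough number on the scalability concern, here is a back-of-the-envelope sketch. It assumes every REPL entry costs one unit of compile work, which is a simplification, and both function names are made up for illustration.

```rust
/// Total expressions compiled after `n` REPL entries when each new entry
/// recompiles every previous entry along with itself: 1 + 2 + ... + n.
pub fn monolithic_compiles(n: u64) -> u64 {
    n * (n + 1) / 2
}

/// Total expressions compiled after `n` REPL entries when each entry is
/// compiled once into its own module and linked to the earlier ones.
pub fn per_module_compiles(n: u64) -> u64 {
    n
}
```

Under that assumption, 100 entries cost 5050 units with the monolithic approach and 100 units with one module per entry, so the redundant work grows quadratically with the length of the session.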
Part of the problem is that the original C++ kaleidoscope example has lived through some big changes to the JIT API. In particular, it was previously possible to add new functions and symbols to an existing module so that no cross-module linking was required. When MCJIT was introduced, this was no longer supported. The new REPL implementation strategy was to compile each expression using a new module and execution engine, and then handle cross-module linking by overriding something called the SectionMemoryManager. You can read about this transition on the official LLVM blog.
Describe the solution you'd like
I am not sure if it is possible to override the built-in linking behaviour through the C API. In my own Inkwell-based project my solution was to link symbols manually when I compile a new module. Whenever the compiler sees an external symbol reference, it searches the previous modules for it, gets its address, and updates the new module with this address.
To do this I used the add_global_mapping function, but I also had to expose two extra LLVM functions in this commit on my Inkwell fork (the generic parameters are an error I fixed in the next commit).
If I am right about the problem, it would be great to expose these functions properly and update the kaleidoscope example so that it matches the behaviour of the C++ version.
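The manual linking described above (search the earlier modules for each external symbol, then feed the address to the new module) might look roughly like this in plain Rust. This is a sketch, not code from the fork: CompiledModule, resolve_symbol and resolve_externals are hypothetical names, and searching newest-first so that redefinitions shadow older ones is an assumption. Each resolved pair would then be handed to add_global_mapping.

```rust
use std::collections::HashMap;

/// Hypothetical record of one previously compiled module:
/// exported symbol name -> address obtained from its execution engine.
pub struct CompiledModule {
    pub exports: HashMap<String, usize>,
}

/// Search the previous modules, newest first, for a symbol's address.
/// Assumption: a later definition shadows an earlier one with the same name.
pub fn resolve_symbol(previous: &[CompiledModule], name: &str) -> Option<usize> {
    previous
        .iter()
        .rev()
        .find_map(|module| module.exports.get(name).copied())
}

/// Resolve every external reference of a new module.
/// Returns the (name, address) pairs to map, or the names that could not be found.
pub fn resolve_externals(
    previous: &[CompiledModule],
    externals: &[&str],
) -> Result<Vec<(String, usize)>, Vec<String>> {
    let mut resolved = Vec::new();
    let mut missing = Vec::new();
    for &name in externals {
        match resolve_symbol(previous, name) {
            Some(address) => resolved.push((name.to_string(), address)),
            None => missing.push(name.to_string()),
        }
    }
    if missing.is_empty() {
        Ok(resolved)
    } else {
        Err(missing)
    }
}
```

Collecting all missing names before failing lets a REPL report every unresolved symbol in one go instead of stopping at the first.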
Describe possible drawbacks to your solution
add_global_mapping function is the unsafe part of the operation.
Describe alternatives you've considered