Open Geal opened 6 years ago
a year ago, I was working on the idea of using virtualization instructions (like Intel VT-x) to create very small virtual machines, without an OS, that would just execute one function and return.
It turns out those virtualization instructions could map well to wasm: they require that we create a few memory maps for the memory and the code, create a context (i.e. fill registers), and provide arbitrary interrupts to communicate with the host.
I think this would provide a good security umbrella for wasm in native mode.
which JIT engine do we use? LLVM? Cretonne (paging @sunfishcode for this)?
There are projects using Cretonne to JIT WebAssembly on x86-64 right now, and it's complete enough to pass the entire WebAssembly testsuite and run real applications. It does take some work to embed it, though we're working on making it easier, and I'd be happy to help if you're interested in trying it.
how does wasm behaviour map to native code (see cretonne/cretonne#144 for some thoughts): where does the stack go? how do we transform host function calls to native calls? what do we do with traps?
Cretonne's answers to these questions are:
Implement the TrapSink trait and Cretonne will tell you the address of every instruction that is expected to trap, so if you install signal handlers for the traps, you can map the signal to an expected trap.
how do wasm's security guarantees hold in native code? Can it access arbitrary memory locations in the process?
WebAssembly linear memory is sandboxed when JITed to native code. Cretonne provides two options:
Cretonne does not currently provide any mitigations for Spectre.
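To make the sandboxing idea concrete, here is a minimal sketch of one common approach: an explicit bounds check before every linear memory access. This is plain Rust, not what Cretonne actually emits, and `LinearMemory` and `OutOfBounds` are hypothetical names made up for the illustration.

```rust
/// Hypothetical stand-in for a wasm linear memory; not a Cretonne type.
pub struct LinearMemory {
    bytes: Vec<u8>,
}

/// Error returned when an access falls outside the linear memory.
#[derive(Debug, PartialEq)]
pub struct OutOfBounds;

impl LinearMemory {
    /// Create a zero-filled memory of `size` bytes.
    pub fn new(size: usize) -> Self {
        LinearMemory { bytes: vec![0; size] }
    }

    /// Return the checked byte range for a 4-byte access at `addr + offset`.
    fn check(&self, addr: u32, offset: u32) -> Result<std::ops::Range<usize>, OutOfBounds> {
        // Compute the effective address in 64 bits so the sum cannot wrap.
        let start = addr as u64 + offset as u64;
        let end = start + 4;
        if end > self.bytes.len() as u64 {
            return Err(OutOfBounds);
        }
        Ok(start as usize..end as usize)
    }

    /// Load a little-endian u32, trapping (here: returning Err) when out of bounds.
    pub fn load_u32(&self, addr: u32, offset: u32) -> Result<u32, OutOfBounds> {
        let range = self.check(addr, offset)?;
        let mut buf = [0u8; 4];
        buf.copy_from_slice(&self.bytes[range]);
        Ok(u32::from_le_bytes(buf))
    }

    /// Store a little-endian u32, with the same bounds check as `load_u32`.
    pub fn store_u32(&mut self, addr: u32, offset: u32, value: u32) -> Result<(), OutOfBounds> {
        let range = self.check(addr, offset)?;
        self.bytes[range].copy_from_slice(&value.to_le_bytes());
        Ok(())
    }
}
```

Because every address is relative to the start of `bytes` and checked against its length, code going through these accessors cannot reach memory outside the sandboxed region. A JIT would inline an equivalent check, or avoid it where it can prove the access is safe.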
ok! So, where should I start to support cretonne JIT in this project? Are there code examples I could follow?
Here are some examples:
There's more to say about stack overflow checks and indirect call sandboxing, but that's a start. Let me know if you have any questions!
Another example is the wasm runtime in Nebulet.
Have you had a chance to look into this yet? If not, no worries, but if so, I'd be interested in how it's gone.
I have not had much time these days, and I'm working on async networking first, but this is definitely the next feature I'm working on :)
ok, so now that I've had time to play with async networking, I have a much better idea of the runtime I want, and I can get to integrating cretonne :)
@Geal Have you had a chance to play with cretonne yet? If so, I'm curious how it's gone :).
@sunfishcode I started playing with it: 1fab6e2fdb41d4bc23. I used the cretonne_wasm crate to parse the wasm files (my ModuleEnvironment implementation is just a copy of DummyEnvironment for now). From there, I'm a bit confused about the required steps, so correct me if I'm wrong:
the ModuleEnvironment got a list of Function that correspond to the wasm functions compiled to cretonne IR, so I apparently don't need to reimplement the translate method from simplejit: https://github.com/sunfishcode/simplejit-demo/blob/master/src/jit.rs#L135-L203
I need to create a Module<SimpleJITBackend> and Context. I have to call the module's declare_function https://github.com/sunfishcode/simplejit-demo/blob/master/src/jit.rs#L88-L90 with the signature I have in my ModuleEnvironment, then call define_function https://github.com/sunfishcode/simplejit-demo/blob/master/src/jit.rs#L97-L99
I can apparently create a Context from a Function. Should I use write_data_funcaddr to indicate the addresses of my host functions?
I don't really understand where I'm supposed to provide host functions. From what I understand, the ModuleEnvironment gets the list of imports from the wasm file, but I don't see how to match them to my local functions.
Also, I rely on a patched version of wasmi to support pausing the interpreter. I do that with traps so the interpreter stops immediately, then modify the stack to emulate a correct result when I jump back into the interpreter. Is it something that would be doable with that JIT implementation?
Cool, I'll take a look at what you have soon!
Yes, the simplejit-demo is compiling a toy language, so it needs its own translation. cretonne-wasm performs translation for wasm.
And yeah, the infrastructure for supplying host functions isn't very advanced yet. I'll give more specific advice once I have a chance to look at your code.
If I understand your question about traps, the answer is yes: If JIT code traps for any reason (including an explicit trap instruction), you can handle it with a signal handler. And as long as the stack and register state is preserved (or saved and restored), you can jump back to it at any time.
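As a rough illustration of mapping a signal back to an expected trap, here is a small lookup table keyed by code offset. It assumes the JIT reports the offset of every instruction that may trap and that the signal handler knows the faulting address; `TrapTable` and `TrapKind` are hypothetical names, not Cretonne types, and the signal handler itself is left out.

```rust
/// Hypothetical trap kinds; a real JIT would use its own trap codes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TrapKind {
    HeapOutOfBounds,
    IntegerDivisionByZero,
    Unreachable,
}

/// Records the code offset of each instruction that is expected to trap.
/// Entries are kept sorted by offset so lookups can use binary search.
#[derive(Default)]
pub struct TrapTable {
    entries: Vec<(usize, TrapKind)>,
}

impl TrapTable {
    /// Record that the instruction at `offset` (relative to the start of the
    /// compiled code) may trap with `kind`.
    pub fn record(&mut self, offset: usize, kind: TrapKind) {
        match self.entries.binary_search_by_key(&offset, |e| e.0) {
            Ok(i) => self.entries[i].1 = kind,
            Err(i) => self.entries.insert(i, (offset, kind)),
        }
    }

    /// Map a faulting address reported by a signal handler back to the
    /// expected trap, if any. `code_base` is where the compiled code was loaded.
    pub fn lookup(&self, code_base: usize, fault_addr: usize) -> Option<TrapKind> {
        // An address below the code base cannot belong to this code.
        let offset = fault_addr.checked_sub(code_base)?;
        self.entries
            .binary_search_by_key(&offset, |e| e.0)
            .ok()
            .map(|i| self.entries[i].1)
    }
}
```

A handler would call `lookup` with the faulting program counter: `Some(kind)` means the fault is an expected wasm trap, while `None` means it is a genuine crash that should not be swallowed.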
the ModuleEnvironment got a list of Function that correspond to the wasm functions compiled to cretonne IR, so I apparently don't need to reimplement the translate method from simplejit: https://github.com/sunfishcode/simplejit-demo/blob/master/src/jit.rs#L135-L203
That's right. cretonne-wasm can do the translation for you.
I need to create a Module and Context. I have to call the module's declare_function https://github.com/sunfishcode/simplejit-demo/blob/master/src/jit.rs#L88-L90 with the signature I have in my ModuleEnvironment, then call define_function https://github.com/sunfishcode/simplejit-demo/blob/master/src/jit.rs#L97-L99
That's right.
I can apparently create a Context from a Function. Should I use write_data_funcaddr to indicate the addresses of my host functions?
write_data_funcaddr will arrange for the address of the specified function to be written into the data section.
should I use https://docs.rs/cretonne-module/0.8.0/cretonne_module/struct.Module.html#method.write_data_funcaddr then I can call my function directly: https://github.com/sunfishcode/simplejit-demo/blob/master/src/toy.rs#L46
Yeah, I don't actually know if there's a "best" way to do this yet. When you compile a function, simplejit will have a function pointer which can be called from Rust, and the only question is, what's the best way to give Rust a function pointer?
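One way to hand Rust a callable function pointer is to transmute the raw code pointer into a typed `fn` pointer. The sketch below shows only that step: `add_one` stands in for JIT output and `finished_code` is a hypothetical helper, so no Cretonne API is involved.

```rust
use std::mem;

/// Stand-in for JIT-compiled code. In a real setup the pointer would come
/// from the JIT backend once the function has been finalized.
extern "C" fn add_one(x: i32) -> i32 {
    x.wrapping_add(1)
}

/// Hypothetical helper: pretend this is what the JIT hands back, an untyped
/// pointer to the first byte of the finished function.
pub fn finished_code() -> *const u8 {
    let f: extern "C" fn(i32) -> i32 = add_one;
    f as *const u8
}

/// Call a raw code pointer as an `extern "C" fn(i32) -> i32`.
///
/// # Safety
/// The caller must guarantee that `code` points to executable code with
/// exactly this signature and calling convention. The compiler cannot
/// check either.
pub unsafe fn call_i32_to_i32(code: *const u8, arg: i32) -> i32 {
    unsafe {
        // Reinterpret the untyped pointer as a typed function pointer.
        let f: extern "C" fn(i32) -> i32 = mem::transmute(code);
        f(arg)
    }
}
```

The transmute is the unavoidable unsafe part: the type you give the pointer has to match the signature the function was declared with, so it is worth keeping that cast in one small, well-documented place.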
I don't really understand where I'm supposed to provide host functions. From what I understand, the ModuleEnvironment gets the list of imports from the wasm file, but I don't see how to match them to my local functions.
Since you're using simplejit, you can rely on the dlsym functionality. If you declare an Imported function, it should use dlsym (or the equivalent on Windows) to find it.
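If relying on dlsym is not an option, an alternative is an explicit registry that matches each wasm import to a host function by module and field name. This is only a sketch of that alternative, not how simplejit resolves symbols; `HostRegistry`, `HostFn`, `host_add` and the `&[i64]` calling convention are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical uniform signature for host functions: all arguments are
/// passed as i64 values and one i64 is returned.
pub type HostFn = fn(&[i64]) -> i64;

/// Example host function that a wasm module could import.
pub fn host_add(args: &[i64]) -> i64 {
    args.iter().sum()
}

/// Maps (module, field) import names to host functions.
#[derive(Default)]
pub struct HostRegistry {
    funcs: HashMap<(String, String), HostFn>,
}

impl HostRegistry {
    /// Make `f` available to wasm modules under `module.field`.
    pub fn register(&mut self, module: &str, field: &str, f: HostFn) {
        self.funcs.insert((module.to_string(), field.to_string()), f);
    }

    /// Resolve the imports listed by a wasm module, in order. Fails with a
    /// message naming the first import that has no registered host function.
    pub fn resolve(&self, imports: &[(&str, &str)]) -> Result<Vec<HostFn>, String> {
        imports
            .iter()
            .map(|&(module, field)| {
                self.funcs
                    .get(&(module.to_string(), field.to_string()))
                    .copied()
                    .ok_or_else(|| format!("unresolved import {}.{}", module, field))
            })
            .collect()
    }
}
```

The import list would come from whatever the ModuleEnvironment collected while parsing the wasm file. Failing at resolve time, before any code runs, gives a clearer error than a missing symbol at call time.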
the current version uses wasmi, which is a very nice interpreter, but we might want to JIT (or more precisely AOT) compile wasm modules to native code, for better performance.
Unsolved questions right now: