Geal / serverless-wasm


investigate JIT usage #9

Open Geal opened 6 years ago

Geal commented 6 years ago

the current version uses wasmi, which is a very nice interpreter, but we might want to JIT (or more precisely AOT) compile wasm modules to native code, for better performance.

Unsolved questions right now:

Geal commented 6 years ago

a year ago, I was working on the idea of using virtualization instructions (like Intel VT-x) to create very small virtual machines, without an OS, that would just execute one function and return.

It turns out those virtualization instructions could map well to wasm: they require that we create a few memory maps for the memory and the code, create a context (i.e. fill registers), and provide arbitrary interrupts to communicate with the host.

I think this would provide a good security umbrella for wasm in native mode.

sunfishcode commented 6 years ago

which JIT engine do we use? LLVM? Cretonne (paging @sunfishcode for this)?

There are projects using Cretonne to JIT WebAssembly on x86-64 right now, and it's complete enough to pass the entire WebAssembly testsuite and run real applications. It does take some work to embed it, though we're working on making it easier, and I'd be happy to help if you're interested in trying it.

how does wasm behaviour map to native code (see cretonne/cretonne#144 for some thoughts): where does the stack go? how do we transform host function calls to native calls? what do we do with traps?

Cretonne's answers to these questions are:

how do wasm's security guarantees hold in native code? Can it access arbitrary memory locations in the process?

WebAssembly linear memory is sandboxed when JITed to native code. Cretonne provides two options:
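The two options are not preserved in this copy of the thread, so the following is only a generic illustration of what sandboxing linear memory means, not Cretonne's generated code. It is a minimal sketch of an explicit bounds check in front of every access; `LinearMemory`, `Trap`, `load_i32` and `store_i32` are hypothetical names invented for this example.

```rust
/// Error raised when an access falls outside linear memory.
#[derive(Debug, PartialEq)]
pub enum Trap {
    OutOfBounds,
}

/// Hypothetical model of a wasm linear memory (not a Cretonne type).
pub struct LinearMemory {
    bytes: Vec<u8>,
}

impl LinearMemory {
    pub fn new(size: usize) -> Self {
        LinearMemory { bytes: vec![0; size] }
    }

    /// Returns the checked start index of a 4-byte access, or a trap.
    fn check(&self, addr: u32, offset: u32) -> Result<usize, Trap> {
        // Compute the effective address in 64 bits so it cannot wrap around.
        let start = addr as u64 + offset as u64;
        if start + 4 > self.bytes.len() as u64 {
            return Err(Trap::OutOfBounds);
        }
        Ok(start as usize)
    }

    /// Loads a little-endian i32, trapping instead of reading host memory.
    pub fn load_i32(&self, addr: u32, offset: u32) -> Result<i32, Trap> {
        let s = self.check(addr, offset)?;
        let mut buf = [0u8; 4];
        buf.copy_from_slice(&self.bytes[s..s + 4]);
        Ok(i32::from_le_bytes(buf))
    }

    /// Stores a little-endian i32, trapping instead of writing host memory.
    pub fn store_i32(&mut self, addr: u32, offset: u32, value: i32) -> Result<(), Trap> {
        let s = self.check(addr, offset)?;
        self.bytes[s..s + 4].copy_from_slice(&value.to_le_bytes());
        Ok(())
    }
}
```

In this model an out-of-range access becomes a `Trap::OutOfBounds` value rather than a read or write elsewhere in the process; a real JIT would emit the equivalent check (or rely on another mechanism) in native code.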

Cretonne does not currently provide any mitigations for Spectre.

Geal commented 6 years ago

ok! So, where should I start to support cretonne JIT in this project? Are there code examples I could follow?

sunfishcode commented 6 years ago

Here are some examples:

There's more to say about stack overflow checks and indirect call sandboxing, but that's a start. Let me know if you have any questions!

sunfishcode commented 6 years ago

Another example is the wasm runtime in Nebulet.

sunfishcode commented 6 years ago

Have you had a chance to look into this yet? If not, no worries, but if so, I'd be interested in how it's gone.

Geal commented 6 years ago

I have not had much time these days, and I'm working on async networking first, but this is definitely the next feature I'm working on :)

Geal commented 6 years ago

ok, so now that I've had time to play with async networking, I have a much better idea of the runtime I want, and I can get to integrating cretonne :)

sunfishcode commented 6 years ago

@Geal Have you had a chance to play with cretonne yet? If so, I'm curious how it's gone :).

Geal commented 6 years ago

@sunfishcode I started playing with it: 1fab6e2fdb41d4bc23. I used the cretonne_wasm crate to parse the wasm files (my ModuleEnvironment implementation is just a copy of DummyEnvironment for now). From there, I'm a bit confused about the required steps, so correct me if I'm wrong:

I don't really understand where I'm supposed to provide host functions. From what I understand, the ModuleEnvironment gets the list of imports from the wasm file, but I don't see how to match them to my local functions. Also, I rely on a patched version of wasmi to support pausing the interpreter. I do that with traps so the interpreter stops immediately, then modify the stack to emulate a correct result when I jump back into the interpreter. Is it something that would be doable with that JIT implementation?
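To make the pause-and-resume idea concrete, here is a minimal sketch of a toy stack machine that stops at a host call and continues once the host has pushed a result. It does not use wasmi or its patched API; `Op`, `Outcome`, `Interpreter` and `resume_with` are hypothetical names made up for this illustration.

```rust
/// Instructions of a toy stack machine (hypothetical, not wasm opcodes).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Push(i32),
    Add,
    /// Call the host function with the given index; pauses the interpreter.
    CallHost(u32),
}

/// What happened when the interpreter stopped running.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// The code ran to the end; carries the top of the stack, if any.
    Finished(Option<i32>),
    /// The code "trapped" on a host call and can be resumed later.
    Paused { host_index: u32 },
}

pub struct Interpreter {
    code: Vec<Op>,
    pc: usize,
    stack: Vec<i32>,
}

impl Interpreter {
    pub fn new(code: Vec<Op>) -> Self {
        Interpreter { code, pc: 0, stack: Vec::new() }
    }

    /// Runs until the code ends or a host call pauses execution.
    pub fn run(&mut self) -> Outcome {
        while self.pc < self.code.len() {
            let op = self.code[self.pc];
            // Advance first, so a resume continues after the host call.
            self.pc += 1;
            match op {
                Op::Push(v) => self.stack.push(v),
                Op::Add => {
                    let b = self.stack.pop().expect("stack underflow");
                    let a = self.stack.pop().expect("stack underflow");
                    self.stack.push(a.wrapping_add(b));
                }
                Op::CallHost(host_index) => return Outcome::Paused { host_index },
            }
        }
        Outcome::Finished(self.stack.last().copied())
    }

    /// Pushes the host's result as if the call had returned, then continues.
    pub fn resume_with(&mut self, result: i32) -> Outcome {
        self.stack.push(result);
        self.run()
    }
}
```

For example, running `[Push(1), CallHost(7), Add]` pauses with `host_index: 7`; calling `resume_with(41)` afterwards finishes with 42 on top of the stack. With JIT-compiled code the saved state would be machine registers and the native stack rather than these two fields.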

sunfishcode commented 6 years ago

Cool, I'll take a look at what you have soon!

Yes, the simplejit-demo is compiling a toy language, so it needs its own translation. cretonne-wasm performs translation for wasm.

And yeah, the infrastructure for supplying host functions isn't very advanced yet. I'll give more specific advice once I have a chance to look at your code.

If I understand your question about traps, the answer is yes: If JIT code traps for any reason (including an explicit trap instruction), you can handle it with a signal handler. And as long as the stack and register state is preserved (or saved and restored), you can jump back to it at any time.

sunfishcode commented 6 years ago

the ModuleEnvironment got a list of Function that correspond to the wasm functions compiled to cretonne IR, so I apparently don't need to reimplement the translate method from simplejit: https://github.com/sunfishcode/simplejit-demo/blob/master/src/jit.rs#L135-L203

That's right. cretonne-wasm can do the translation for you.

I need to create a Module and Context. I have to call the module's declare_function https://github.com/sunfishcode/simplejit-demo/blob/master/src/jit.rs#L88-L90 with the signature I have in my ModuleEnvironment, then call define_function https://github.com/sunfishcode/simplejit-demo/blob/master/src/jit.rs#L97-L99

That's right.

I can apparently create a Context from a Function. Should I use write_data_funcaddr (https://docs.rs/cretonne-module/0.8.0/cretonne_module/struct.Module.html#method.write_data_funcaddr) to indicate the addresses of my host functions?

write_data_funcaddr will arrange for the address of the specified function to be written into the data section.

Then I can call my function directly: https://github.com/sunfishcode/simplejit-demo/blob/master/src/toy.rs#L46

Yeah, I don't actually know if there's a "best" way to do this yet. When you compile a function, simplejit will have a function pointer which can be called from Rust, and the only question is, what's the best way to give Rust a function pointer?
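One common way to hand Rust a callable function pointer is to take the raw code address and `transmute` it to a function pointer type with the expected signature. The sketch below assumes the compiled code uses the C calling convention and takes and returns an `i32`; `fake_finalized_code` is a hypothetical stand-in for whatever address the JIT returns, and here it simply points at an ordinary Rust function.

```rust
use std::mem;

/// An ordinary function standing in for JIT-compiled code.
extern "C" fn add_one(x: i32) -> i32 {
    x.wrapping_add(1)
}

/// Hypothetical stand-in for a JIT handing back the address of finished code.
pub fn fake_finalized_code() -> *const u8 {
    add_one as extern "C" fn(i32) -> i32 as *const u8
}

/// Calls `code` as an `extern "C" fn(i32) -> i32`.
///
/// # Safety
/// `code` must point to executable code with exactly that signature and
/// calling convention; nothing here can verify it.
pub unsafe fn call_i32_to_i32(code: *const u8, arg: i32) -> i32 {
    // Reinterpret the raw address as a typed function pointer.
    let f = unsafe { mem::transmute::<*const u8, extern "C" fn(i32) -> i32>(code) };
    f(arg)
}
```

The signature passed to `transmute` has to match what was declared to the JIT; a mismatch is undefined behaviour, so keeping the cast in one small `unsafe` helper per signature is a reasonable precaution.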

I don't really understand where I'm supposed to provide host functions. From what I understand, the ModuleEnvironment gets the list of imports from the wasm file, but I don't see how to match them to my local functions.

Since you're using simplejit, you can rely on the dlsym functionality. If you declare an Imported function, it should use dlsym (or the equivalent on Windows) to find it.
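If symbol lookup through dlsym is not convenient, the same matching of import names to local functions can be sketched with an explicit table. This is only an illustration of the idea, not part of simplejit or cretonne-wasm; `HostRegistry`, `HostFn`, `host_double` and `host_negate` are hypothetical names, and every host function is assumed to have the signature `fn(i32) -> i32`.

```rust
use std::collections::HashMap;

/// Assumed signature shared by all host functions in this sketch.
pub type HostFn = fn(i32) -> i32;

/// Example host functions that a wasm module might import.
pub fn host_double(x: i32) -> i32 {
    x.wrapping_mul(2)
}

pub fn host_negate(x: i32) -> i32 {
    x.wrapping_neg()
}

/// Maps import names to host functions, playing the role of dlsym.
pub struct HostRegistry {
    symbols: HashMap<String, HostFn>,
}

impl HostRegistry {
    pub fn new() -> Self {
        HostRegistry { symbols: HashMap::new() }
    }

    /// Makes `f` available to modules under the import name `name`.
    pub fn register(&mut self, name: &str, f: HostFn) {
        self.symbols.insert(name.to_string(), f);
    }

    /// Resolves imports in module order, failing on the first unknown name.
    pub fn resolve(&self, imports: &[&str]) -> Result<Vec<HostFn>, String> {
        imports
            .iter()
            .map(|name| {
                self.symbols
                    .get(*name)
                    .copied()
                    .ok_or_else(|| format!("unresolved import: {}", name))
            })
            .collect()
    }
}
```

The resolved vector is indexed the same way as the module's import list, so import number `i` in the wasm file maps to entry `i` of the table.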