envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
25.1k stars 4.82k forks

Support WebAssembly (WASM) in Envoy #4272

Open louiscryan opened 6 years ago

louiscryan commented 6 years ago

Title: Support WebAssembly (WASM) in Envoy

Description: WebAssembly [1] provides an embeddable and safe execution environment for platform extensions. While primarily intended for Web browsers, the WASM community is also supportive of other embeddings [2]. WASM is now supported by the major browser vendors and has significant development momentum behind it. In comparison with LuaJIT, it has the advantage of institutional support and, perhaps more interestingly, a compiler toolchain that works with many popular languages: C/C++, Rust, Golang, TypeScript (AssemblyScript [3]), etc.

There are clearly many use-cases for embedding in Envoy, from simple HTTP header customization to custom protocol handlers. We've had quite a bit of experience building C++ extensions to Envoy as part of Istio and I think many if not all of them could easily target a WASM runtime instead and as such we are willing to contribute to building this out.

I've chatted with the v8 team and this is something they're interested in supporting too, but they need some time to make the component more separable. Perhaps using something like the wabt [4] interpreter would make sense for a first prototype, but suggestions are welcome.

Relevant Links
[1] https://webassembly.org/
[2] https://webassembly.org/docs/non-web/
[3] https://github.com/AssemblyScript/assemblyscript
[4] https://github.com/WebAssembly/wabt#running-wasm-interp

@htuch @lizan @rshriram

htuch commented 6 years ago

Also, we could maybe compile and run Envoy in WASM, since we build under Clang today. Not sure of practical applications beyond a form of sandboxing. Will be challenging to deal with threading model and networking.

louiscryan commented 6 years ago

FYI @dio

louiscryan commented 6 years ago

@rianhunter (Wasmjit) @AndrewScheidecker (WAVM) @titzer @binji (WABT & v8)

Hey WASM folks! Wondering if you'd be willing to provide your 2c about the suitability of your projects for this effort.

John Plevyak did a little tire kicking with WAVM, and I've chatted a bit with Ben Smith & Ben Titzer about the roadmap for v8 and progress on a standard embedding proposal.

/cc @jplevyak2 @PiotrSikora

rianhunter commented 6 years ago

Hey thanks for adding me. Wasmjit can be suitable for this with a couple of caveats:

  1. You may need to provide your own runtime environment, Wasmjit allows for this.
  2. Wasmjit currently only supports x86_64; an interpreter will be added in the not too distant future.

On the first point, I'm currently in the process of building out an emscripten-compatible runtime in Wasmjit. This essentially means providing POSIX system calls as host functions with which client WebAssembly modules can link. This allows you to run normal C/C++ programs compiled with emscripten that target POSIX. This may be what you want but since you're doing this as a plugin interface, it seems more likely that you'd want to build your own runtime environment and disallow the plugins from calling into POSIX system calls. You can relatively easily do this with wasmjit, and I can hold your hand a bit as well.
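A minimal sketch of that idea, and not Wasmjit's actual API: the host keeps a table of the functions it is willing to provide, and linking fails if a plugin imports anything else. The module name `envoy`, the function names, and the `link` helper are all hypothetical, chosen only for illustration.

```python
# Hypothetical model of link-time import resolution; not Wasmjit's real API.

class LinkError(Exception):
    """Raised when a module imports something the host does not provide."""


def link(requested_imports, host_env):
    """Resolve each (module, name) import against the host's table.

    Anything missing from host_env, such as a POSIX call, is rejected.
    """
    resolved = {}
    for key in requested_imports:
        if key not in host_env:
            raise LinkError(f"import not provided by host: {key[0]}.{key[1]}")
        resolved[key] = host_env[key]
    return resolved


# The host exposes only plugin-specific functions (names are made up).
HOST_ENV = {
    ("envoy", "get_header"): lambda name: {"host": "example.com"}.get(name, ""),
    ("envoy", "log"): lambda message: len(message),
}

# A plugin that stays within the host interface links successfully.
plugin_ok = link([("envoy", "get_header")], HOST_ENV)

# A plugin that expects a POSIX-style `open` is refused at link time.
try:
    link([("envoy", "log"), ("env", "open")], HOST_ENV)
    rejected = False
except LinkError:
    rejected = True
```

Rejecting the module at link time, rather than failing when the call is made, means a plugin that depends on POSIX never starts running.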

On the second point, I don't know what your target platform is. If it's x86_64, then hop on board. If it isn't, then you'll have to wait as I add an interpreter.

One last note, Wasmjit was architected to be embeddable. It's <10000 SLOC, it's architecturally simple, and doesn't require libc. You should be able to relatively easily copy the source files into your repo and build.

AndrewScheidecker commented 6 years ago

WAVM is also meant to be used in this kind of scenario where it is embedded in another application, though I think with quite different trade-offs from wasmjit. It uses libc and LLVM.

LLVM is quite heavy-weight in terms of both code size and the time it takes to translate the WebAssembly code to native code. It wouldn't make sense in a browser VM, but may be well-suited to an application that compiles a WebAssembly module once at startup, and spends a lot of time running the generated code.

WAVM uses signal handling to make bounds-checking of memory accesses as fast as possible. I believe that v8 supports both signal handling and a slower fallback if installing a signal handler isn't possible. I am very curious to know what your constraints around signal handling are:

  • Is it possible for WAVM to install a SIGSEGV/SIGBUS/SIGFPE handler?
  • Do you have an existing signal handler for SIGSEGV/SIGBUS/SIGFPE that would need to cooperate with WAVM's signal handler?
  • Do you need to be able to handle WebAssembly traps? If so, do you prefer an exception or a return code?

> On the first point, I'm currently in the process of building out an emscripten-compatible runtime in Wasmjit. This essentially means providing POSIX system calls as host functions with which client WebAssembly modules can link. This allows you to run normal C/C++ programs compiled with emscripten that target POSIX. This may be what you want but since you're doing this as a plugin interface, it seems more likely that you'd want to build your own runtime environment and disallow the plugins from calling into POSIX system calls. You can relatively easily do this with wasmjit, and I can hold your hand a bit as well.

I've had trouble maintaining the emulation of the interface that Emscripten uses to call its bundled JavaScript code from the generated WebAssembly code. The interface doesn't change very fast, but it is an ad hoc interface and cares more about download size than being easy to emulate, so I think emulating it is a dead end.

For Wavix (https://github.com/WAVM/Wavix), I'm trying to define a more stable syscall ABI. Envoy would probably want to implement a toolchain in the same way, but hook libc up to a more limited syscall ABI that doesn't provide full POSIX functionality.

It looks like there might be some motion in the last day or two to standardize a syscall ABI and a libc that uses it here (https://github.com/WebAssembly/reference-sysroot), but I'm not sure yet what the scope of that project is.

rianhunter commented 6 years ago

Wasmjit is somewhat targeted at constrained environments where all the code can be relatively easily audited, memory allocation is tight, and advanced facilities like signals aren't provided. The trade off here is that the generated code will perform worse, but not sure how much worse than, say, LLVM (which is probably the high end of Wasm code generation). The upside is that we have a shot of getting merged into mainline Linux which is our primary goal.

RE: Emscripten runtime API. My long term goal is to coordinate with the emscripten team to evolve their ad-hoc API into a standard POSIX API for WASM. I would also support any other effort to standardize a POSIX API, but I'm not aware of any project that is able to compile standard POSIX-conforming programs to WebAssembly today like emscripten. Wavix looks interesting; I see it's modeled after musl.

jplevyak2 commented 6 years ago

I can take a couple of these:

The use case of pre-compiling WASM in the control plane is interesting as it would save recompiling on every node. Some form of coroutines would be interesting (the ability to restart the VM after it has called out, e.g. setjmp/longjmp).

On Wed, Oct 3, 2018 at 7:15 PM Andrew Scheidecker wrote:

WAVM uses signal handling to make bounds-checking of memory accesses as fast as possible. I believe that v8 supports both signal handling and a slower fallback if installing a signal handler isn't possible. I am very curious to know what your constraints around signal handling are:

  • Is it possible for WAVM to install a SIGSEGV/SIGBUS/SIGFPE handler?
  • Do you have an existing signal handler for SIGSEGV/SIGBUS/SIGFPE that would need to cooperate with WAVM's signal handler?
  • Do you need to be able to handle WebAssembly traps? If so, do you prefer an exception or a return code?
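For context on the trade-off behind these questions: without a signal handler, the runtime has to check every memory access explicitly. The sketch below models that slower fallback in Python and shows the two trap-reporting styles the last question asks about. `Trap`, `load_u8`, and `load_u8_rc` are hypothetical names and are not part of WAVM or v8.

```python
class Trap(Exception):
    """Models a WebAssembly trap surfaced to the host as an exception."""


def load_u8(memory, addr):
    """Explicitly bounds-checked load; reports a trap by raising."""
    if addr < 0 or addr >= len(memory):
        raise Trap(f"out-of-bounds memory access at {addr}")
    return memory[addr]


def load_u8_rc(memory, addr):
    """Same check, but reports the trap as an (ok, value) return code."""
    try:
        return True, load_u8(memory, addr)
    except Trap:
        return False, 0


memory = bytearray(65536)  # one 64 KiB WebAssembly page
memory[10] = 42
```

With the exception style the embedder wraps each call into the plugin in a handler; with the return-code style every call site has to inspect the status.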

For Wavix https://github.com/WAVM/Wavix, I'm trying to define a more stable syscall ABI. Envoy would probably want to implement a toolchain in the same way, but hook libc up to a more limited syscall ABI that doesn't provide full POSIX functionality.

It looks like there might be some motion in the last day or two to standardize a syscall ABI and a libc that uses it here https://github.com/WebAssembly/reference-sysroot, but I'm not sure yet what the scope of that project is.

binji commented 6 years ago

> It looks like there might be some motion in the last day or two to standardize a syscall ABI and a libc that uses it here, but I'm not sure yet what the scope of that project is.

cc @sunfishcode and @lukewagner for more info about WebAssembly/reference-sysroot.

> The use case of pre-compiling WASM in the control plane is interesting as it would save recompiling on every node.

Depending on how far you'd want to push this, there is also the possibility of using tools like wasm2c.

sunfishcode commented 6 years ago

The reference-sysroot project is meant as a first step in that direction. It doesn't yet specify any syscalls, but these are the kinds of discussions it's aiming to support.

rianhunter commented 6 years ago

@sunfishcode Can we move this discussion somewhere else? I have a few comments I'd like to make on a potential design. Also maybe worth pulling in the relevant people from emscripten? Not sure if @kripken is the one who designed the emscripten interface.

sunfishcode commented 6 years ago

@rianhunter Sure, we can open an issue here.

mattklein123 commented 6 years ago

All, one high-level question from me: Does WASM support coroutines? IMO this is what makes the Lua support so amazing from an end-user standpoint. Mainly curious if we can replicate this programming simplicity in WASM or if the WASM plugins will need to follow the standard Envoy async filter API model.

In terms of arch support, x86_64 is highest priority, but demand for ARM will increase steadily (both v8 and v7 IMO) so I think some plan for how we get to native ARM would be useful to understand (not interpreted).

binji commented 6 years ago

> Does WASM support coroutines?

Not yet, but there have been plans to add it for a while, and some ideas how it might work.

Golang wasm implements their coroutines by wrapping the functions in a switch(basic_block) construct, and providing their own scheduler, though that will have a performance hit.
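The general technique can be sketched as follows. This is an illustration of the switch-on-resume-point idea in Python, not the Go compiler's actual output: each function keeps its locals and a resume point in an object, every call to `step` runs one block and returns, and a small round-robin scheduler decides who runs next. All names here are hypothetical.

```python
from collections import deque


class Counter:
    """A 'function' split into blocks, resumed via a dispatch on self.block."""

    def __init__(self, name, limit, log):
        self.name, self.limit, self.log = name, limit, log
        self.block = 0  # which basic block to run on the next resume
        self.i = 0      # local variable kept alive across yields

    def step(self):
        """Run one block; return False once the function has finished."""
        if self.block == 0:          # entry block
            self.i = 0
            self.block = 1
            return True
        if self.block == 1:          # loop body, then yield to the scheduler
            if self.i >= self.limit:
                self.block = 2
                return True
            self.log.append(f"{self.name}:{self.i}")
            self.i += 1
            return True
        return False                 # exit block


def run(tasks):
    """Round-robin scheduler: step each task until all have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        if task.step():
            queue.append(task)


log = []
run([Counter("a", 2, log), Counter("b", 2, log)])
```

The dispatch at the top of every `step` call is overhead that a native coroutine would avoid, which is the kind of performance hit mentioned above.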

dio commented 6 years ago

What is the blessed way of letting .wasm access host's APIs (envoy will be the host)? Host bindings (https://github.com/WebAssembly/design/issues/1148)?

binji commented 6 years ago

@dio: currently it depends on how you embed the wasm VM in your application. That said, we have discussed creating a C/C++ embedding API to standardize this. You can see the current work-in-progress at rossberg/wasm-c-api, though AFAIK it only supports v8 at the moment.

lukewagner commented 6 years ago

@dio Host Bindings (which maybe should now be renamed "Web IDL Bindings") are mostly just useful for removing the need for JS glue code in a standard JS/wasm environment, not exposing new APIs. Rather, the way to expose host-specific functionality to wasm is just through function imports. In a new host environment, you would probably want to define certain magic module name strings which, when imported by wasm modules running on your host, contain as exports whatever host-defined functions you want. Thus, when wasm imports and calls an export from your host-builtin module, it's really calling into your host's native code. Then you can make up whatever interface you like, e.g., taking i32, i32 pairs to denote [begin, end) ranges of linear memory, etc, because the callee is native code that can do whatever it wants.

But of course it makes sense to avoid an unnecessary proliferation of host-specific builtin module APIs that cause .wasms to be needlessly unportable, so I'm definitely excited about standardizing some portable builtin modules like @sunfishcode said.
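A rough model of the function-import mechanism described above, written in Python rather than against a real WebAssembly runtime. The builtin module name `envoy`, the `log` export, and the `Instance` class are assumptions made up for this sketch. The host function receives two i32 values denoting a [begin, end) range and reads the bytes straight out of the module's linear memory.

```python
# Hypothetical host-side model; a real embedding would use a WebAssembly runtime.

class Instance:
    """Holds a module's linear memory, which host functions can read directly."""

    def __init__(self, pages=1):
        self.memory = bytearray(pages * 65536)


def make_host_module(instance, sink):
    """Build the exports of a made-up builtin module named 'envoy'."""

    def log(begin, end):
        # begin/end are i32 offsets denoting the half-open range [begin, end).
        if not 0 <= begin <= end <= len(instance.memory):
            raise ValueError("range outside linear memory")
        sink.append(bytes(instance.memory[begin:end]).decode("utf-8"))
        return 0

    return {"log": log}


instance = Instance()
messages = []
imports = {"envoy": make_host_module(instance, messages)}

# What the guest would do: write a string into its memory, then call the import.
text = b"hello from wasm"
instance.memory[16:16 + len(text)] = text
status = imports["envoy"]["log"](16, 16 + len(text))
```

Because the callee is native code, it validates the range itself before touching memory; the guest only ever passes integers.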

dio commented 6 years ago

Thanks, @binji, @lukewagner! I'll take a look at those refs.

drichelson commented 5 years ago

This is really exciting. Thanks for making this magic happen! Any updates in the new year on wasm support in Envoy?

PiotrSikora commented 5 years ago

@jplevyak and I are both actively working on this.

There should be a design doc soon (I'm working on it right now), and PRs in late Q1 / early Q2.

sambercovici commented 5 years ago

I concur, this is really exciting. Can you share any progress on the design / PRs? Do you plan to address the ABI required for the WASM code to interact with Envoy? Do you also plan to address the xDS extensions needed to configure such WASM modules?

jplevyak commented 5 years ago

New pull request for a null sandbox, compiling the C++ code directly into Envoy while using the WASM PROXY API: https://github.com/istio/envoy/pull/63

jplevyak commented 5 years ago

Pull request merged: gRPC support for WASM https://github.com/istio/envoy/pull/55

mattklein123 commented 5 years ago

@envoyproxy/wasm-dev I made you https://github.com/envoyproxy/envoy-wasm and you should all have write access. LMK if you need anything else.

PiotrSikora commented 5 years ago

@mattklein123 could you re-create it as a fork of envoyproxy/envoy? Otherwise, we can't open PRs against it from the existing personal forks of "envoy" repo.

mattklein123 commented 5 years ago

@PiotrSikora unfortunately you can't fork a project into the same org, which I agree is really unfortunate. You will need to fork that repo.

ae6rt commented 5 years ago

I'll follow my own advice about dumb questions (if you have one, just get it out there and ask): will WASM in an Envoy context be able to read and write Unix domain sockets?

erikbos commented 4 years ago

Will WASM support route-specific filtering?