googleforgames / quilkin

Quilkin is a non-transparent UDP proxy specifically designed for use with large scale multiplayer dedicated game server deployments, to ensure security, access control, telemetry data, metrics and more.
Apache License 2.0

Filter idea: Scriptable filter #13

Open markmandel opened 4 years ago

markmandel commented 4 years ago

(Placeholder, more to work out)

The idea being, we would integrate some kind of scripting language into Quilkin, such that if you needed custom filters, you could script them, rather than having to use Quilkin as a library and build your own binaries.

This might tie in well with #12, since scripted filters are expected to be slower than native Rust (although that is worth testing experimentally).

Useful resources:

markmandel commented 3 years ago

https://mun-lang.org/ looks interesting

XAMPPRocky commented 3 years ago

The primary blocker I encountered with adding this to the current implementation is that the ReadContext and WriteContext objects are not ABI friendly; in particular, metadata being Box<dyn Any + Send> isn't something we can pass to other languages. So I think the first thing that's required here is to either restrict the current fields to only those that are ABI safe, or define an ABI-safe subset of the context that we pass between languages.

markmandel commented 3 years ago

The primary blocker I encountered with adding this to the current implementation is that the ReadContext and WriteContext objects are not ABI friendly; in particular, metadata being Box<dyn Any + Send> isn't something we can pass to other languages. So I think the first thing that's required here is to either restrict the current fields to only those that are ABI safe, or define an ABI-safe subset of the context that we pass between languages.

I'm assuming this speaks to whether we want to support WASM? Or do we instead want to experiment with embedding a scripting language of some kind?

XAMPPRocky commented 3 years ago

This applies to any language. If we want to be able to send and receive context across language barriers, it needs to be in a format that's safe to copy and represent across those barriers. This excludes dyn Any + Send, as it's not safe to represent outside of Rust. For example, right now you could place a closure in the metadata and run it, which is not something you can do from another language.

So what that means in practice is that the context metadata needs to be restricted to a limited set of types: essentially numbers, strings, lists, enums, and structs. An ABI-safe context must contain only those types and no open generic parameters (you can use generic types, but their parameters must be concrete: Vec<u8>, not Vec<T>).
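As a rough illustration of the restriction described above, here is a minimal sketch of a context whose metadata is limited to a closed set of value types instead of Box<dyn Any + Send>. The names Value, PortableContext, leaf_count and example_context are hypothetical and are not Quilkin's actual types; an actual ABI-safe layout would additionally need a defined memory or wire representation, which this sketch does not attempt.

```rust
use std::collections::BTreeMap;

/// Hypothetical closed set of metadata values: numbers, strings, byte
/// buffers and lists. There is no `dyn Any`, so no closures or other
/// Rust-only values can be stored.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Number(u64),
    Text(String),
    Bytes(Vec<u8>), // concrete parameter: Vec<u8>, not Vec<T>
    List(Vec<Value>),
}

/// Hypothetical context made only of types another language can mirror.
#[derive(Debug, Clone, PartialEq)]
pub struct PortableContext {
    /// Source address in textual form (an assumption for this sketch).
    pub source: String,
    /// Raw packet payload.
    pub contents: Vec<u8>,
    /// Metadata restricted to the closed `Value` set.
    pub metadata: BTreeMap<String, Value>,
}

/// Counts non-list values, showing that every value can be walked
/// exhaustively without any downcasting.
pub fn leaf_count(value: &Value) -> usize {
    match value {
        Value::List(items) => items.iter().map(leaf_count).sum(),
        _ => 1,
    }
}

/// Builds a small example context with two metadata entries.
pub fn example_context() -> PortableContext {
    let mut metadata = BTreeMap::new();
    metadata.insert(
        "tokens".to_string(),
        Value::List(vec![
            Value::Bytes(vec![1, 2, 3]),
            Value::Text("abc".to_string()),
        ]),
    );
    metadata.insert("version".to_string(), Value::Number(1));
    PortableContext {
        source: "127.0.0.1:7000".to_string(),
        contents: b"hello".to_vec(),
        metadata,
    }
}
```

Because the enum is closed, a binding generator or serializer could map each variant to an equivalent in the target language, which is not possible for an arbitrary dyn Any value.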

iffyio commented 2 years ago

Tested out adding both a Lua and a Rhai script filter: https://github.com/iffyio/quilkin/commit/35d2cef4682e3b310f8e7ecbf6f50c191b55a8e1. Both turned out to be painfully slow, unfortunately. Tested with a single filter without any logic, the Rhai filter was 4x slower than a native Rust no-op filter, while Lua was even slower than Rhai (though I suspect part of Lua's slowness was due to having to use a Mutex to get things to compile, essentially serializing packet forwarding).
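To make the Mutex point concrete, here is a minimal sketch of why a single lock-protected interpreter serializes packet handling, next to a per-worker alternative. Interpreter is a hypothetical stand-in, not the code from the linked commit, and whether a per-worker interpreter is practical with a given Lua binding is an assumption this thread does not settle.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a script interpreter that cannot be used
/// from several threads at once.
pub struct Interpreter {
    pub calls: u64,
}

impl Interpreter {
    /// Pretends to run a no-op script against one packet.
    pub fn run(&mut self, packet: &[u8]) -> usize {
        self.calls += 1;
        packet.len()
    }
}

/// All workers share one interpreter behind a Mutex, so only one packet
/// is inside the script at any moment, however many workers exist.
pub fn forward_shared(workers: usize, packets_per_worker: usize) -> u64 {
    let shared = Arc::new(Mutex::new(Interpreter { calls: 0 }));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                for _ in 0..packets_per_worker {
                    // The lock is held for the duration of the script call.
                    let mut interpreter = shared.lock().unwrap();
                    interpreter.run(b"packet");
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = shared.lock().unwrap().calls;
    total
}

/// Each worker owns its interpreter, so no lock is needed and workers
/// do not wait on each other.
pub fn forward_per_worker(workers: usize, packets_per_worker: usize) -> u64 {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            thread::spawn(move || {
                let mut interpreter = Interpreter { calls: 0 };
                for _ in 0..packets_per_worker {
                    interpreter.run(b"packet");
                }
                interpreter.calls
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Both functions process the same number of packets; the difference is only in how much the workers contend with each other, which is what a benchmark would need to measure.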

XAMPPRocky commented 2 years ago

both turned out to be painfully slow unfortunately, tested with a single filter without any logic, the rhai filter was 4x slower than a native rust noop filter while lua was even slower than rhai

Yeah, that's to be expected. I'm interested to see how wasmtime compares, but there's always going to be some slowdown from copying the packet memory into the language VM. Maybe with something like witx-bindgen we could generate more efficient bindings.

markmandel commented 2 years ago

Two thoughts:

1) Do we want to drop scripting altogether? Maybe it just doesn't fit here? Our API surface does seem to be aimed at "if you want a custom filter, use Quilkin as a library and write some Rust" -- and then let's look at (#346) how we can make it the best experience there we can.

2) We discussed in #318 - maybe scripting is only for testing, and not for general usage?

If we did give it up -- does that make some parts of Quilkin simpler for us to design, implement and manage long term?

XAMPPRocky commented 2 years ago

Our API surface does seem to be aimed at "if you want a custom filter, use quilkin as a library and write some Rust -- and then let's look at (Reducing Configuration Representation Boilerplate #346) how we can make it the best experience there we can.

Well, I don't think not having scripting would affect #346 too much really, as the language FFI bindings would just be another set of bindings, and the original motivation of reducing our prost and serde boilerplate is still worth it even if it doesn't have to cover FFI.

We discussed in Create More Complete End-to-End Testing Framework #318 - maybe scripting is only for testing, and not for general usage?

Well, I'm not sure I see how it isn't useful in general. Not every feature's benefits depend on its performance. (I should also note that wasmtime is a JIT compiler, whereas Lua and Rhai are interpreted.) I think scripting is important because it opens the audience of Quilkin developers immensely and makes getting started with the project a lot easier. It's important to keep in mind how much work it would be for users to deploy their custom Quilkin binary for a custom filter versus being able to write a script directly in the YAML that works with the binary that's already available. This discussion also has me thinking that maybe we should consider dynamic library loaded filters to allow you to deploy a custom filter alongside Quilkin instead of needing to build your own Quilkin, but that's for another issue.

Regardless, I think having more dynamic, high-level options to write filters is valuable towards growing the project's community, as in my mind there's no reason for Quilkin to be thought of as a proxy only for Rust projects; its potential is far greater. Scripting, and in particular WebAssembly, has a tonne of advantages for achieving that goal.

If we did give it up -- does that make some parts of Quilkin simpler for us to design, implement and manage long term?

So to summarise, while it might make some things simpler, I think we would be giving up a lot of the potential, and would hurt long term adoption of the project by limiting our customisability story to just using Quilkin as a Rust library.

iffyio commented 2 years ago

The reason I looked into this now is that I recently had to patch Envoy's filter chain for work, and it was so simple to use Istio's EnvoyFilter to add a Lua script dynamically that I figured it would be really powerful for Quilkin too (I definitely wouldn't have wanted to write an Envoy filter and maintain our own build of Envoy just for that patch, though this process would be a lot simpler with Quilkin I think). But there's a very large difference in perf impact between running a script per HTTP request and per UDP packet, and personally, as a user, I wouldn't choose to put either the Rhai or Lua implementation in front of my game servers regardless of how easy it is to use.

Maybe we could still have a script filter but release it with a big red warning that it has perf drawbacks and that you should test it on your game first to see if and how much it's affected. While I can't really see the script filter running on a Quilkin load balancer, I could imagine it running on a Quilkin sidecar, since that has less traffic to process (and also in use cases that aren't game traffic or latency sensitive). Regarding ease of use, I'm usually wary of features that are 'easy to get started with' but turn out not to be production worthy in the first place, which defeats their purpose (I'd rather not introduce one myself :) so I'm not opposed to dropping scripting entirely if we can't get something that works reasonably well for typical cases.

Re WASM, I don't know whether it would be as simple to use or what it would look like, so it's hard to say. E.g. how do I go from writing my .js file to running it in Quilkin, and what's the performance like?

XAMPPRocky commented 2 years ago

Regarding ease of use, I'm usually wary about features that are 'easy to get started with' but turns out they're not really production worthy in the first place which defeats its purpose (I'd rather not introduce one myself :) so I'm not opposed to dropping script entirely if we can't get something that works reasonably well for typical cases.

Definitely, 4x is too much; we need something faster, and we should still add a caveat. I've updated my WASM PR and run a benchmark, and I think the result is much more reasonable: about 25–30% overhead over having no filter.

[Image: Screenshot 2021-08-19 at 12 46 35]

Re wasm, I don't know if wasm would be as simple to use or what it would look like so hard to say. e.g how do I go from my writing my .js file to running it in quilkin + what's the performance like

So right now, WASM would be harder to use than Lua or Rhai, because we would only support WebAssembly itself and not a language that compiles to WASM. So you'd need to compile your program to WebAssembly first before running it with Quilkin. I haven't found a nice AOT (ahead-of-time) language for WASM; I had hoped someone had written a lua2wasm compiler, but it seems no one has yet. (Someone should steal that idea.)

markmandel commented 2 years ago

Notes from community meeting:

XAMPPRocky commented 2 years ago

Noting down for later: Envoy now supports WebAssembly and has xDS configuration for it already, which is another thing that's pushing me towards WebAssembly for this. https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/wasm/v3/wasm.proto#wasm

FWIW, they pass configuration by handing the WASM module a blob of JSON. I think that should be more than enough for us, though we'd want to set up the filter so that it's not passing configuration on every call. I think an API like the following should be more than sufficient.

/// Handle is an opaque reference to an instantiated filter on the WASM side.
type Handle = !;

pub fn quilkin_filter_new(config: Json) -> Handle;
pub fn quilkin_filter_read(handle: &Handle, ctx: ReadContextJson) -> Handle;
pub fn quilkin_filter_write(handle: &Handle, ctx: WriteContextJson) -> Handle;
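
The proposal above can be mocked in plain Rust to show the "configure once, then call per packet" shape. This is only a sketch under stated assumptions: MockModule, FilterInstance and the method names are hypothetical, JSON is carried as an unparsed string, the handle is modelled as a small integer rather than the never type, and read returns the (possibly rewritten) context rather than a Handle, which deviates from the signatures as written.

```rust
use std::collections::HashMap;

/// Opaque reference to a filter instance; modelled here as an integer id
/// (an assumption, the proposal leaves the representation open).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Handle(u32);

/// Hypothetical per-filter state kept on the module side.
struct FilterInstance {
    config_json: String,
    reads: u64,
}

/// Hypothetical stand-in for a loaded WASM module.
#[derive(Default)]
pub struct MockModule {
    next_id: u32,
    instances: HashMap<Handle, FilterInstance>,
}

impl MockModule {
    /// Receives the JSON configuration once and returns a handle, so
    /// later calls do not need to resend the configuration.
    pub fn filter_new(&mut self, config_json: &str) -> Handle {
        let handle = Handle(self.next_id);
        self.next_id += 1;
        self.instances.insert(
            handle,
            FilterInstance { config_json: config_json.to_owned(), reads: 0 },
        );
        handle
    }

    /// Per-packet call: takes only the handle and the context JSON.
    /// Returns the context unchanged (a no-op filter), or None if the
    /// handle is unknown.
    pub fn filter_read(&mut self, handle: Handle, ctx_json: &str) -> Option<String> {
        let instance = self.instances.get_mut(&handle)?;
        instance.reads += 1;
        Some(ctx_json.to_owned())
    }

    /// Number of read calls seen by the given filter instance.
    pub fn reads(&self, handle: Handle) -> Option<u64> {
        self.instances.get(&handle).map(|instance| instance.reads)
    }

    /// Configuration stored when the instance was created.
    pub fn config(&self, handle: Handle) -> Option<&str> {
        self.instances.get(&handle).map(|instance| instance.config_json.as_str())
    }
}
```

A write path would mirror filter_read; it is omitted here to keep the sketch short.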

XAMPPRocky commented 1 year ago

Docker and WasmEdge seem to be the solution for distribution that we're looking for.

https://medium.com/@shyamsundarb/exploring-docker-hubs-wasm-technical-preview-76de28c3b1b4