Open hawkw opened 4 years ago
@iximeow undoubtedly knows more about and/or has more opinions about this than i do
has more opinions about this than i do
i extremely have opinions, hello
i generally don't like the idea of jitting particularly because i'd love system-wide w^x and jits generally don't do w^x. there's also Strange performance implications for jits since binary layout might change and if you run out of memory while jitting a function, the error mode is.. not good. "sorry, this function call failed because you're oom" is weird, but maybe less weird than i initially thought (stack overflows also can happen, and introduce the same error. hmm.)
i was thinking that AOT compiling and caching that would be very promising, particularly because there's a lot more flexibility post-compilation (and with ABIs!) when you're guaranteed to have something you can reason about and modify. vDSO-style hacks to make getpid fast could literally be an inlined constant! that's wild!
caching the artifacts from that compilation and parameters that were used would be a fair next step, and i'd want to sign those for the same error/tamper-detection reasons you mentioned. signing the whole cache at once wouldn't be great, but signing individual artifacts seems like it would work well as it wouldn't have a linearly-scaling penalty to add or remove objects.
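A minimal sketch of the per-artifact idea, assuming a cache keyed by module hash plus compilation parameters. Every name here (`ArtifactKey`, `CodeCache`, `toy_tag`) is hypothetical, and `toy_tag` is only a stand-in for a real signature: `DefaultHasher` is not cryptographic. The point is only that each entry carries its own tag, so adding or removing one artifact never touches the others.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical cache key: which module, compiled with which parameters.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ArtifactKey {
    pub module_hash: u64,
    pub compile_flags: String,
}

/// One cached artifact with its own tag, so entries verify independently.
pub struct Artifact {
    pub code: Vec<u8>,
    pub tag: u64,
}

/// Stand-in for a real signature. DefaultHasher is NOT cryptographic;
/// a real kernel would use a proper signature or MAC scheme here.
pub fn toy_tag(secret: u64, key: &ArtifactKey, code: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    secret.hash(&mut h);
    key.hash(&mut h);
    code.hash(&mut h);
    h.finish()
}

pub struct CodeCache {
    secret: u64,
    entries: HashMap<ArtifactKey, Artifact>,
}

impl CodeCache {
    pub fn new(secret: u64) -> Self {
        Self { secret, entries: HashMap::new() }
    }

    /// Adding an artifact tags only that artifact.
    pub fn insert(&mut self, key: ArtifactKey, code: Vec<u8>) {
        let tag = toy_tag(self.secret, &key, code.as_slice());
        self.entries.insert(key, Artifact { code, tag });
    }

    /// Removing an artifact needs no re-signing of the others.
    pub fn remove(&mut self, key: &ArtifactKey) -> bool {
        self.entries.remove(key).is_some()
    }

    /// Returns the code only if the entry exists and its tag still matches.
    pub fn get_verified(&self, key: &ArtifactKey) -> Option<&[u8]> {
        let artifact = self.entries.get(key)?;
        if toy_tag(self.secret, key, artifact.code.as_slice()) == artifact.tag {
            Some(artifact.code.as_slice())
        } else {
            None
        }
    }
}
```

Because the compile flags are part of the key, the same module compiled with different parameters is simply a different entry.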
FURTHER: if you particularly trust a program (eg if eliza wrote it) there's no reason you couldn't load it right into ring 0, trusting the code generation/wasm validation/optimizations to be correct. on the other hand if you don't trust the program (eg i wrote it), you could load it into a more restricted space with an explicit transition to do kernel tasks (spelled "ring 3" on x86), BUT this makes features like kpti, or smep, or other anti-speculation features also optional on trust levels! if codegen is deferred until load-time, you can just flip the 'this is scary please isolate as much as possible' knob to yes.
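A rough sketch of what that knob could look like, assuming a two-level trust setting that is consulted at load time. `TrustLevel`, `IsolationConfig`, and `isolation_for` are made-up names for illustration, and the exact set of mitigations per level is an assumption, not a decided policy.

```rust
/// Hypothetical trust setting chosen when a module is loaded.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TrustLevel {
    /// e.g. a program whose author and codegen we are willing to rely on.
    MoreTrusted,
    /// e.g. a program we would rather isolate as much as possible.
    Untrusted,
}

/// Hypothetical bundle of isolation choices made at load time.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct IsolationConfig {
    /// x86 protection ring the generated code runs in.
    pub ring: u8,
    /// Whether kernel page-table isolation is enabled for this program.
    pub kpti: bool,
    /// Whether SMEP is relied on for this program.
    pub smep: bool,
    /// Whether extra anti-speculation mitigations are emitted or enabled.
    pub speculation_mitigations: bool,
}

/// Maps a trust level to isolation settings (assumed policy, for illustration).
pub fn isolation_for(trust: TrustLevel) -> IsolationConfig {
    match trust {
        TrustLevel::MoreTrusted => IsolationConfig {
            ring: 0,
            kpti: false,
            smep: false,
            speculation_mitigations: false,
        },
        TrustLevel::Untrusted => IsolationConfig {
            ring: 3,
            kpti: true,
            smep: true,
            speculation_mitigations: true,
        },
    }
}
```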
the extremely problematic suggestion follows: if startup time is a concern, we could also have a tiered approach where a vm is used while a module is compiled iff the module is not already present with the right compilation flags in our code cache. (cranelift takes its sweet time doing codegen right now, and will probably get better soon enough for this to not matter, but it would be cool)
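The "iff" in that tiered approach boils down to a single cache lookup. A small sketch, assuming the code cache can be queried by module hash plus compilation flags; `CacheKey`, `Plan`, and `plan_for` are hypothetical names.

```rust
use std::collections::HashSet;

/// Hypothetical identity of a compiled module: its hash plus the flags used.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct CacheKey {
    pub module_hash: u64,
    pub flags: String,
}

/// What the loader does with a module at startup.
#[derive(Debug, PartialEq, Eq)]
pub enum Plan {
    /// A matching artifact exists: run the cached native code directly.
    RunCachedNative,
    /// No matching artifact: start in the vm while compilation runs.
    InterpretWhileCompiling,
}

/// Uses the vm iff the module is not cached with exactly the wanted flags.
pub fn plan_for(cache: &HashSet<CacheKey>, wanted: &CacheKey) -> Plan {
    if cache.contains(wanted) {
        Plan::RunCachedNative
    } else {
        Plan::InterpretWhileCompiling
    }
}
```

A module cached with different flags counts as a miss here, which matches "present with the right compilation flags".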
fwiw, I also think JITting is not great and would prefer to avoid. in general, while thinking through this i also thought that making the kernel solely responsible for managing the compilation of wasms into realcodes and for deciding what hardware protection ring to run it in was probably the best design; imo, the main downside is "it involves a whole buncha additional stuff we don't have to think about otherwise"
if you particularly trust a program (eg if eliza wrote it)
you are making a big mistake :) :) :) :) :)
signing individual artifacts
im assuming that we would need to use like "some kinda sgx bullshit" to generate the kernel signing key and keep it secure and i absolutely do not know about this stuff
on the other hand if you don't trust the program (eg i wrote it), you could load it into a more restricted space with an explicit transition to do kernel tasks (spelled "ring 3" on x86), BUT this makes features like kpti, or smep, or other anti-speculation features also optional on trust levels! if codegen is deferred until load-time, you can just flip the 'this is scary please isolate as much as possible' knob to yes.
strong agree that there should be (eventually) different modes for "untrusted" vs "less untrusted" (note i never said "trusted" :wink:) programs and that it's fine to accept even significantly worse performance in "random executable i received in an email from a stranger" mode. presumably this would tie in with kernel-level permissions as well (e.g. code in ring 3 maximal distrust mode cannot spawn processes that are not also maximum distrust).
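The spawn rule in that example can be sketched as a one-line check. This assumes distrust levels form an ordering and generalizes the example to "a child must be at least as distrusted as its parent", which is my assumption rather than anything decided in this thread; `Distrust` and `may_spawn` are hypothetical names.

```rust
/// Hypothetical distrust levels; derive order follows declaration order,
/// so `Less < Maximal`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Distrust {
    Less,
    Maximal,
}

/// A process may only spawn children that are at least as distrusted as
/// itself, so maximal-distrust code can never create a less-distrusted one.
pub fn may_spawn(parent: Distrust, child: Distrust) -> bool {
    child >= parent
}
```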
i am most interested in figuring out "what is execution model for applications i actually intend to run"; locked down modes for extremely untrusted code can be added easily on top of that imo
"some kinda sgx bullshit" to generate the kernel signing key
For Maximal Security we could have the kernel, which holds the signing key, encrypt it with a key derived from a passphrase or smth such that tampering would be obvious (and brick the system,,,,). i just kinda don't know sgx and it won't be present on a lot of systems so eeewwww
otherwise yes, 'just run programs and introduce trust levels later' seems like a good plan. i'm just excited
there is an important difference that will come up eventually in that x86 'syscalls' made from ring 0 need different entrypoints than ones from ring 3, so we'll probably want to just macroify wrappers or something.
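A toy sketch of the "macroify wrappers" idea, assuming each syscall has one shared body and the macro stamps out one entry point per caller ring. Nothing here touches real hardware: in a real kernel the ring 3 path would go through a trap or `syscall` instruction while the ring 0 path could be a plain call. All names (`CallerRing`, `getpid_impl`, `syscall_entrypoints!`) are invented for illustration.

```rust
/// Which ring the call came from (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CallerRing {
    Ring0,
    Ring3,
}

/// Placeholder syscall body shared by both entry points. It echoes the
/// caller ring back so the two paths can be told apart; the pid is a dummy.
pub fn getpid_impl(caller: CallerRing) -> (CallerRing, u32) {
    (caller, 1)
}

/// Generates a ring 0 and a ring 3 entry point for one syscall body.
/// Both names are passed explicitly because `macro_rules!` cannot
/// concatenate identifiers on its own.
macro_rules! syscall_entrypoints {
    ($ring0:ident, $ring3:ident => $imp:ident -> $ret:ty) => {
        /// Entry point for callers already running in ring 0.
        pub fn $ring0() -> (CallerRing, $ret) {
            $imp(CallerRing::Ring0)
        }
        /// Entry point for callers arriving from ring 3.
        pub fn $ring3() -> (CallerRing, $ret) {
            $imp(CallerRing::Ring3)
        }
    };
}

syscall_entrypoints!(getpid_from_ring0, getpid_from_ring3 => getpid_impl -> u32);
```

Load-time codegen would then pick `getpid_from_ring0` or `getpid_from_ring3` depending on where the program is placed.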
otherwise yes, 'just run programs and introduce trust levels later' seems like a good plan. i'm just excited
same! i just think a good starting place is "what is the model for applications i trust?" since that will be the scenario where i most care about things like "performance". in use-cases like "i'm poking at some software to determine if it's evil or not", i think it's reasonable to assume willingness to take arbitrary perf hits; people run untrusted software in VMs etc all the time.
people run untrusted software in VMs etc all the time.
(of course, people also run trusted software in VMs all the time...i just think they shouldn't have to)
So, a thought: it definitely seems like having the kernel manage AOT compilation of wasms is the correct approach. However, it also has perhaps the most moving parts, and introduces some more subsystems (managing the compiled artefact cache + ensuring its integrity).
Would it be worthwhile to consider starting with a wasm interpreter to run wasms & adding in AOT compilation transparently? That way, we can start "running code" sooner & iterate on the userland as well. I guess my question is whether this would make our lives easier or harder?
A well-designed interface should make it easy-enough to switch between different wasm backends. It should be reasonable to start with an interpreter, and move on from there. That should help with getting pieces put together before we get AOT compilation, caching, etc. fully set-up.
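One possible shape for such an interface, as a sketch only: a trait the kernel talks to, with an interpreter and an AOT backend behind it. The trait, both structs, and `launch` are hypothetical, and the backends are placeholders that only check the wasm magic bytes (`\0asm`) instead of executing anything.

```rust
/// Hypothetical interface the kernel uses to run a wasm module.
pub trait WasmBackend {
    /// Human-readable backend name, for logging.
    fn name(&self) -> &'static str;
    /// Runs the module and returns its exit code.
    fn run(&mut self, module: &[u8]) -> Result<i32, String>;
}

/// Every wasm binary starts with the magic bytes `\0asm`.
fn has_wasm_magic(module: &[u8]) -> bool {
    module.starts_with(&[0x00, 0x61, 0x73, 0x6d])
}

/// Placeholder interpreter backend: validates the header only.
pub struct Interpreter;

impl WasmBackend for Interpreter {
    fn name(&self) -> &'static str {
        "interpreter"
    }
    fn run(&mut self, module: &[u8]) -> Result<i32, String> {
        if !has_wasm_magic(module) {
            return Err("not a wasm module".to_string());
        }
        // A real interpreter would decode and execute the module here.
        Ok(0)
    }
}

/// Placeholder AOT backend: validates the header only.
pub struct AotCompiler;

impl WasmBackend for AotCompiler {
    fn name(&self) -> &'static str {
        "aot"
    }
    fn run(&mut self, module: &[u8]) -> Result<i32, String> {
        if !has_wasm_magic(module) {
            return Err("not a wasm module".to_string());
        }
        // A real backend would compile to native code, then jump to it.
        Ok(0)
    }
}

/// Kernel-side entry point: identical no matter which backend is plugged in.
pub fn launch(
    backend: &mut dyn WasmBackend,
    module: &[u8],
) -> Result<(&'static str, i32), String> {
    let code = backend.run(module)?;
    Ok((backend.name(), code))
}
```

Since `launch` only sees `&mut dyn WasmBackend`, starting with the interpreter and swapping in AOT later would not change any caller.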
Yeah, and there may also be a use-case for both, as well — e.g., if I'm running mycelium and I want to, say, test some code I'm writing, I might want to take the perf hit of the wasm interpreter in exchange for not having to spend extra time compiling my wasm into realcode. On the other hand, when the OS is fully featured enough to support doing software development on it (read: "nebulous future"), that wasm interpreter could also be a userspace thing...
However, it also has perhaps the most moving parts, and introduces some more subsystems (managing the compiled artefact cache + ensuring its integrity).
we can totally just AOT every time we load a wasm and worry about caching later. the "AOT" bit can just write to memory and we can copy that somewhere appropriate in the kernel and run it
this reminds me that lucet has some opinions about relocations (namely that they can be applied) so maybe using wasmtime to interpret modules would be easier to get running.
i have no idea what the landscape of wasm interpreters is, whatever's there it probably would involve even fewer moving pieces to start with.
between AOT or JIT, we can probably have the same interface from the kernel, so :woman_shrugging:
between AOT or JIT, we can probably have the same interface from the kernel, so 🤷‍♀️
yeah, that was what I thought, we can swap the execution model out transparently.
we can totally just AOT every time we load a wasm and worry about caching later. the "AOT" bit can just write to memory and we can copy that somewhere appropriate in the kernel and run it
👍
The simplest wasm interpreter I know of is probably watt's. It's pretty allocation-heavy, has a weird API, and contains no unsafe code. If it's easy enough to use a proper interpreter like wasmtime's, we should go for it, but we might be able to hack something small together with watt's interpreter relatively easily. I think we'd only need Vec (+ String), Rc, and (maybe) HashMap.
or, "how is wasm formed? how user get executable?"
are we...
...running a wasm VM in the kernel?
...pre-compiling code w/ lucet?