WebAssembly / WASI

WebAssembly System Interface

Capabilities granularity too low #1

Closed sunfishcode closed 5 years ago

sunfishcode commented 5 years ago

@npmccallum @dumblob

Continuing the discussion from https://github.com/CraneStation/wasmtime/issues/90, now that we have a dedicated repository (GitHub doesn't permit issues to be transferred between repositories).

As an example, let's say we have TCP-like sockets. We want to restrict the WASM to connect() only to a single address. If you're running on Linux (most all of us are), you just put the runtime in a cgroup and set iptables rules on that cgroup. The connect() command will fail on everything but the specified address. But this has nothing to do with the API itself. This is entirely enforced by the host OS. Where the host OS doesn't provide such capabilities, the runtime can add them. But it doesn't impact the API.

"Put the runtime in a cgroup" is a privileged operation on typical Linux systems. And, it requires a dedicated process, which many WebAssembly runtimes won't otherwise need -- WebAssembly's ability to be easily sandboxed without a process boundary is one of the key things many people are interested in it for (caveat: Spectre is a complex topic).

Lin Clark's blog post about WASI has an explanation of the capability model and why we're pursuing it for WASI. In many use cases, it's possible to set up cgroups, AppArmor, SELinux, or similar mechanisms; however, these systems all have enough obstacles (configuration complexity and the need for privileged operations) that in practice they aren't always used. They're also all process-oriented, while the capability model allows WASI to provide more fine-grained protections. And they're all tied to Linux, while we expect WASI will be desirable in a very diverse set of environments.

(File descriptors are integers, and are therefore forgeable, but (a) even so they still provide some protections, and (b) in the future WASI will be able to represent capabilities as references which aren't forgeable.)

Assuming all of this is sensible, the next step is to apply these concepts to networking. We could create the analog of a directory for TCP-like sockets, which might look like a set of address/port/address/port/protocol tuples, possibly involving wildcards and/or netmasks or so, bundled up into a "capability", and presented to application code in the form of a file descriptor. That would then be the basis for something like a bindat system call, which would be to bind as openat is to open.
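
To make the idea concrete, here is a minimal sketch of what such a network capability could look like. Everything in it is an assumption made for illustration: the names `NetRule`, `NetCapability`, and `check_bindat` are hypothetical and not part of any WASI proposal, only IPv4 is handled, and the check merely decides whether an address is permitted rather than binding a real socket.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// One rule in a hypothetical network capability: an IPv4 network
/// (address plus prefix length) and an optional port (`None` = any port).
#[derive(Debug, Clone, Copy)]
pub struct NetRule {
    pub network: Ipv4Addr,
    pub prefix_len: u8,
    pub port: Option<u16>,
}

impl NetRule {
    fn matches(&self, addr: &SocketAddr) -> bool {
        let ip = match addr.ip() {
            IpAddr::V4(ip) => ip,
            IpAddr::V6(_) => return false, // this sketch handles IPv4 only
        };
        // Build the netmask; a prefix length of 0 matches every address.
        let bits = u32::from(self.prefix_len.min(32));
        let mask = if bits == 0 { 0 } else { u32::MAX << (32 - bits) };
        let same_net = (u32::from(ip) & mask) == (u32::from(self.network) & mask);
        let same_port = self.port.map_or(true, |p| p == addr.port());
        same_net && same_port
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum NetError {
    NotCapable,
}

/// The capability itself: the networking analog of a pre-opened directory.
pub struct NetCapability {
    rules: Vec<NetRule>,
}

impl NetCapability {
    pub fn new(rules: Vec<NetRule>) -> Self {
        NetCapability { rules }
    }

    /// Hypothetical `bindat`-style check: the address is accepted only if
    /// at least one rule held by this capability covers it.
    pub fn check_bindat(&self, addr: SocketAddr) -> Result<SocketAddr, NetError> {
        if self.rules.iter().any(|rule| rule.matches(&addr)) {
            Ok(addr)
        } else {
            Err(NetError::NotCapable)
        }
    }
}
```

With a capability holding the single rule `10.0.0.0/8`, port 8080, a request for `10.1.2.3:8080` would pass, while `10.1.2.3:22` or `192.168.0.1:8080` would be rejected, in the same way that `openat` cannot escape the directory it was given.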

dumblob commented 5 years ago

@sunfishcode hm, you're totally right that Turing-complete stuff in security is a nightmare. In my opinion, though, it's possible to engineer it (the syntax, the bytecode, etc.; see Ethereum IR) in a way that makes it easy to mathematically prove its correctness.

I have to admit I find the current proposal still too coarse (but sane on the other hand :wink:). Maybe we shall put most of the effort into making capabilities very easily extensible in the future (both on the WASI specification level and especially on the WASI implementation level reflecting fine-grained needs - I can imagine e.g. some minimal WASI implementation and then some security-oriented WASI implementation which will provide this additional fine-grained capability specification with a "standard library" of fine-grained capability building blocks).

There are also more things to consider (this relates to API as a whole, not just capabilities):

  1. File descriptors (or references to them) are not portable enough, as they assume a tree-based filesystem, which is not used everywhere (especially not on smaller devices like many smartphones or IoT devices, which instead use databases, from plain key-value DBs up to table-based or relational DBs). Providing only the tree abstraction in the API therefore sounds quite limiting and inefficient. We might consider either providing something more generic, e.g. a tag-based API (i.e. having "labels/tags" (or their negations) designating mutually orthogonal capabilities, where each label/tag can also have multiple assigned editable values, e.g. strings and floats, designating properties of the capability), or making the API extensible so that implementations can decide how to map efficiently to the underlying system.

  2. The network API has to support multipath TCP (the proposal with tuples src_addr, dst_addr, src_port, dst_port, protocol seems to actually support it in a sense, but still it doesn't feel like a first class citizen - see e.g. http://blog.multipath-tcp.org/blog/html/2018/12/17/multipath_tcp_apis.html ).

npmccallum commented 5 years ago

@sunfishcode Your explanation covers why you don't want to use iptables. It doesn't explain why capabilities are an API-level concern.

sunfishcode commented 5 years ago

File descriptors are just indices into a table of currently open resources, and are not inherently tree-oriented. The path APIs (currently named path_*) use file descriptors which represent directories, which are tree-oriented, but file descriptors can also represent open files, open sockets, or other things.
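
As a rough illustration of that point, the sketch below models a descriptor table in which an integer is only a key and the entry decides what kind of resource it is. The types `Resource` and `DescriptorTable` and the method `open_relative` are hypothetical names chosen for this example; `open_relative` is only loosely modeled on the path_* family and is not its real signature.

```rust
use std::collections::HashMap;

/// What a descriptor can refer to. Only `Directory` is tree-oriented.
#[derive(Debug, PartialEq, Eq)]
pub enum Resource {
    Directory(String),
    File(String),
    Socket(String),
}

/// A table of currently open resources, keyed by integer descriptor.
#[derive(Default)]
pub struct DescriptorTable {
    next: u32,
    open: HashMap<u32, Resource>,
}

impl DescriptorTable {
    /// Register a resource and return the integer that now names it.
    pub fn insert(&mut self, resource: Resource) -> u32 {
        let fd = self.next;
        self.next += 1;
        self.open.insert(fd, resource);
        fd
    }

    pub fn get(&self, fd: u32) -> Option<&Resource> {
        self.open.get(&fd)
    }

    pub fn close(&mut self, fd: u32) -> Option<Resource> {
        self.open.remove(&fd)
    }

    /// Path-style operation: meaningful only when `dir_fd` names a directory.
    /// Returns `None` for unknown descriptors and for non-directories.
    pub fn open_relative(&mut self, dir_fd: u32, rel: &str) -> Option<u32> {
        let path = match self.open.get(&dir_fd)? {
            Resource::Directory(base) => format!("{}/{}", base, rel),
            _ => return None,
        };
        Some(self.insert(Resource::File(path)))
    }
}
```

In this model a socket descriptor and a directory descriptor live in the same table, but only the directory accepts path operations, which is the sense in which the table itself is not tree-oriented.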

From feedback here and offline, it seems to make sense to at least move the path APIs into a separate module. If someone is interested in designing a database-oriented API, that would also be an interesting module (or modules) to consider.

I don't have experience with multipath TCP myself, but from the blog post there, it looks like the main things needed are something like setsockopt with extra options, and possibly some additional options for socket creation. One question is, should opening a socket in multipath mode vs non-multipath mode be separate rights?

dumblob commented 5 years ago

One question is, should opening a socket in multipath mode vs non-multipath mode be separate rights?

The answer depends directly on the "granularity depth" of all rights/capabilities. The issue with multipath TCP is that the "view" depends on the needs of the programmer/user at the moment, i.e. whether she wants to see separate TCP subflows or just a group of TCP subflows, where the group itself can act as one "bigger" TCP flow.

So I believe we need to offer both in the API (I don't dare to propose what such an API should look like, since due to virtual networking we could in theory build even deeper trees of TCP (sub)flows...).

dumblob commented 5 years ago

File descriptors are just indices into a table of currently open resources, and are not inherently tree-oriented. The path APIs (currently named path_*) use file descriptors which represent directories, which are tree-oriented, but file descriptors can also represent open files, open sockets, or other things.

In that case a WASI file descriptor describes something different than a traditional file descriptor from OpenGroup (POSIX) and even different from a Windows handle. I find it very confusing and would definitely not use the term file descriptor, as they won't describe a file (directory) nor any other "traditional unix-like resource", but rather point to some resource-specific "bag of data" without any other meaning. If we want to separate the path API into a separate module (which I support) and create other modules like a simple DB API, then we can't use file descriptors, as in DBs there is absolutely no such thing (it doesn't make any sense to wrap a DB API around file descriptors). On the other hand, we also shouldn't use e.g. the term index, as it's too DB-oriented.

Also, opening a resource again sounds too traditional - I would say we're rather requesting partial access to the resource (partial meaning anywhere on a continuous scale from none to full), and if the request matches the granted permissions (capabilities), then we'll get something like a pointer to this "instance view of the resource".

Therefore a more general term like pointer to resource or pointer to requested resource or pointer to a view of a resource could fit WASI purposes better (I'm not good at coining terms, so feel free to come up with better ideas).

sunfishcode commented 5 years ago

POSIX does use the term "file descriptor" for sockets, shared memory, and other resources. It's well-established traditional usage, though I can also see how it can cause confusion. In some UNIX circles the terms "object descriptor" or just "descriptor" appear, although they're sometimes used interchangeably with "file descriptor" even in the same document. I'd be ok switching to these other terms if there's general support for it; my only concern would be that they may not be as widely recognized.

My guess is that open granting partial access is something many people will understand, as O_RDONLY etc. are well-established, and that we're unlikely to find a word as concise and recognized as open, however I'd be ok switching to a different term if there's consensus for it.

Ericson2314 commented 5 years ago

The low granularity that exists is really nice. The parametricity / separation logic arguments one can make are both stronger and simpler, and, coupled with https://webassembly.org/docs/future-features/#multiple-tables-and-memories, will support a rich distributive lattice of "processes" and capabilities, rather than Unix's meager discrete processes.

@npmccallum If you want more legacy support, that could be some standardized extension. Proposing that certain functions always return ENOSYS is extremely harmful, as it prevents detecting errors at compile time and fostering a robust ecosystem respecting the full suite of layers (cf. Rust: imagine if on embedded you still had to use std but stuff would panic. How can you trust a library to use the right subset that actually works? Eww.)

Speaking of layering, has there been any effort going back to CloudABI and changing things to match on their end? Until/unless WASI starts leveraging the specifics of WASM's memory model, it would be nice to have a defined interface orthogonal to WASM itself. Even once WASI does employ that stuff, it would be interesting to keep around the "old WASI" as a layer above or below (depending on how things shake out.) It is my experience from both Rust and https://github.com/NixOS/nixpkgs/blob/master/lib/systems/parse.nix that combinatorially exploding interfaces are good for portability, as they focus a fine-grained capability-like view of portability (depend on exactly the functionality you need, not on some arbitrary Boolean expression of larger interfaces that happen to provide it.)

dumblob commented 5 years ago

The low granularity that exists is really nice. The parametricity / separation logic arguments one can make are both stronger and simpler, and, coupled with https://webassembly.org/docs/future-features/#multiple-tables-and-memories, will support a rich distributive lattice of "processes" and capabilities, rather than Unix's meager discrete processes.

Do I understand correctly that the lower granularity of capabilities will then need to be managed on a lower layer in which the whole WASM runtime will run? This basically says that to achieve higher capability granularity we'll need to deconstruct WASI apps into "services" (or the like) even smaller than the current usual "microservices", and bear the burden of specifying additional capabilities for the separate "microservices" (corresponding to separate memories/tables) in the WASM runtime in a non-portable, non-standard way (each combination of a specific operating system and a specific WASM runtime implementation will have its own way).

I hope I completely misunderstood, as this would basically make the current situation even worse (because of the pressure to decompose everything into even more "microservices" than is usual).

@npmccallum If you want more legacy support, that could be some standardized extension. Proposing that certain functions always return ENOSYS is extremely harmful, as it prevents detecting errors at compile time and fostering a robust ecosystem respecting the full suite of layers (cf. Rust: imagine if on embedded you still had to use std but stuff would panic. How can you trust a library to use the right subset that actually works? Eww.)

:+1:

Ericson2314 commented 5 years ago

@dumblob I loathe "microservices" so I hope not :). I could rant for reams on why, but I think the short answer here is that normal processes/containers are bad because while the granularity of encapsulation is small, the granularity of degrees of isolation is huge: mmap and whatnot is too hard to use on its own, so the app devolves to share everything or share nothing. Share nothing means business logic gets balkanized amid endless marshalling nonsense, and all hell ensues.

The dream with capabilities is basically & and &mut for inter-process communication. In other words share exactly what you want, built up from compositional primitives.
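
For readers less familiar with Rust, the analogy can be sketched inside a single program. This is only an illustration of `&` and `&mut` themselves, not of any WASI mechanism; `AppState`, `lookup`, `bump`, and `handle_request` are hypothetical names invented for the example.

```rust
/// State owned by one component.
pub struct AppState {
    pub config: Vec<(String, String)>,
    pub counters: Vec<u64>,
}

/// Needs read access to the configuration only, so it takes a shared borrow.
pub fn lookup<'a>(config: &'a [(String, String)], key: &str) -> Option<&'a str> {
    config
        .iter()
        .find(|entry| entry.0 == key)
        .map(|entry| entry.1.as_str())
}

/// Needs write access to the counters only, so it takes an exclusive borrow.
pub fn bump(counters: &mut [u64], index: usize) {
    if let Some(counter) = counters.get_mut(index) {
        *counter += 1;
    }
}

/// The owner lends out exactly the pieces each callee needs: `lookup`
/// never sees the counters, and `bump` never sees the configuration.
pub fn handle_request(state: &mut AppState) -> Option<u64> {
    let mode = lookup(&state.config, "mode")?;
    let index = if mode == "fast" { 0 } else { 1 };
    bump(&mut state.counters, index);
    state.counters.get(index).copied()
}
```

The compiler enforces that `lookup` cannot modify anything and that `bump` cannot touch the configuration. The hope expressed above is for a comparable guarantee between separately sandboxed components rather than between functions.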

dumblob commented 5 years ago

The dream with capabilities is basically & and &mut for inter-process communication. In other words share exactly what you want, built up from compositional primitives.

Sounds good to me, but I still can't figure out how could I accomplish that with the current "low granularity" WASI API :cry:.

kentonv commented 5 years ago

Can a WASI process (are they called "processes"?) implement a file descriptor that another process calls?

Normally in capability systems (and, indeed, object-oriented programming in general), the way we theoretically support arbitrary granularity is to allow anyone to implement their own classes representing whatever granularity they need, possibly layered on top of a coarser-grained underlying API.

In a capability-based operating system API, this means one process should be able to create a file descriptor such that when another process performs operations on that file descriptor, they end up calling back into the first process.

Of course, this context switching may be a performance problem for many use cases. So we then add optimizations for common use cases. The platform/runtime has built-in support for these common cases, and can add support for new ones that prove useful over time. There's no need to predict everything in advance -- developers can get started implementing types "in userspace" and then push for runtime support later as an optimization.
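
A minimal sketch of that layering, written as ordinary Rust types rather than real cross-process descriptors: a coarse capability grants access to a whole buffer, and a "userspace" wrapper narrows it to a window. The trait `Readable` and the types `WholeBuffer` and `Window` are hypothetical names for this example, and no actual inter-process call is involved.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum AccessError {
    OutOfRange,
}

/// Hypothetical interface that every descriptor-like object implements.
pub trait Readable {
    fn read_at(&self, offset: usize, len: usize) -> Result<Vec<u8>, AccessError>;
}

/// Coarse-grained capability: grants access to the entire buffer.
pub struct WholeBuffer {
    data: Vec<u8>,
}

impl WholeBuffer {
    pub fn new(data: Vec<u8>) -> Self {
        WholeBuffer { data }
    }
}

impl Readable for WholeBuffer {
    fn read_at(&self, offset: usize, len: usize) -> Result<Vec<u8>, AccessError> {
        let end = offset.checked_add(len).ok_or(AccessError::OutOfRange)?;
        self.data
            .get(offset..end)
            .map(|bytes| bytes.to_vec())
            .ok_or(AccessError::OutOfRange)
    }
}

/// Finer-grained capability implemented on top of any `Readable`:
/// it exposes only `len` bytes starting at `start` of the inner object.
pub struct Window<R: Readable> {
    inner: R,
    start: usize,
    len: usize,
}

impl<R: Readable> Window<R> {
    pub fn new(inner: R, start: usize, len: usize) -> Self {
        Window { inner, start, len }
    }
}

impl<R: Readable> Readable for Window<R> {
    fn read_at(&self, offset: usize, len: usize) -> Result<Vec<u8>, AccessError> {
        // Reject any request that would reach past the end of the window.
        let end = offset.checked_add(len).ok_or(AccessError::OutOfRange)?;
        if end > self.len {
            return Err(AccessError::OutOfRange);
        }
        let absolute = self.start.checked_add(offset).ok_or(AccessError::OutOfRange)?;
        self.inner.read_at(absolute, len)
    }
}
```

Whoever holds a `Window` can only call `read_at` on it and has no path back to the underlying `WholeBuffer`. A runtime could later provide a built-in equivalent as an optimization without changing what callers see.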

File descriptors are integers, and are therefore forgeable

Terminology nitpick: In the capability-based security theory sense, we would say that file descriptors are in fact unforgeable, in that you cannot reference some other process's file descriptors by simply using the same numbers. Compare this to capability systems based on secret strings (e.g. API keys), where anyone in the world who knows the secret string can access the capability. We say that secret strings are "forgeable" although still "unguessable".

sunfishcode commented 5 years ago

Two wasm instances can share a WASI file descriptor index space. And right now, if you want to pass an open file from one instance to another, it's required that they do. So in the current system, instances can indeed forge each others' file descriptors. To the extent that instances are being used like processes, that's not ideal.

(This is one of the reasons I've been saying that WASI isn't Unix. WebAssembly doesn't have a Unix-style process concept. There are fundamental differences, many of which derive from the ES6 module system that wasm inherits from JS.)

In the future, reference types are coming to wasm. These are unforgeable values, in every sense of the word. One function can't even forge a reference held by another function in the same instance. This is the fine-grained capability model that some people are really excited about. And, instances can pass references to each other directly. This will likely become the core capability value primitive of WASI, with integer file descriptors being interpreted as indices into a table of references (though actual implementations may do other things internally).
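
The difference between the two kinds of value can be approximated in plain Rust, under the assumption that module privacy stands in for the guarantees reference types would provide. The names `runtime::Handle`, `runtime::Host`, and `InstanceTable` are hypothetical and exist only for this sketch.

```rust
mod runtime {
    /// Opaque capability. The field is private, so code outside this
    /// module cannot construct a `Handle`; it can only receive one.
    #[derive(Debug)]
    pub struct Handle {
        id: u64,
    }

    impl Handle {
        pub fn id(&self) -> u64 {
            self.id
        }
    }

    /// Stand-in for the host, the only party able to mint handles.
    pub struct Host {
        next_id: u64,
    }

    impl Host {
        pub fn new() -> Self {
            Host { next_id: 0 }
        }

        pub fn grant(&mut self) -> Handle {
            let handle = Handle { id: self.next_id };
            self.next_id += 1;
            handle
        }
    }
}

/// Integer descriptors interpreted as indices into a table of references.
pub struct InstanceTable {
    slots: Vec<Option<runtime::Handle>>,
}

impl InstanceTable {
    pub fn new() -> Self {
        InstanceTable { slots: Vec::new() }
    }

    /// Store a handle and return the integer that names it in this table.
    pub fn install(&mut self, handle: runtime::Handle) -> u32 {
        self.slots.push(Some(handle));
        (self.slots.len() - 1) as u32
    }

    /// An integer is meaningful only relative to one particular table.
    pub fn resolve(&self, fd: u32) -> Option<&runtime::Handle> {
        self.slots.get(fd as usize)?.as_ref()
    }
}
```

Any code can write down the integer `0`, but it resolves to something only in a table where a handle was installed at that index. Two instances that share one table can therefore guess each other's integers, whereas a `Handle` value has to be passed explicitly.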

kentonv commented 5 years ago

In the future, reference types are coming to wasm. These are unforgeable values, in every sense of the word. One function can't even forge a reference held by another function in the same instance. This is the fine-grained capability model that some people are really excited about.

Whoa, that's exciting!

dumblob commented 5 years ago

In the future, reference types are coming to wasm. These are unforgeable values, in every sense of the word. One function can't even forge a reference held by another function in the same instance. This is the fine-grained capability model that some people are really excited about. And, instances can pass references to each other directly. This will likely become the core capability value primitive of WASI, with integer file descriptors being interpreted as indices into a table of references (though actual implementations may do other things internally).

This sounds quite solid and finally like a "full-featured" capability system (with more fine-grained possibilities), thanks.

rossberg commented 5 years ago

There are fundamental differences, many of which derive from the ES6 module system that wasm inherits from JS.

Just for the record, Wasm modules were not derived from ES6 modules, although there are natural similarities. There are substantial differences, too. Particularly relevant to this discussion is that Wasm modularity is much stronger, by the virtue of every module being completely closed by construction, i.e., not having any reference to an ambient outer scope or library; there are only imports, which are easily controlled.

sunfishcode commented 5 years ago

@rossberg Call it "designed from the outset to be compatible with" instead of "derived from" then :-).

The original questions here seem answered. Certainly there's more to talk about on the topics of capabilities and networking, but these can proceed in #20 and elsewhere. Feel free to file new issues to raise new topics and questions!