Multi-root capability systems vs. The Infinite Cap and CSR initialization

nwf commented 2 weeks ago

Problem Statement

At present, the Zcheripurecap draft presumes the existence of a single all-permissive Infinite cap (§2.3.2). This is incompatible with multi-root capability encodings (CHERIoT is once again the motivating example). This could have just been a nuisance, as in many cases the specification is careful to test relevant permissions or phrase certain comparisons as "ha[ving] infinite bounds" or similar, but then...

The spec requires that the following CSRs can initialize to this Infinite capability: pcc (§3.2.1), mtvecc (§3.7.3), mepcc (§3.7.7), stvecc (§3.8.2, if S), sepcc (§3.8.6, if S), dinfc (§3.14.9, if Sdext), ddc (§5.8.7, if Zcherihybrid), vstvecc (§6.7, if H), vsepcc (§6.11, if H).
In Zcherihybrid and with Sdext, ddc becomes Infinite on entry to debug mode, with the new CSR dddc holding the old value (§5.6).

There is some provision for multiple roots in an informative note in §3.14.9:

A future version of this specification may add writeable fields to allow creation of other capabilities, if, for example, a future extension requires multiple formats for the Infinite capability.

However, I think some more broadly-scoped verbiage is appropriate, and in particular, there should be provision for other initialization behaviors. A hypothetical Zcheriot would like something slightly tweaked from CHERIoT's initialization, say...

pcc initializes to the executable memory root (with cursor address set to the boot offset; the Infinite-like form, with cursor = base, can be obtained with an AUIPC 0; GCBASE; SCADDR sequence).
mtvecc initializes to the read-write memory root
mepcc initializes to the sealing type root

Straw proposal

I think I'd like to have §2.3.2 define Root Capabilities and The Infinite Capability, when it exists, something like this:

Capabilities form a partial order, defined by subset inclusion on authorized actions. A root capability is maximal in this partial order; a root capability is not an attenuated form of another capability. In general, capability encodings (recall [the note before 2.1; that probably wants a name and link target]) may have several root capabilities, but many, including the encoding of §2.1, have exactly one, which we call the Infinite capability.
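As a rough illustration of the proposed wording, the sketch below treats a capability as nothing more than a set of permission names and finds the maximal elements of the subset order. This is a deliberately simplified model: bounds, sealing state, and encoding constraints are ignored, and the permission names and the three-root split are assumptions loosely modelled on the CHERIoT arrangement discussed in this issue, not taken from either specification.

```python
from itertools import combinations

def roots(representable):
    """Return the maximal elements: capabilities that are not a strict
    attenuation (strict subset) of any other representable capability."""
    caps = set(representable)
    return {c for c in caps if not any(c < other for other in caps)}

def downward_closure(tops):
    """Every attenuated form (subset) of the given top capabilities."""
    out = set()
    for top in tops:
        items = sorted(top)
        for n in range(len(items) + 1):
            out.update(frozenset(s) for s in combinations(items, n))
    return out

# Hypothetical single-root encoding: one all-permissive capability.
INFINITE = frozenset({"R", "W", "X", "C", "ASR", "SEAL"})
single = downward_closure([INFINITE])

# Hypothetical multi-root encoding: three mutually incomparable roots.
EXEC_ROOT = frozenset({"R", "X", "C", "ASR"})
MEM_ROOT = frozenset({"R", "W", "C"})
SEAL_ROOT = frozenset({"SEAL"})
multi = downward_closure([EXEC_ROOT, MEM_ROOT, SEAL_ROOT])

print(len(roots(single)))  # one root: the Infinite capability
print(len(roots(multi)))   # several roots, and no Infinite capability
```

In the multi-root model the all-permissive set is simply not representable, which is why text that presumes "the Infinite capability" cannot be satisfied there.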

We can then require CSR initialization to root capabilities:

In §3.2.1,

The pcc's metadata and tag are reset to a root capability bearing at least Execute (X), Access System Registers (ASR), Read (R), and Capability (C) permissions.
For mtvecc (§3.7.3) and mepcc (§3.7.7), at least, and possibly all the rest,

This CSR's reset value is a root capability bearing platform- and encoding-specific permissions. (For systems for which there is a unique Infinite capability, that is the reset value.)

davidchisnall commented 2 weeks ago

Some background:

CHERIoT made a choice to have separate W and X roots because any code that requires both needs porting to handle the new ISA and so adding a requirement for two pointers (one to write, one to execute) is fine.

CHERIv8 did not make this choice because POSIX and Win32 both have APIs (mmap and equivalents) that return a single pointer that can (with the right flags) be used for both write and execute. Note that this is true even on platforms such as OpenBSD with W^X guarantees: the pointer returned from mmap is not changed, the permissions of the mapping are changed with mprotect, and then the pointer changes from permitting write to permitting execute.

I am personally of the opinion that we should fix POSIX for CHERI (as Brooks proposes), any code actually doing this kind of thing needs porting and most JITs already hold multiple pointers so that they can have a writeable mapping and an executable mapping in different places. It’s one bit of additional friction for adoption though, so it’s high risk.

Note that none of this affects hybrid mode. In hybrid mode, stores are indirected via DDC and jumps via PCC, so a read-write DDC and a read-execute PCC provides the same semantics as non-CHERI systems.

CHERIoT also makes the sealing root distinct. We wanted to do this for Morello, but didn’t in the end because Morello is a superset architecture. It’s easy for a first-stage bootloader to separate out a sealing root for experimentation in a single-rooted hierarchy; it’s not possible to combine the two in a multi-rooted hierarchy.

When we started designing the CHERIoT permission compression system, we created a table of all combinations and marked them as essential, desirable, unlikely to be useful, and actively harmful. The actively harmful list included things like permit-seal and permit-store in the same capability. These are harmful because the permissions refer to distinct namespaces and so a single capability that holds both is a type confusion exploit waiting to happen.

I would strongly encourage any system that supports sealing to also maintain this distinction. It permits some compression in the permission encoding (none of the memory permissions are required on the sealing root), though this is not required. Even if the encoding is orthogonal, maintaining the ‘no permissions authorising actions on disjoint namespaces’ rule is probably a good idea.
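The "no permissions authorising actions on disjoint namespaces" rule can be sketched as a simple check. The following is a minimal illustration, not the CHERIoT permission encoding: the permission names and their grouping into a memory-address namespace and an object-type namespace are assumptions made for the example.

```python
# Assumed grouping: which namespace each permission's actions refer to.
NAMESPACE = {
    "LOAD": "memory",
    "STORE": "memory",
    "EXECUTE": "memory",
    "SEAL": "otype",
    "UNSEAL": "otype",
}

def namespaces(perms):
    """The set of namespaces that a permission set acts on."""
    return {NAMESPACE[p] for p in perms}

def mixes_namespaces(perms) -> bool:
    """True if one capability would authorise actions in more than one
    namespace, e.g. permit-seal together with permit-store."""
    return len(namespaces(perms)) > 1

print(mixes_namespaces({"SEAL", "STORE"}))   # the "actively harmful" case
print(mixes_namespaces({"LOAD", "STORE"}))   # memory only
print(mixes_namespaces({"SEAL", "UNSEAL"}))  # object types only
```

A check like this could sit in an encoding's test suite regardless of whether the permission bits are compressed or orthogonal.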

I would suggest that most of these SCRs should be initialised to null or to a capability with a disjoint root.

In addition to execute, write, and seal, I would eventually like to add an SCR / CSR permission, which generalises ASR and allows delegating permission to access specific system registers. There is a RISC-V proposal to allow indirect access to CSRs and this could be generalised to take a capability operand. If PCC has ASR, you can access everything (necessary for bootstrapping) but without ASR you have a mechanism for authorising individual CSRs / SCRs.
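A toy model of that access rule, purely as a sketch of the idea: no such permission or instruction exists in the current draft, and the function name, the `authorised` set (standing in for whatever a capability operand would authorise), and the CSR names used in the example calls are all hypothetical.

```python
def may_access_csr(csr: str,
                   pcc_perms: frozenset,
                   authorised: frozenset = frozenset()) -> bool:
    """Decide whether an access to `csr` is permitted.

    pcc_perms:  permissions held by the current PCC.
    authorised: CSR names that a (hypothetical) capability operand
                delegates access to; empty if no operand is supplied.
    """
    if "ASR" in pcc_perms:
        # ASR on PCC authorises everything, as needed for bootstrapping.
        return True
    # Without ASR, only individually delegated registers are reachable.
    return csr in authorised

boot_pcc = frozenset({"X", "R", "ASR"})
user_pcc = frozenset({"X", "R"})

print(may_access_csr("mtvecc", boot_pcc))
print(may_access_csr("mtvecc", user_pcc))
print(may_access_csr("mscratchc", user_pcc, frozenset({"mscratchc"})))
```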

nwf commented 2 weeks ago

@rmn30 points out, over on https://github.com/CHERIoT-Platform/cheriot-sail/issues/21, that CHERIoT requires X permission on whatever ends up in its equivalents of mtvecc and mepcc, so I think I'd like to amend my proposal, keeping the pcc text, but then...

make mtvecc and mepcc's initialization to root caps optional (that is: if tagged, a root cap),
make mscratchc (§3.7.4) optionally initialize to a root cap, and
make mstdic (§9.2.5) optionally initialize to a root cap.
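One way to read the amended rule is as a check over a platform's reset state: pcc must be a tagged root with at least X, ASR, R, and C, while the other listed CSRs may be untagged but must be roots if tagged. The sketch below encodes that reading. The `Cap` type, the set of roots, and the permission names are assumptions for illustration; bounds and the address field are ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cap:
    tagged: bool
    perms: frozenset = frozenset()

# Hypothetical roots of a multi-root encoding.
EXEC_ROOT = frozenset({"X", "ASR", "R", "C"})
MEM_ROOT = frozenset({"R", "W", "C"})
SEAL_ROOT = frozenset({"SEAL"})
ROOTS = {EXEC_ROOT, MEM_ROOT, SEAL_ROOT}

PCC_REQUIRED = frozenset({"X", "ASR", "R", "C"})
OPTIONAL_ROOT_CSRS = ("mtvecc", "mepcc", "mscratchc", "mstdic")

def is_root(cap: Cap) -> bool:
    return cap.tagged and cap.perms in ROOTS

def check_reset(reset: dict) -> list:
    """Return a list of violations of the proposed reset rules."""
    problems = []
    pcc = reset["pcc"]
    if not (is_root(pcc) and PCC_REQUIRED <= pcc.perms):
        problems.append("pcc: must reset to a root with at least X, ASR, R, C")
    for name in OPTIONAL_ROOT_CSRS:
        cap = reset.get(name)
        # Untagged (or absent) is allowed; a tagged value must be a root.
        if cap is not None and cap.tagged and not is_root(cap):
            problems.append(f"{name}: if tagged, must be a root capability")
    return problems

good = {
    "pcc": Cap(True, EXEC_ROOT),
    "mtvecc": Cap(True, EXEC_ROOT),
    "mepcc": Cap(False),
    "mscratchc": Cap(True, SEAL_ROOT),
}
bad = dict(good, mtvecc=Cap(True, frozenset({"R"})))

print(check_reset(good))
print(check_reset(bad))
```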

brooksdavis commented 1 week ago

I am personally of the opinion that we should fix POSIX for CHERI (as Brooks proposes), any code actually doing this kind of thing needs porting and most JITs already hold multiple pointers so that they can have a writeable mapping and an executable mapping in different places. It’s one bit of additional friction for adoption though, so it’s high risk.

While I have proposed a replacement for the mmap interface that would support W^X better (see https://people.freebsd.org/~brooks/talks/bsdcan2018-mmap/is-it-time-to-replace-mmap.pdf), I've since been able to overcome most of the issues we had with mmap except for W^X at the pointer level. Even Arm's "always return RWX caps" solution works well enough in practice.

I do suspect we could come up with a solution for JITs that is both sound and usable, but we'd need to have a solid collection of examples to convince ourselves that we aren't breaking any "common" use cases. (I'm currently pondering some sort of int mexchange(void **, int newprot, int flags); which might use the revocation machinery to ensure that pointers with the old value are replaced.) Unfortunately, there's quite a lot of code that assumes you can make RWX capabilities, so there's probably at least a master's project's worth of work to demonstrate viability.

riscv / riscv-cheri

Multi-root capability systems vs. The Infinite Cap and CSR initialization #391
