Libraries accessing Read/Write data through caps stored within PCC

vmurali commented 9 months ago

From my understanding, libraries work as follows: if one compartment calls a library function, it can rest assured that the data it sent the library cannot go outside the caps present in the passed arguments. If a second compartment calls the same library function again, the library function cannot access the old data passed by the first compartment unless the second compartment also passes in the necessary caps. The library relies on not having a read/write cap to global memory to ensure the above.

But, one can have a cap with read/write permissions inside the PCC region of a library that can be loaded by the library. This violates the above invariant, and the library can pass confidential data obtained from the first caller to the second caller. Is this behavior expected? If so, how are library calls not subject to the same sanitization as compartments, via the switcher? What is the expected use case of libraries?

Why is MC implicitly given to caps with EX permission? Is there any use case where you want to read caps from the PCC region? I can understand reading read-only data from the PCC region.

nwf-msr commented 9 months ago

Thank you for so clearly engaging so deeply with the CHERIoT ISA. I'm sorry that we haven't yet written a detailed rationale like big-CHERI has in their ISA.

By way of concrete examples,

the "fast unsealer" library loads a capability bearing unseal permissions through PCC: https://github.com/microsoft/cheriot-rtos/blob/43861b3f4a7da7208ecb2824d39935555d832ba1/sdk/core/token_library/token_unseal.S#L54-L61
the UART-based debug library loads a read-write capability through PCC: https://github.com/microsoft/cheriot-rtos/blob/43861b3f4a7da7208ecb2824d39935555d832ba1/sdk/lib/debug/debug.cc#L22-L25

In general, you're right that we almost always want to treat library functions as acting on (and, in particular, mutating) only data reachable through (suitably authorizing) argument capabilities. Most of the libraries in the CHERIoT SDK behave that way. But, in tension with that, sometimes we also want what might be called benign violations of this principle. (This might sound similar to benign side-effects in nominally functional languages.)

As an aside, the lack of compartment switching to and from libraries cuts both ways. The debug library additionally is not concerned with leaking "secrets" (the MMIO capability to the UART) to its caller, because that would expose no meaningful additional affordances to the caller, and so it can be written in C++. OTOH, the fast unsealer, like the switcher, must (at the moment) be carefully written assembler to ensure that secrets do not leak despite the lack of sanitization.

I think there's a strong case to be made that we don't want to be able to find MC+W-bearing capabilities through PCC, but that's not something we can easily enforce architecturally. Specifically, we can't make PCC weak or we'll break the UART example, and we don't have anything to strip MC (while iovec-like structures of pointers to flat buffers are likely common, "casting" something into that form that wasn't already seems kind of odd to want to do).

But, as often the case for things too hard for architecture, we can punt to higher levels of abstraction: IIRC, the linker report will attest to all capabilities constructed within the PCC region, and so the build system policies can ensure that libraries are not being shady.

nwf-msr commented 9 months ago

Addendum: some of this potentially interacts with a sketched ISAv2/ABIv2 that splits .rodata out to its own capability (in parallel with PCC, CGP, and SP) and brings more powerful sentries (points-to-PCC or points-to-pair from big-CHERI, letting us atomically transition both PCC and, say, CGP). In that world, having EX imply not just !W but also !MC might be quite sensible.

vmurali commented 9 months ago

Thanks Nathaniel. Is there a public version of the draft of ISAv2? By points-to-pair, do you mean CInvoke?

I understand the reason for having caps such as sealing authorities inside PCC. And at that point, it's impossible to distinguish between those caps and caps with W permission. I am less sure about the rationale for debug as a library, but that's okay.

Feel free to close the issue.

nwf-msr commented 9 months ago

There's not even a private draft of an ISAv2 yet, sorry. It's just been some scattered discussions of things we might do differently in the infinite-time limit.

Points-to-pair and points-to-PCC sentries are discussed, let us not say "documented", in C.8 of CHERI ISAv9. In quick summary, tho'...

points-to-pair is what it sounds like, a sealed pointer to a pair of capabilities, one of which becomes PCC and one of which is loaded into a fixed register (sometimes called "IDC"). It's basically "CInvoke's operands, but in memory".
points-to-PCC is more subtle; rather than loading the capability that becomes IDC, the sentry's unsealed form is written back to IDC while PCC is loaded through it. The intent is that one can lay out (probably constant) data and capabilities before a table of function pointers and then hand out sealed capabilities pointing at those function pointers.

Morello provides implementations of these and calls them, respectively,

"load pair branch" (CAP_SEAL_TYPE_LPB in the ASL, together with LDPBLR and LDPBR instructions) and
"load branch" (CAP_SEAL_TYPE_LB in the ASL, and the BLR (memory indirect) and BR (memory indirect) instructions in particular).

AFAIK, these forms are not yet implemented in any of the CHERI-RISC-V implementations. There has not yet been a huge push to explore them in software, sadly, AFAIK. (One more aside: despite being the originator of the "points-to" names, I am not wildly enamored of either that or the Morello nomenclature; if you come up with better, please advise.)

Thank you again for your detailed engagement with our project. :)

CHERIoT-Platform / cheriot-sail

Libraries accessing Read/Write data through caps stored within PCC #39