Open ltratt opened 2 years ago
Our general belief is that compartments cannot safely be given raw access to system calls. Signal configuration is a shared, process-wide resource, so it must be interposed, with the interposing code acting as a trusted intermediary that can therefore run with the full DDC. (On Morello it could also run in executive mode, but none of CheriBSD is currently designed to support the Morello-specific restricted mode: we make zero security claims, or even functionality ones, beyond it not being a side-channel between processes.) There is no way that I know of to do it safely and robustly otherwise.
Also, you need an alternate signal stack anyway if you want safety; otherwise compartments can poke at earlier stack frames to extract capabilities to other compartments.
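To make the interposition idea concrete, here is a minimal sketch in plain POSIX C. Everything in it is an assumption for illustration: `ctx_id`, `current_ctx`, `interposed_signal()` and `trampoline()` are hypothetical names, and an integer stands in for the real per-compartment state (DDC, stack), which this sketch does not touch. It only shows the shape: the kernel ever sees one trusted trampoline, which looks up who registered the handler and restores that context before calling it.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

#define MAX_SIGNO 64            /* assumed upper bound on signal numbers */

/* Hypothetical stand-in for per-compartment state (DDC, stack, ...). */
typedef int ctx_id;
static volatile sig_atomic_t current_ctx = 0;   /* 0 = trusted/default */
static volatile sig_atomic_t seen_ctx = -1;     /* written by the demo handler */

struct slot {
    void (*handler)(int);       /* handler supplied by a compartment */
    ctx_id owner;               /* context in force at registration time */
};
static struct slot slots[MAX_SIGNO];

/* Trusted trampoline: the only function registered with the kernel. */
static void trampoline(int signo)
{
    struct slot s = slots[signo];
    if (s.handler == NULL)
        return;
    ctx_id interrupted = current_ctx;
    current_ctx = s.owner;      /* real code would install the owner's DDC and stack */
    s.handler(signo);
    current_ctx = interrupted;  /* real code would restore the interrupted context */
}

/* Interposed registration: compartments call this instead of signal(). */
int interposed_signal(int signo, void (*handler)(int))
{
    if (signo <= 0 || signo >= MAX_SIGNO)
        return -1;
    slots[signo].handler = handler;
    slots[signo].owner = current_ctx;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = trampoline;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Demo handler: records which context it was invoked in. */
static void record_ctx(int signo)
{
    (void)signo;
    seen_ctx = current_ctx;
}
```

If a handler is registered while `current_ctx` is 7 and the signal later arrives while `current_ctx` is 0, `record_ctx` observes 7, and 0 is back in place once the trampoline returns. A real intermediary would also have to switch stacks, which is the point of the alternate signal stack remark above.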
Ah-ha, and it looks like we can use `CHERI_PERM_SYSCALL` to experiment with that in CheriBSD.
I think a distinct stack region is usually required for the same reason that it is required when switching compartments in general. `sigaltstack()` is a good way to get the kernel to go along with whatever stack policy the compartments rely upon. Here's a thought, though: you could use the same stack region but give a bounded stack pointer to enforce separation. It would require cleaning of memory on entry and exit but could act as a fallback for compatibility with code that doesn't use `sigaltstack()`. This could probably be used as a deployment path, at least.
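For readers less familiar with `sigaltstack()`, the following is a minimal sketch of the standard POSIX usage: designate a separate region as the signal stack, then register a handler with `SA_ONSTACK` so the kernel delivers the signal there. The names `install_alt_stack_handler()`, `on_signal()` and `ALT_STACK_SIZE` are illustrative assumptions, as is the 64 KiB size. It does not attempt the bounded-stack-pointer fallback, which needs CHERI-specific support.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ALT_STACK_SIZE (64 * 1024)   /* assumed size, comfortably above typical minimums */

static char *alt_base;                           /* start of the alternate stack */
static volatile sig_atomic_t ran_on_alt_stack;   /* set by the handler */

/* Handler: checks whether one of its locals lives inside the alternate stack. */
static void on_signal(int signo)
{
    (void)signo;
    char probe;
    uintptr_t p = (uintptr_t)&probe;
    uintptr_t lo = (uintptr_t)alt_base;
    ran_on_alt_stack = (p >= lo && p < lo + ALT_STACK_SIZE);
}

/* Allocate an alternate stack and register on_signal() to run on it. */
int install_alt_stack_handler(int signo)
{
    alt_base = malloc(ALT_STACK_SIZE);
    if (alt_base == NULL)
        return -1;

    stack_t ss;
    ss.ss_sp = alt_base;
    ss.ss_size = ALT_STACK_SIZE;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK;        /* ask the kernel to use the alternate stack */
    return sigaction(signo, &sa, NULL);
}
```

After `install_alt_stack_handler(SIGUSR1)` succeeds, raising `SIGUSR1` runs `on_signal()` on the separate region rather than on whatever stack the interrupted code was using.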
I’m not sure if you’ve looked at our 2015 paper on in-process compartmentalisation, but it touches on some of these topics -- and our implementation at the time had rather more to say about signal stacks, etc.
Since then we’ve been primarily focused on co-process compartmentalisation, which doesn’t require detailing fine-grained compartmentalisation within processes. But we also have ongoing work on a run-time-linker-based model that is turning our attention back to that general investigation.
@jacobbramley
> Here's a thought, though: you could use the same stack region but give a bounded stack pointer to enforce separation.

I think this would end up functionally equivalent to the compartment manager/creator code calling `sigaltstack` when a compartment is created?
@rwatson
> But we also have ongoing work on a run-time-linker-based model that is turning our attention back to that general investigation.

Is there anything you can point us at? We're all ears :)
[This report was done in conjunction with @0152la and @jacobbramley]
In CheriBSD hybrid mode (presumably a variant of this can also happen in purecap mode, but I haven’t checked that), signal handlers can be used by a nefarious compartment to get access to a different DDC than it was registered with.
The following code shows the problem (much of this is boilerplate; `restrict_and_check()` is the main part of interest):
When run this prints out:
As this shows, the first invocation of the signal handler is executed with the restricted DDC, and the second invocation with the unrestricted DDC. In essence, the signal handler has allowed the restricted compartment access to the unrestricted compartment. Using more general terminology, a signal handler can be used to gain access to a different set of permissions to that in play when the handler was registered.
The “obvious” fix is that registering a signal handler with `signal()` should record the DDC at registration time and restore that before the kernel invokes the signal handler. However, each DDC compartment will have its own stack, and no ABI I know of allows a user-space DDC compartment to record the stack at the point that the DDC value is changed. Thus, switching the DDC cannot be guaranteed to restore the stack pointer to the correct place. Changing the ABI to record the stack pointer on a DDC-switch would be very difficult and is probably impractical (if nothing else, how would the user atomically change the DDC and record the stack pointer?).
One approach is to only deliver signals to a thread if its current DDC is the same as when the signal was registered. I’m not keen on this: signals may end up never being delivered, which will cause a debugging nightmare.
Fortunately I think we can make use of the existing `sigaltstack()` call, which allows a process to designate a given portion of memory as being the stack for signal calls. As well as recording the DDC at the point that a signal handler is registered, `signal()` should abort (probably returning `SIG_ERR` and setting the seemingly generic `EINVAL` in `errno`?) if an alternative signal stack has not been registered. Clearly this is not fully compatible with existing code, which rarely calls `sigaltstack()`. One could safely loosen the restriction so that if the DDC at the point `signal()` is called is the “default” DDC, no alternative signal stack needs to have been recorded.
[Although I don’t think the following is directly related, in the sense that there’s nothing OS libraries or the kernel can do about it, it’s worth noting. Signal handlers are not deleted when a DDC compartment is removed, so a stale handler could still be used when a new DDC compartment happens to overlap in virtual memory with an old DDC compartment. A “good” DDC compartment manager thus will, in general, need to delete signal handlers that reside within a given DDC compartment when that compartment is deleted.]
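The registration rule and the clean-up note above can be sketched in user space with plain POSIX calls. This is only an illustration under stated assumptions: `compartment_id`, `checked_signal()`, `compartment_destroyed()` and `DEFAULT_COMPARTMENT` are hypothetical names, an integer stands in for the DDC, and nothing here records or restores a real DDC. The sketch refuses registration with `SIG_ERR`/`EINVAL` when a non-default compartment has no alternate signal stack, and resets a compartment's handlers to `SIG_DFL` when it is destroyed.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SIGNO 64               /* assumed upper bound on signal numbers */
#define DEFAULT_COMPARTMENT 0      /* stand-in for the "default" DDC */

typedef int compartment_id;        /* hypothetical compartment identifier */
typedef void (*handler_fn)(int);

static int registered[MAX_SIGNO];          /* 1 if we installed a handler */
static compartment_id owner[MAX_SIGNO];    /* who installed it */

/* Query (without changing) whether an alternate signal stack is enabled. */
static int have_alt_stack(void)
{
    stack_t cur;
    if (sigaltstack(NULL, &cur) != 0)
        return 0;
    return (cur.ss_flags & SS_DISABLE) == 0;
}

/* signal()-like wrapper: returns the previous handler, or SIG_ERR on failure. */
handler_fn checked_signal(compartment_id who, int signo, handler_fn handler)
{
    int alt = have_alt_stack();
    if (signo <= 0 || signo >= MAX_SIGNO ||
        (who != DEFAULT_COMPARTMENT && !alt)) {
        errno = EINVAL;
        return SIG_ERR;
    }

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = alt ? SA_ONSTACK : 0;
    if (sigaction(signo, &sa, &old) != 0)
        return SIG_ERR;

    registered[signo] = 1;
    owner[signo] = who;
    return old.sa_handler;
}

/* Reset every handler owned by a compartment; returns how many were reset. */
int compartment_destroyed(compartment_id who)
{
    int reset = 0;
    for (int signo = 1; signo < MAX_SIGNO; signo++) {
        if (registered[signo] && owner[signo] == who) {
            signal(signo, SIG_DFL);
            registered[signo] = 0;
            reset++;
        }
    }
    return reset;
}
```

A compartment manager built along these lines would call `compartment_destroyed()` as part of tearing a compartment down, so that no handler pointing into the old compartment's memory survives it.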