Open ltratt opened 3 years ago
There's a more general problem that our jemalloc isn't fully hardened against this kind of thing; you can free capabilities you shouldn't be able to. I think this is just one particular symptom of that known issue. We don't view jemalloc as being a particularly good allocator for CHERI so haven't invested time in hardening it to malicious users (whether deliberate or not).
See #342 for example; the only hardening we have is to check that it's tagged and offset zero.
The current jemalloc implementation does not make any serious attempt to address an adversary that steps outside the C language by using intrinsics to modify pointer permissions. The dlmalloc in the caprevoke branch does store the original capability, which allows it to require that the original capability is passed. I'm not sure if we actually do the check in realloc or only in free.
Hardening jemalloc is more complicated because there isn't an obvious place to store the original capability.
Stepping back a bit, I think there are three sensible responses to a realloc caller requesting non-monotonic behavior:
I’d tend to lean toward the latter two options because realloc failures are rarely handled properly by existing code and the semantics are a bit awkward (leaking ptr is sufficiently common that FreeBSD added reallocf). Abort/fault has the feel of a faulting ISA (e.g., CHERI-RISC-V), which is fine by me, but some people object strongly to the notion of library functions aborting. I find forcing copying semantics interesting as a pragmatic choice.
The tricky thing for all of these is figuring out when to trigger the behavior. For permissions (#2) it’s easy enough to imagine malloc knowing the set of permissions for a new allocation and checking the permissions of the passed capability. For length it’s complicated because there are three lengths involved:
One way to implement this is to store a copy of the original, malloced capability as our modified dlmalloc does. This isn't always easy and isn't space efficient, but allows any comparisons you might want. Another potential solution is the anti-tamper seals discussed briefly in the Experimental Features and Instructions section of CHERI ISAv8. The latter would almost certainly require compiler changes to avoid optimization related issues.
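The "store a copy of the original capability" approach can be sketched in portable C. This is only a model under stated assumptions: `model_cap`, `record_allocation` and `is_original` are hypothetical names invented for illustration, the capability is modelled as a plain struct rather than a hardware value, and nothing here reflects how the modified dlmalloc actually lays out its metadata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of a capability (not a real CHERI type). */
typedef struct {
    uintptr_t base;   /* lowest address the capability may access */
    size_t length;    /* number of bytes covered by the bounds */
    uint32_t perms;   /* permission bits */
    bool tag;         /* validity tag */
} model_cap;

#define TABLE_SLOTS 64

/* Side table holding a copy of every capability the allocator handed out. */
static model_cap original_caps[TABLE_SLOTS];
static bool slot_used[TABLE_SLOTS];

/* Record the capability returned by the (modelled) malloc. */
static bool record_allocation(model_cap cap) {
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        if (!slot_used[i]) {
            original_caps[i] = cap;
            slot_used[i] = true;
            return true;
        }
    }
    return false; /* table full */
}

/* True only if `cap` matches a recorded original field for field. */
static bool is_original(model_cap cap) {
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        if (slot_used[i] &&
            original_caps[i].base == cap.base &&
            original_caps[i].length == cap.length &&
            original_caps[i].perms == cap.perms &&
            original_caps[i].tag == cap.tag) {
            return true;
        }
    }
    return false;
}
```

A modelled realloc or free would call `is_original` first and abort (or return NULL) on a mismatch. The table costs one stored capability per live allocation, which is the space overhead mentioned above.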
Let me state my bias up-front: I am a fan of "abort on error" as a general principle so option 2 works for me!
I think the copying semantics you describe opens up a really fun attack. Imagine something like:
void pass_to_attacker(void *arr2);

void f() {
    void *arr = malloc(16);
    pass_to_attacker(cheri_bounds_set(arr, 8));
}

void pass_to_attacker(void *arr2) {
    arr2 = realloc(arr2, <something big>);
    void *arr = malloc(16);
}
It's quite possible (in many mallocs, even "likely") that the malloc in pass_to_attacker will then use the same base pointer as in the original arr malloc. In other words, pass_to_attacker has got a decent chance of getting a capability pointing to the same 16 bytes of memory as f, despite being passed a capability with bounds of 8.
As a side note, the example above also works if the attacker can simply free(arr2). So that suggests that free probably has to do the same checks as realloc.
Yeah, I agree, I don't think you can do anything sensible other than abort or return NULL, and the latter is likely to just crash anyway in a more obscure manner. If you do the copying approach then the old pointer is now freed so will eventually get revoked and the real owner of the allocation will get a surprise that it (legitimately) is not prepared to handle (since it never gave out the full capability and thus should be able to assume its allocation isn't going away unless it says so). That or you don't have revocation and then you have a nice use-after-free attack you can conduct.
I think abort on error is sensible, not least because we don’t know what idiomatic permission-aware CHERI C code looks like or wants to do yet. The check I suspect we want is a run-time assertion that the permissions (possibly other things) on a pointer returned by realloc() never exceed those of the pointer passed in. If people writing permission-aware CHERI C code start finding this to be a problem, then we can decide if we want other semantics -- or perhaps a new API with those other semantics rather than overloading them onto realloc(). It strikes me that we need to describe more clearly, in a document, not just the spatial safety properties of realloc() but also the temporal ones.
On the topic of “other things”, most likely realloc() and free() should both reject capabilities that are not “ordinary” data capabilities -- i.e., they should assert the tag, assert unsealed, and perhaps also assert that there are no surprising permissions that malloc() itself would never return to a caller (VMMAN?).
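The "ordinary data capability" checks listed above (tag set, unsealed, no permissions beyond what malloc would hand out), plus the offset-zero check mentioned earlier in the thread, might look like the following. This is a portable model, not CHERI C: `model_cap`, the `MODEL_PERM_*` bits and `is_ordinary_data_cap` are hypothetical names, and the permission set a real malloc returns is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits; real CHERI permission encodings differ. */
#define MODEL_PERM_LOAD      0x01u
#define MODEL_PERM_STORE     0x02u
#define MODEL_PERM_LOAD_CAP  0x04u
#define MODEL_PERM_STORE_CAP 0x08u
#define MODEL_PERM_VMEM      0x10u /* stands in for a "surprising" permission */

/* Assumed set of permissions that the modelled malloc ever returns. */
#define MODEL_MALLOC_PERMS \
    (MODEL_PERM_LOAD | MODEL_PERM_STORE | MODEL_PERM_LOAD_CAP | MODEL_PERM_STORE_CAP)

/* Hypothetical software model of a capability (not a real CHERI type). */
typedef struct {
    uintptr_t base;
    size_t length;
    size_t offset;  /* cursor position relative to base */
    uint32_t perms;
    bool tag;
    bool sealed;
} model_cap;

/* Accept only capabilities that look like something malloc could have issued. */
static bool is_ordinary_data_cap(model_cap cap) {
    if (!cap.tag) return false;        /* must be a valid capability */
    if (cap.sealed) return false;      /* sealed capabilities are never heap pointers */
    if (cap.offset != 0) return false; /* must point at the start of its bounds */
    if ((cap.perms & ~MODEL_MALLOC_PERMS) != 0) return false; /* no extra permissions */
    return true;
}
```

Note that this predicate deliberately accepts capabilities with fewer permissions than malloc issued; whether those should be accepted is exactly the policy question under discussion, so it would need to be combined with a comparison against the original.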
(And when it comes to “specifying” (APIs) rather than “implementing” (code), I am not 100% certain which of the things I’ve described above should be specified behaviour vs. implementation choices. I suspect much should be specified, as we think this is part of the set of security properties we need, rather than simply robustness of the implementation against its own bugs.)
I think it might be useful to distinguish "equal" from "compatible" capabilities in this regard (my terms! I don't know if the capability literature defines similar/different terms).
I think we probably all agree that one way of solving the problem is to say that realloc and free only succeed if the user passes in a capability that is precisely equal to the last capability returned by malloc or realloc. What I think @rwatson is alluding to is whether one can weaken this to allowing in a capability compatible with the capability most recently returned by malloc or realloc.
The challenge then becomes defining what "compatible" means. One reason I'm a little nervous about it is because future CHERI implementations could define extra/different permissions that mean it's impossible to be robust against future changes -- those sort of problems keep me awake at night (but I am a light sleeper!). That makes me think that it might be best to start with "capabilities must be equal", see what breaks, and if necessary fall back to "capabilities can be compatible"?
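To make the "equal" versus "compatible" distinction concrete, here is one possible pair of predicates over a modelled capability. The definition of "compatible" used here (same base, bounds no larger, permissions a subset) is only an assumption for illustration, since the thread does not settle on one; `model_cap`, `caps_equal` and `caps_compatible` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of a capability (not a real CHERI type). */
typedef struct {
    uintptr_t base;
    size_t length;
    uint32_t perms;
    bool tag;
} model_cap;

/* "Equal": every field matches the capability the allocator returned. */
static bool caps_equal(model_cap original, model_cap passed) {
    return original.tag && passed.tag &&
           original.base == passed.base &&
           original.length == passed.length &&
           original.perms == passed.perms;
}

/*
 * "Compatible" (one assumed definition): the passed capability starts at the
 * same base and is no more powerful than the original in bounds or permissions.
 */
static bool caps_compatible(model_cap original, model_cap passed) {
    return original.tag && passed.tag &&
           original.base == passed.base &&
           passed.length <= original.length &&
           (passed.perms & ~original.perms) == 0;
}
```

Under this definition every equal capability is also compatible, but a narrowed one is compatible without being equal. Accepting merely compatible capabilities only helps if realloc also returns nothing stronger than what was passed in, and the worry about future permission bits applies directly to the subset test.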
Notably, for Morello, we probably want to ignore the "Flags" field (bits 63:56) in that comparison.
I disagree, not least because those are part of the value you see when doing pointer comparison or casting to an integer and so things already will break today if you pass something with different flags because that is not the same pointer, even if it describes the same address range etc.
C says the pointer has to "match" one returned by malloc/calloc/realloc. Different flags definitely don't match in my book, and I interpret match in the context of CHERI as "must be the same pointer as" (which, even for non-CHERI C, doesn't mean "compares equal to").
That is, I view Morello's flags field as analogous to the top byte tag bits in AArch64, which is specified as:
When tagged addressing is enabled, a tag is part of a pointer’s value for the purposes of pointer arithmetic. The result of subtracting or comparing two pointers with different tags is unspecified.
Oh, good point, and that behaviour is quite important in the presence of MTE. It probably does need to be highlighted if we define what we need by "compatible" somewhere because the flags are otherwise ignored for Morello.
Also on ‘equal’ vs ‘compatible’. Sub-object bounds complicate this because it becomes a reasonably likely event that bounds passed to free() are not those originally allocated. Sub-object bounds complicate quite a few things, but they offer some real security advantage -- e.g., when taking a pointer to an array within a structure, which is then overflowed -- so it would be nice to find semantics for the various APIs that tolerate them, to the greatest extent possible.
On Thu, 19 Aug 2021 at 10:50, Robert N. M. Watson wrote:
Also on ‘equal’ vs ‘compatible’. Sub-object bounds complicate this because it becomes a reasonably likely event that bounds passed to free() are not those originally allocated.
really - why?
Regarding permissions, there are use-cases where a caller might want to strip permissions, discarding the original so that undesirable accesses are provably impossible (without outside help at least). For example:
- Allocate, write some data, then discard STORE permissions entirely to lock the data.
- Allocate, but drop STORE_CAP (etc) to guarantee that the memory can never be used to extend a compartment when used as a communications channel.
I favour the strict approach (either aborting or returning NULL) but with a strict implementation of malloc/realloc/free, use-cases like those above become difficult to implement, and impose significant complexity on user code. I think there's room for a new, CHERI-specific allocation API here, though it's hard, at this stage, to consider everything that it might need to support. I at least foresee a use for something akin to mprotect, so that free can accept a capability with fewer permissions than malloc originally provided.
On Thu, 19 Aug 2021 at 10:50, Robert N. M. Watson wrote:
Also on ‘equal’ vs ‘compatible’. Sub-object bounds complicate this because it becomes a reasonably likely event that bounds passed to free() are not those originally allocated.
really - why? …
I think Robert's point is that, without subobject bounds enabled in your compiler, the only way (other than pointer arithmetic, or UB things like copying to an unaligned address and clearing the tag) to get a capability that is derived from the allocation, points to the first byte of the allocation, but does not compare exactly equal to the capability handed out by the allocator is to use various intrinsics. However, if subobject bounds are enabled, then the compiler implicitly inserts various CSetBounds instructions for you just because you perform address-of or array decay operations, and so programs that are a bit lax with their handling of pointers can break. For example, struct foo { struct hdr h; int x; } ... bar(&foo->h); ... free(p); works on existing architectures, and in CHERI C with subobject bounds not generated for that address-of, but would not work if the capability were bounded to sizeof(struct hdr).
Regarding permissions, there are use-cases where a caller might want to strip permissions, discarding the original so that undesirable accesses are provably impossible (without outside help at least). For example:
- Allocate, write some data, then discard STORE permissions entirely to lock the data.
- Allocate, but drop STORE_CAP (etc) to guarantee that the memory can never be used to extend a compartment when used as a communications channel.
I favour the strict approach (either aborting or returning NULL) but with a strict implementation of malloc/realloc/free, use-cases like those above become difficult to implement, and impose significant complexity on user code. I think there's room for a new, CHERI-specific allocation API here, though it's hard, at this stage, to consider everything that it might need to support. I at least foresee a use for something akin to mprotect, so that free can accept a capability with fewer permissions than malloc originally provided.
You'd need a way to revoke the store-bearing capabilities as they may well have been spilled to the stack behind your back and a read of uninitialised stack memory could alias one. Maybe stack temporal memory safety would be sufficient to ensure you can never read those, but I wouldn't like to say there isn't another way to leak and later be able to retrieve the original capability within the same trust boundary.
One idea I had for an alternative API would be an allocator where you get both a capability and a token that you can use to free it (today I'd probably just seal the capability with a type allocated to the allocator), so you'd get an API like:
void *allocate(size_t, allocation_handle_t *);
void release(allocation_handle_t);
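A portable sketch of how such a handle-based API could behave is given below. It is a model under loud assumptions: on CHERI the handle would be a sealed capability that cannot be forged, whereas here `allocation_handle_t` is a hypothetical slot index plus nonce, which plain C code could fabricate. The `release` function also returns `bool` (unlike the `void` signature above) purely so that rejection is observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical handle: stands in for a capability sealed by the allocator. */
typedef struct {
    size_t slot;
    uint64_t nonce;
} allocation_handle_t;

#define MAX_LIVE 32

static void *live_ptr[MAX_LIVE];       /* memory owned by each live handle */
static uint64_t live_nonce[MAX_LIVE];  /* nonce issued with each live handle */
static uint64_t next_nonce = 1;

/* Allocate memory and hand back a separate token that authorises freeing it. */
void *allocate(size_t size, allocation_handle_t *handle) {
    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live_ptr[i] == NULL) {
            void *p = malloc(size);
            if (p == NULL) {
                return NULL;
            }
            live_ptr[i] = p;
            live_nonce[i] = next_nonce++;
            handle->slot = i;
            handle->nonce = live_nonce[i];
            return p;
        }
    }
    return NULL; /* no free slot */
}

/* Free only when presented with the handle issued for a live allocation. */
bool release(allocation_handle_t handle) {
    if (handle.slot >= MAX_LIVE) {
        return false;
    }
    if (live_ptr[handle.slot] == NULL || live_nonce[handle.slot] != handle.nonce) {
        return false; /* stale, reused, or never-issued handle */
    }
    free(live_ptr[handle.slot]);
    live_ptr[handle.slot] = NULL;
    return true;
}
```

The point of the split is that the data pointer can be narrowed, stripped of permissions, or passed to untrusted code without conferring the right to free; only the holder of the handle can release the allocation.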
I do see @jrtc27's point that it's hard to provide strong assurances that the higher privilege capability is really gone.
For fun, I've put together a running example that shows that free alone is enough to recover a capability. This is clearly more fragile than using realloc, but this example runs successfully for me on Morello and RISC-V CheriBSD:
#include <assert.h>
#include <cheriintrin.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#if !defined(__CHERI_PURE_CAPABILITY__)
# error This example must be run on a CHERI purecap system
#endif

// On CheriBSD, if a capability's bounds include the base pointer to a
// `malloc`d block we can use `free` to recover the original capability. This
// is inherently fragile, and relies on the underlying malloc reusing memory
// (which CheriBSD's jemalloc currently does).

int main() {
    // malloc returns a capability C1 to a block 0..n bytes long
    uint8_t *c1 = malloc(16);
    // Separate out the pointer from the capability so that we can check it
    // later.
    vaddr_t c1_addr = cheri_address_get(c1);

    // Create a capability C2 with bounds 0..m where m < n
    uint8_t *c2 = cheri_bounds_set(c1, 8);
    c1 = NULL; // Be clear that we've lost access to C1.
    assert(cheri_tag_get(c2) && cheri_length_get(c2) == 8);

    // We first free C2...
    free(c2);
    // ...and then immediately allocate a block the same size as C1.
    uint8_t *c3 = malloc(16);

    // We get back a capability C3 that is identical to C1.
    assert(cheri_tag_get(c3) && cheri_length_get(c3) == 16);
    assert(cheri_address_get(c3) == c1_addr);
}
FWIW, if you make the bounds line:
uint8_t *c2 = cheri_bounds_set(c1 + 8, 8);
you almost certainly corrupt the allocator state. I'm not convinced it's practical to defend jemalloc against this.
Yes, this attack certainly exploits jemalloc's approach to memory allocation.
One possibility would be to swap jemalloc for something like OpenBSD's malloc which goes out of its way to randomise things. It wouldn't make this attack theoretically impossible, but in practice you'd need the patience of a saint to make it work.
On the topic of permissions for free, @LawrenceEsswood's CheriOS occupied a particularly interesting point in the design space: free took an address, rather than a capability, on the grounds that full spatial and temporal safety meant that freeing everything in sight arbitrarily was, at worst, a DoS on the system. In order to provide a measure of availability in the face of such a stark position, CheriOS's malloc offered a reference-counted, attributed claim/release system which could be used to defer frees until all claims were released (or all claimants had exited). I don't know whether this necessarily sheds light on whether a more traditional malloc, without refcounts, should require capabilities for free or not, but it's an interesting extremal position to consider.
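The general idea of deferring a free until every claim has been released can be sketched with a plain reference count. This is a generic illustration and not CheriOS's actual interface: `claimed_block`, `block_init`, `block_claim` and `block_release` are hypothetical names, and the per-claimant attribution described above is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocation record carrying a count of outstanding claims. */
typedef struct {
    void *mem;
    unsigned claims;
} claimed_block;

/* Allocate the block; the allocating party holds the first claim. */
static bool block_init(claimed_block *b, size_t size) {
    b->mem = malloc(size);
    b->claims = (b->mem != NULL) ? 1 : 0;
    return b->mem != NULL;
}

/* Another party registers interest in the block staying alive. */
static void block_claim(claimed_block *b) {
    if (b->mem != NULL) {
        b->claims++;
    }
}

/* Drop one claim; returns true only when this call actually freed the memory. */
static bool block_release(claimed_block *b) {
    if (b->mem == NULL || b->claims == 0) {
        return false;
    }
    if (--b->claims > 0) {
        return false; /* other claimants remain, so the free is deferred */
    }
    free(b->mem);
    b->mem = NULL;
    return true;
}
```

With this shape, a party that never received the full capability can still keep its memory alive by holding a claim, which is the availability property the claim/release system is described as providing.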
On a differently experimental note, these kinds of amplification attacks have been in scope for Cornucopia and its successors. We generally defend against them by not re-issuing address space until the freed object has had all non-TCB capabilities to it (i.e., held outside the kernel and memory allocator) revoked. Our implementations to date are imperfect, but we believe they capture the majority of the costs that would be present in a hypothetically perfect implementation. You might want to take a look at https://github.com/CTSRD-CHERI/cheri-exercises/tree/master/src/exercises/pointer-revocation and its bigger brother https://github.com/CTSRD-CHERI/cheri-exercises/tree/master/src/missions/use-after-free-control-flow .
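The "do not re-issue address space until revocation has happened" rule can be modelled as a quarantine list. This sketch covers only the bookkeeping: the sweep that actually finds and invalidates stale capabilities is replaced by a stub, and `quarantine_free`, `may_reissue` and `revocation_sweep` are hypothetical names rather than anything taken from Cornucopia.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUARANTINE_SLOTS 16

/* Base addresses of freed blocks that are not yet safe to hand out again. */
static uintptr_t quarantine[QUARANTINE_SLOTS];
static size_t quarantined = 0;

/* free() places the block in quarantine instead of on the free list. */
static bool quarantine_free(uintptr_t addr) {
    if (quarantined == QUARANTINE_SLOTS) {
        return false; /* caller would need to trigger a sweep first */
    }
    quarantine[quarantined++] = addr;
    return true;
}

/* malloc() may only reuse an address that is not in quarantine. */
static bool may_reissue(uintptr_t addr) {
    for (size_t i = 0; i < quarantined; i++) {
        if (quarantine[i] == addr) {
            return false;
        }
    }
    return true;
}

/*
 * Stub for the revocation pass. A real implementation would invalidate every
 * outstanding capability to the quarantined blocks before this point; the
 * model only empties the list and reports how many blocks were drained.
 */
static size_t revocation_sweep(void) {
    size_t drained = quarantined;
    quarantined = 0;
    return drained;
}
```

Under this rule the free-then-malloc trick no longer yields a fresh capability aliasing memory that a stale capability can still reach, because the address is not reused until the stale capabilities have been revoked.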
Yes, this attack certainly exploits jemalloc's approach to memory allocation.
One possibility would be to swap jemalloc for something like OpenBSD's malloc which goes out of its way to randomise things. It wouldn't make this attack theoretically impossible, but in practice you'd need the patience of a saint to make it work.
Randomising is a hack, and not a solution (for example, CheriBSD turns ASLR off for pure-capability binaries, since our threat model doesn't regard addresses as needing to be secret), especially in a world where you have a single malloc shared by multiple compartments. Experience says that attackers eventually find ways to exfiltrate the information they need or influence the randomness. If you're going to use a different malloc, use one that's designed for CHERI; e.g. snmalloc is trying to be (re)written with CHERI in mind.
Probably just to state something known to everyone on the thread, but worth stating for the purposes of future readers: use-after-free is not currently within the threat model of the baseline CheriBSD jemalloc implementation, which provides spatial and not temporal safety. @brettferdosi may be able to comment on the applicability of his temporal safety wrapper, which uses Cornucopia and successors, and can wrap jemalloc, however?
If one has a capability that includes the base pointer returned by malloc, one can convince realloc to upgrade a less privileged capability. The first proof-of-concept is this very simple program which takes a capability with narrow bounds and "tricks" realloc to upgrade it to a capability with wider bounds:
We can strengthen this attack to turn a read-only capability into a read-write capability:
The root problem is that realloc doesn't fully validate the input capability. Exactly what "fully validate" means is perhaps an open question. My naive suggestion is that it might mean "realloc will only accept a capability exactly equal to that returned by the last malloc/realloc for this base pointer", but that might be too restrictive, and one might need a notion of "compatible capabilities" or something else entirely...