riscv / riscv-cheri

This repository contains the CHERI extension specification, adding hardware capabilities to RISC-V ISA to enable fine-grained memory protection and scalable compartmentalization.
https://jira.riscv.org/browse/RVG-148
Creative Commons Attribution 4.0 International
47 stars 26 forks source link

Differentiate "get capability value" and "get capability low half" #341

Open nwf-msr opened 1 month ago

nwf-msr commented 1 month ago

(On second thought, moving this out from #340)

There's a subtle assumption in most CHERI work to date that the address field of a capability is a full machine word and is one half of the CHERI capability. There are a few reasons that might not always be true.

See https://github.com/CTSRD-CHERI/cheri-specification/wiki/Tracking-discussion-for-CGetAddr-vs-CToPtr-vs-as-integer-alias-vs-CGetLow for (many) more but incomplete thoughts on untangling this knot. It might be nice if the base Zcheri specification took this possible division into account and reserved the relevant opcodes, even if most CHERI capability encodings will have identical behavior for some pairs.

jrtc27 commented 1 month ago

PAC and MTE, and all the various TBI things, expose the non-address portion in C as part of the address. Not doing so is unnecessary complication, and becomes problematic very quickly for uintptr_t code generation.

nwf-msr commented 1 month ago

But we can't permit CSetAddr to change the MTE tag, for example.

jrtc27 commented 1 month ago

uintptr_t needs to have full XLEN range for arithmetic, that's non-negotiable in the real world. That means the address needs to be the full XLEN bits.

davidchisnall commented 1 month ago

uintptr_t needs to have full XLEN range for arithmetic, that's non-negotiable in the real world. That means the address needs to be the full XLEN bits.

Not quite. uintptr_t needs to have a full XLEN range for arithmetic on untagged values. For tagged values, it does not have this range in practice because the capability bounds never permit it. The complexity in separating the get/set address and get/set low bits comes from the fact that address accessing needs to behave differently based on the tag bit, but that's a handful of gates.

jrtc27 commented 1 month ago

uintptr_t needs to have full XLEN range for arithmetic, that's non-negotiable in the real world. That means the address needs to be the full XLEN bits.

Not quite. uintptr_t needs to have a full XLEN range for arithmetic on untagged values. For tagged values, it does not have this range in practice because the capability bounds never permit it. The complexity in separating the get/set address and get/set low bits comes from the fact that address accessing needs to behave differently based on the tag bit, but that's a handful of gates.

I don't think a world where CGetAddr(CClearTag(foo)) != CGetAddr(foo) is a good idea. We've been down roads like that before and it's been porting pain.

davidchisnall commented 1 month ago

I don't think a world where CGetAddr(CClearTag(foo)) != CGetAddr(foo) is a good idea. We've been down roads like that before and it's been porting pain.

I'd like to see a concrete example of something that this would break. I've not encountered one yet, but I can believe it would exist.

jrtc27 commented 1 month ago

I don't think a world where CGetAddr(CClearTag(foo)) != CGetAddr(foo) is a good idea. We've been down roads like that before and it's been porting pain.

I'd like to see a concrete example of something that this would break. I've not encountered one yet, but I can believe it would exist.

T *old_p = p; /* or uintptr_t, as commonly done to silence use-after-realloc warnings, or ptraddr_t */
T *p = realloc(p, new_size)
/* update all the pointers in p based on their offset from old_p */ 

If revocation occurs during the update thanks to some other thread's free, the addresses seen for the pointers to the old allocation will change, because they'll no longer be tagged, giving incorrect offsets.

One can argue about what ISO C actually says about whether this is defined behaviour, but this is a real world pattern seen all over the place that would break if the address changes when the tag is cleared.

davidchisnall commented 1 month ago

I see. Do you have a reference to where you've seen that? I've heard people mention it but never actually seen it.

jrtc27 commented 1 month ago

I see. Do you have a reference to where you've seen that? I've heard people mention it but never actually seen it.

We found some in the desktop porting work that had the wrong provenance, so here are a few from that that come to mind. There are surely many others though, I remember seeing others, just not what they were in.

https://gitlab.freedesktop.org/xorg/xserver/-/blob/386b54fbe95711e6ecb5c23cfdbf25a1571acf7b/dix/privates.c#L203 https://gitlab.freedesktop.org/xorg/lib/libx11/-/blob/ed9fb5535efe1e5278654b6b3994a34337b4bf1a/src/xlibi18n/lcDB.c#L517 https://gitlab.freedesktop.org/xorg/lib/libx11/-/blob/ed9fb5535efe1e5278654b6b3994a34337b4bf1a/modules/im/ximcp/imLcPrs.c#L682

We're also often only aware of the ones that get the provenance wrong, the rest work fine and don't show up when porting.

davidchisnall commented 1 month ago

Thanks. I shouldn’t be surprised it’s xlib.