riscv / riscv-cheri

This repository contains the CHERI extension specification, adding hardware capabilities to the RISC-V ISA to enable fine-grained memory protection and scalable compartmentalization.
https://jira.riscv.org/browse/RVG-148
Creative Commons Attribution 4.0 International

For discussion: MSR's CHERI+MTE composition #340

Open nwf-msr opened 3 months ago

nwf-msr commented 3 months ago

As mentioned in the meeting yesterday, Microsoft has spent some time considering the composition of CHERI and MTE, specifically as part of our work on CHERI heap temporal safety. Our design is geared towards supporting Cornucopia Reloaded's sweeping revocation while offering, among other things,

  1. decreased rate of quarantine buildup (inversely proportional to the number of MTE tag values),
  2. stronger security (closing the UAF/UAR distinction, as we did in CHERIoT), and
  3. lower software complexity (permitting safe in-band allocator metadata, also as on CHERIoT).
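
One way to read the "inversely proportional" claim in item 1 is sketched below. It is a toy model under stated assumptions, not the design from the slides: it assumes a freed chunk is recolored and reused immediately while unused MTE tag values remain, and is quarantined for a revocation sweep only once its colors are exhausted. The function name and parameters are hypothetical.

```python
def quarantined_frees(total_frees: int, num_colors: int) -> int:
    """Count how many frees of one repeatedly reused chunk enter quarantine.

    Assumed model (not taken from the proposal text): each free bumps the
    chunk's MTE color so stale capabilities mismatch, and the chunk only
    waits for a revocation sweep when no fresh color is left.
    """
    color = 0
    quarantined = 0
    for _ in range(total_frees):
        if color + 1 < num_colors:
            color += 1          # fresh color available: reuse immediately
        else:
            quarantined += 1    # colors exhausted: chunk waits for a sweep
            color = 0           # after the sweep, colors can be reused
    return quarantined


# num_colors=1 models having no MTE colors at all: every free is quarantined.
baseline = quarantined_frees(1600, 1)
with_mte = quarantined_frees(1600, 16)
```

Under this model, 16 tag values send one free in sixteen to quarantine, so buildup scales as 1/N for N tag values.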

The key bits of this have been presented publicly before, but it'd be good to have it all here, too. We consider these proposed architectural semantics to be Capability Essential IP.

Please find attached some slideware (pptx with animations and extensive slide notes, or pdf without animations) with most of the details and (hopefully pretty) pictures. But, in quick summary:


[0] And, assuming we steal address bits for MTE tags, CSetAddr will need to clear the capability's validity tag if the new address would change the MTE tag bits. More general alternatives like #341 seem unlikely to be palatable.
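
The CSetAddr rule in footnote [0] can be modelled roughly as follows. This is a minimal sketch: the position and width of the MTE field (`MTE_SHIFT`, `MTE_BITS`) and the `Cap` type are assumptions made for illustration, and the usual bounds and representability checks are omitted.

```python
from dataclasses import dataclass, replace

# Assumed layout for illustration only: a 4-bit MTE color in address bits 59:56.
MTE_SHIFT = 56
MTE_BITS = 4
MTE_MASK = ((1 << MTE_BITS) - 1) << MTE_SHIFT


@dataclass(frozen=True)
class Cap:
    """Toy capability: an address plus the CHERI validity tag."""
    address: int
    valid: bool = True


def mte_color(address: int) -> int:
    """Extract the MTE color carried in the stolen address bits."""
    return (address & MTE_MASK) >> MTE_SHIFT


def c_set_addr(cap: Cap, new_address: int) -> Cap:
    """Set the address; clear the validity tag if the MTE color would change."""
    same_color = mte_color(cap.address) == mte_color(new_address)
    return replace(cap, address=new_address, valid=cap.valid and same_color)


cap = Cap(address=(3 << MTE_SHIFT) | 0x1000)
moved = c_set_addr(cap, (3 << MTE_SHIFT) | 0x2000)    # same color: stays valid
recolored = c_set_addr(cap, (4 << MTE_SHIFT) | 0x1000)  # color change: tag cleared
```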

davidchisnall commented 3 months ago

Mismatching stores "fizzle" rather than trap: the store instruction may commit prior to knowing the memory metadata tag value, and mismatch results in the store being silently dropped without altering memory.

I think that we proposed this as an optional extension. Ideally, a core would support both stores-fizzle and stores-trap modes and would expose a counter of the number of fizzled stores. Running in stores-trap mode would come with a noticeable performance penalty (possibly as high as 20%), so it would typically not be the default. But a fizzled store means that you have a store-after-free bug, which you probably want to fix at some point. Typically, you'd want to use stores-trap mode during debugging. Systems with good telemetry infrastructure would want to run stores-trap as a sampling mode after detecting some threshold number of fizzled stores (the threshold could be one). If an app has store-after-free bugs, the OS would turn on the trapping mode, log stack traces to telemetry (while still fizzling the stores, by having the trap handler skip the instruction prior to mret), and then resume.
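
A rough software model of that telemetry policy might look like the following. Everything here is hypothetical scaffolding (`Hart`, `os_tick`, the fixed 4-byte instruction length, logging only the pc in place of a stack trace); it also assumes the counter only advances in stores-fizzle mode, which the discussion does not specify.

```python
from dataclasses import dataclass, field

INSN_LEN = 4  # assumption: uncompressed store instructions only


@dataclass
class Hart:
    """Toy core with a stores-fizzle mode and a stores-trap mode."""
    threshold: int = 1            # fizzled stores tolerated before sampling
    trap_mode: bool = False
    fizzle_counter: int = 0       # models the proposed hardware counter
    telemetry: list = field(default_factory=list)
    memory: dict = field(default_factory=dict)

    def store(self, pc: int, addr: int, value: int, tags_match: bool) -> int:
        """Execute one store and return the pc to resume at."""
        if tags_match:
            self.memory[addr] = value
        elif self.trap_mode:
            # Trap handler: log, then skip the store before returning (mret),
            # so the store still has no effect on memory.
            self.telemetry.append(pc)
        else:
            # Fizzle: the store is silently dropped and only counted.
            self.fizzle_counter += 1
        return pc + INSN_LEN

    def os_tick(self) -> None:
        """OS policy: switch to trapping once enough stores have fizzled."""
        if self.fizzle_counter >= self.threshold:
            self.trap_mode = True


hart = Hart()
pc = hart.store(0x100, addr=8, value=1, tags_match=True)
pc = hart.store(pc, addr=8, value=2, tags_match=False)   # fizzles
hart.os_tick()                                           # threshold of one reached
pc = hart.store(pc, addr=8, value=3, tags_match=False)   # traps, is logged, skipped
```

In both modes the mismatching store leaves memory untouched; the only difference is whether the event is counted cheaply or logged with context.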