riscv / riscv-cheri

This repository contains the CHERI extension specification, adding hardware capabilities to RISC-V ISA to enable fine-grained memory protection and scalable compartmentalization.
https://jira.riscv.org/browse/RVG-148
Creative Commons Attribution 4.0 International

Proposal: Remove 32-bit support from the v1 Zcheri standards #294

Closed davidchisnall closed 3 days ago

davidchisnall commented 3 months ago

Since 2010, almost all of the research on CHERI has revolved around 64-bit architectures: CHERI MIPS, Morello, and CHERI RISC-V prototypes. Morello, as a superset architecture with a (relatively) high-performance implementation, provided the most useful feedback and has over 100 MLoC running on it in pure-capability mode. We are now in a position where we can quite confidently make some claims about the minimum necessary subset of CHERI for some useful properties on 64-bit systems.

RISC-V is not a research architecture

RISC-V is explicitly not a research ISA. It is an ISA that can enable research, but the official ISA specification is intended to embody only validated research outputs. 64-bit CHERI meets this requirement. I believe CHERIoT does too, but specifically in the context of embedded systems and not in the context of application cores, where the constraints are different.

The non-CHERIoT 32-bit CHERI software story is incomplete

64-bit CHERI has a mature FreeBSD port, at least three Linux ports of varying levels of maturity and different approaches, ports of FreeRTOS, and several clean-slate operating systems. CHERIoT has a clean-slate OS that was co-designed with the ISA and a (not production quality) port of FreeRTOS. For a 32-bit CHERI, there is no port of any *NIX system and there are a number of technical issues for creating one.

32-bit and 64-bit CHERI will be unlikely to coexist in a single processor

RV32 is a subset of RV64, in part, to allow RV64 cores to run RV32 binaries. This approach makes it easy for a single core to run 32- and 64-bit programs and even operating systems. It makes it easy for a 64-bit OS to provide a 32-bit compatibility layer to run 32-bit programs.

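A 64-bit CHERI core requires a tag bit per 128 bits of memory. A 32-bit CHERI core requires a tag bit per 64 bits of memory. A system that supports both has five states for a 128-bit word: no capability, a 64-bit capability in the low half only, a 64-bit capability in the high half only, two 64-bit capabilities, or one 128-bit capability.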

This does not fit in two tag bits and, because 5 and 2 are distinct primes, no power of 5 is exactly a power of two, so no packing of these states avoids waste. A 64-byte cache line holds four 128-bit words and so supports 5^4 = 625 states, so it is possible to imagine a memory subsystem with 10 bits of tag per cache line and some wasted space, but this is more than twice the space overhead of the 4 tag bits required for a pure 64-bit system and comes with a lot of complexity. Similarly, it is possible to imagine a system with a 2-bit tag, where the 11 state is either two 64-bit capabilities or one 128-bit one depending on a page-table permission, which would allow 32-bit and 64-bit CHERI processes to coexist (without sharing pages that contain pointers), but this would have software implications on a kernel and would need to be the subject of more research.
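The counting argument can be checked mechanically. The sketch below is illustrative only: the function names and the tuple used to represent a tag state are made up for this example, and the number of 128-bit words per cache line is a parameter, not a value taken from any particular implementation.

```python
from itertools import product


def word_states():
    """Enumerate the valid tag states of one 128-bit word in a mixed system.

    Each state is a tuple (cap128, lo64, hi64) saying whether the word holds
    one 128-bit capability, a 64-bit capability in its low half, and/or a
    64-bit capability in its high half. This representation is an assumption
    made for illustration.
    """
    states = []
    for cap128, lo64, hi64 in product((False, True), repeat=3):
        # A 128-bit capability occupies the whole word, so it cannot
        # coexist with a 64-bit capability in either half.
        if cap128 and (lo64 or hi64):
            continue
        states.append((cap128, lo64, hi64))
    return states


def tag_bits_for_line(words_per_line, states_per_word):
    """Minimum whole number of bits that can encode every combination of
    per-word states across one cache line."""
    combinations = states_per_word ** words_per_line
    return (combinations - 1).bit_length()


mixed = len(word_states())
for words in (4, 8):  # 64-byte and 128-byte lines of 128-bit words
    print(
        f"{words} words/line: mixed = {tag_bits_for_line(words, mixed)} bits, "
        f"pure 64-bit = {tag_bits_for_line(words, 2)} bits"
    )
```

With eight 128-bit words per line this gives 19 bits against 8 for a pure 64-bit system; with four words per line it gives 10 against 4. Either way the mixed encoding costs more than twice as much.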

As such, there is no reason to align 32- and 64-bit specifications strongly and this kind of interop is already an explicit non-goal of the Zcheri specification. There is no legacy 32-bit pure-capability software that we need to support on 64-bit systems.

32-bit requires a new page-table format to run *NIX

All of the existing *NIX work relies on being able to restrict which pages can store capabilities. For example, memory-mapped files may not be able to persist capabilities and so it would be very confusing if capabilities silently became untagged when a memory-mapped page left and reentered the buffer cache at unspecified points. Similarly, pages shared between disjoint address spaces cannot carry capabilities without letting one process cause the other to violate its internal memory-safety guarantees.

The Sv32 page table specification uses all of its bits in the base specification and so cannot express this.

It might (future research needed) be possible to use capability permissions for this, but only if the virtual memory subsystem had the ability to revoke capabilities when memory is remapped (see below).

32-bit requires a new page-table format for temporal safety with an MMU

Revocation on systems with MMUs requires at least two page-table bits to track the revoker state. The lack of free PTE bits in Sv32 makes this impossible. This means that there is no clear path to temporal safety on 32-bit systems with an MMU.
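To make the PTE-bit argument concrete, the sketch below tallies the fields of an Sv32 PTE and, for comparison, an Sv39 PTE, using the field widths from the RISC-V privileged specification. The dictionary and function names are illustrative, and the two-bit requirement is simply the revoker-state figure quoted above, not something the code derives.

```python
# Field widths in bits, following the RISC-V privileged specification.
FLAGS = {"D": 1, "A": 1, "G": 1, "U": 1, "X": 1, "W": 1, "R": 1, "V": 1}

# Sv32: 32-bit PTE. RSW is reserved for supervisor software and is ignored
# by hardware, so it is counted as allocated here.
SV32_PTE = {"PPN[1]": 12, "PPN[0]": 10, "RSW": 2, **FLAGS}

# Sv39: 64-bit PTE. Bits 63:54 are reserved in the base format; the
# Svnapot and Svpbmt extensions claim three of them when implemented.
SV39_PTE = {"Reserved": 10, "PPN[2]": 26, "PPN[1]": 9, "PPN[0]": 9,
            "RSW": 2, **FLAGS}

# Figure quoted in the discussion above (an input, not a derived value).
REVOKER_STATE_BITS = 2


def spare_bits(layout, pte_width):
    """Bits of the PTE not assigned to any named, non-reserved field."""
    allocated = sum(w for name, w in layout.items() if name != "Reserved")
    return pte_width - allocated


for name, layout, width in (("Sv32", SV32_PTE, 32), ("Sv39", SV39_PTE, 64)):
    spare = spare_bits(layout, width)
    enough = spare >= REVOKER_STATE_BITS
    print(f"{name}: {spare} spare bits, room for revoker state: {enough}")
```

Counting RSW as allocated is a choice made for this sketch: an OS may use those two bits for its own bookkeeping, but hardware ignores them, so they cannot carry state that the hardware must check.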

32-bit cores are specialised

Any ISA will, either explicitly or implicitly, favour implementations at different scales. Arm's LDM and STM instructions, for an extreme example, are easy to implement on simple in-order pipelines with no MMU but become incredibly painful in the general case on large superscalar pipelines, especially when they can fault in the middle.

The target for a 64-bit CHERI spec is likely to be application cores that, at least, support register renaming, will typically have long pipelines (5+, often 10+ stages), and will mostly be superscalar out-of-order implementations. A 32-bit specification that supports application cores needs to scale down to 2–3-stage in-order implementations and up to 10+ stage out-of-order implementations.

With CHERIoT, we intentionally reduced the scope:

This is fine for a microcontroller running an RTOS. It does not lead to the same set of choices that you would make for something that wanted to compete in the same space as a Cortex-A15 running Linux. For example:

These are all good design choices for something that aims to support microcontrollers and an RTOS. They are not the right choice for a superscalar pipeline running Linux. Aiming to provide a 32-bit specification that can do both will cause a lot of complexity (which directly translates to DV costs), without a clear benefit (since there is no mature software stack for the proposed 32-bit CHERI spec).

andresag01 commented 1 month ago

@davidchisnall @arichardson : We have had more discussions since this ticket was created. As far as I understand, the plan is to keep the 32-bit CHERI RISC-V support, but modify the current spec to allow extensions for alternative capability encodings, etc. (like CHERIoT).

Can we close this ticket?

tariqkurd-repo commented 3 weeks ago

@davidchisnall can we close this as we are aligning to CHERIoT?

andresag01 commented 3 days ago

There has been a lot of work in this area since the ticket was opened. The general idea is to maintain the 32-bit support in a fashion that easily allows extending CHERI RISC-V with CHERIoT features (including a new capability encoding).

There were other good points raised in this ticket, such as not having much software ported to CHERI+RV32+MMU and the lack of spare PTE bits in RV32. These will eventually need to be addressed since we are keeping RV32+CHERI support for now. Also, there has not been any activity in this ticket for some time, so I will close it. Please feel free to re-open it if you'd like to continue the discussion.