Open grayresearch opened 7 months ago
An M-mode only system, e.g. austere bare metal MCUs running fully trusted firmware, should support M-mode only CX multiplexing. So option #2 should be: "CX multiplexing is only available in the least privileged implemented mode:
Oops, this proposal is broken / needs more work. It's unfortunate that none of you out there caught the problem! :-)
Assume we have a CX-aware OS that, while in S-mode, needs to e.g. save a CX state context. The spec (§2.6) states that the OS uses the standard stateful CX custom instructions IStateContext::cf_read_status and ::cf_read_state to access the state context of the current CX.
But if we were to adopt mitigation #2 above ("Each instruction issued in S-mode always and only uses legacy custom instructions and CSRs.") then the OS's S-mode context save code could not issue IStateContext::cf_read_status and ::cf_read_state. (Exercise: what happens if it does so anyway?)
So, back to the drawing board. I think we have
Scenario:
Consider three options:
Commentary:
Option #1: Per-hart CX multiplexing is the current spec, and it is unacceptable. It should not (must not) be possible for U-mode code to change the behavior of custom instructions executed by e.g. S-mode code. It is not realistic to explicitly reselect legacy custom instructions via a CSR write at every entry point from U-mode to S-mode.
Option #2: Per-hart U-mode-only multiplexing closes this security hole. Each instruction issued in S-mode always and only uses legacy custom instructions and CSRs. This is simple and straightforward. By definition, however, it denies the S-mode OS kernel any use of composable extensions. That is unfortunate, because there are many kernel operations (e.g. data compression) that could make great use of CXs.
Option #3: Extend the complement of CX multiplexing CSRs so there are disjoint CX mux CSRs for each priv level on a hart. In the scenario, the S-mode code uses the hart's current S-mode CX selection. This option closes the security hole AND enables an operating system to also use CX, at the cost of additional complexity and hardware resources.
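To make the three behaviors concrete, here is a toy Python model of which CX a custom instruction would be routed to after a trap from U-mode to S-mode. It is only a sketch: the `Hart` class, the policy names, and the convention that selector value 0 means "legacy custom instructions" are all assumptions made for illustration, not CSR names or encodings from the spec.

```python
from enum import IntEnum

LEGACY = 0  # assumption: selector value 0 stands for "legacy custom instructions"


class Priv(IntEnum):
    U = 0
    S = 1
    M = 3


class Hart:
    """Toy hart that models only where custom instructions are routed."""

    def __init__(self, policy):
        # policy is one of "per_hart", "u_only", "per_priv" (hypothetical names)
        self.policy = policy
        self.priv = Priv.S
        self.shared_selector = LEGACY                # one selector per hart
        self.selectors = {p: LEGACY for p in Priv}   # one selector per priv level

    def select_cx(self, cx_id):
        """Model a write to the CX selector CSR from the current priv level."""
        if self.policy == "per_priv":
            self.selectors[self.priv] = cx_id
        else:
            self.shared_selector = cx_id

    def effective_cx(self):
        """Return the CX that a custom instruction issued right now would reach."""
        if self.policy == "per_hart":
            return self.shared_selector
        if self.policy == "u_only":
            return self.shared_selector if self.priv == Priv.U else LEGACY
        return self.selectors[self.priv]


def s_mode_view(policy, user_cx, kernel_cx):
    """Kernel selects a CX, user code selects another, then we trap to S-mode."""
    hart = Hart(policy)
    hart.priv = Priv.S
    hart.select_cx(kernel_cx)   # kernel picks its CX, e.g. at boot
    hart.priv = Priv.U
    hart.select_cx(user_cx)     # untrusted code picks its own CX
    hart.priv = Priv.S          # trap back into the kernel
    return hart.effective_cx()


if __name__ == "__main__":
    for policy in ("per_hart", "u_only", "per_priv"):
        print(policy, s_mode_view(policy, user_cx=7, kernel_cx=3))
```

In this model, option #1 leaves S-mode code running against the CX that U-mode selected (the security hole), option #2 always gives S-mode the legacy behavior (so a kernel context save could not reach the current CX), and option #3 gives S-mode its own selection regardless of what U-mode wrote.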
Perhaps there are other options. Until we have a better idea or a fully elaborated option #3 design, I am inclined to update the spec with option #2. Your comments welcome.