riscvarchive / riscv-code-size-reduction

https://jira.riscv.org/browse/RVG-122
150 stars 34 forks source link

Make the jump table readable not executable #195

Closed tariqkurd-repo closed 1 year ago

tariqkurd-repo commented 1 year ago

https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/349

To solve an ABI problem we need to change the permissions of the jump table so they are readable not executable.

Therefore the jump table permissions will be as if it accessed by a load instruction. This affects PMP checks and VM checks on the address.

My plan is to update the specification at the end of the Public Review period, as part of the feedback.

@aswaterman @kasanovic @jhauser-us tagging you to keep you informed.

Silabs-ArjanB commented 1 year ago

Note that the following part of the current Zc specification will also no longer be true:

Memory writes to the jump vector table require an instruction barrier (fence.i) to guarantee that they are visible to the instruction fetch.

Instead the CPU itself (not software) will have to guarantee that a store to the jump vector table will become visible to a possibly following table jump instruction. I don't think (but please correct me if I am wrong here) that we can require software to use regular fence instructions between a store to the jump vector table and table jump instructions.

aswaterman commented 1 year ago

Regardless of what we decide to do for the perms check, we still need to require the fence.i to enable straightforward incoherent caching of these table entries.

Silabs-ArjanB commented 1 year ago

@aswaterman Should I read the initial proposal as that the table fetch itself should be treated as load operation? If so, then I do not understand you remark about fence.i.

fence.i instructions provide synchronization between writes to instruction memory and instruction fetches, so I would think that if we treat the table fetch as a load operation, then fence.i instructions are not relevant. Because of the RVWMO model the hart should take care of proper execution order itself also not relying on regular fence instructions.

Could you please explain to me where my reasoning might be wrong? Also, if we are only talking about PMP permissions, then I think we might need to be more precise about the used language and spell out in detail how the relationship (if any) with RVWMO is.

aswaterman commented 1 year ago

I think this discussion conflates permissions checks with ordering checks. The proposed change is (or, at least, should be) to check perms as though the access were a read. That doesn’t inherently imply the access is ordered as a read.

Consider existing examples, like page-table walks: they’re perms-checked as reads, but they aren’t ordered with FENCE; they are ordered with SFENCE.VMA.

Regardless as to the formal details, we are being pragmatic here. Some microarchitectures cache instruction-side information incoherently, including this table. FENCE.I is the best tool to flush those structures. Flushing them on regular FENCE instructions would cause myriad unnecessary flushes. Introducing a new instruction is not pragmatic, either.

Silabs-ArjanB commented 1 year ago

@aswaterman Thank you for the explanation. I was indeed conflating permissions checks with ordering checks. At the same time, I think that if the Zc specifiation is updated, then this difference must be clear. Also I would assume that the mstatus.mprv bit would affect the permission checks of the table fetch, right?

sorear commented 1 year ago

Currently nothing in mstatus affects instruction fetch, since we got rid of supervisor execution from U=1 pages back in 1.11. With this change, mprv, mxr, and sum all potentially affect JVT table fetches, since they are treated as data loads. I am concerned that this proposal will cause simpler (but virtual memory capable) implementations to flush fetch and decode stages on all writes to mstatus and sstatus.

tariqkurd-repo commented 1 year ago

mxr adds execute permission to loads, so has no relevance with the proposal. if sum changes then user mode access from supervisor mode changes mprv means the privilege mode from mpp is used instead of the current mode

for sum, mprv and also for mode switching in general, any cached jump table entries also need to separately cache permissions for M/S/U modes for all VM pages which the table maps to (which I expect to be 1 in a real system, but it doesn't need to be 4K aligned so can straddle). This is also required for the current spec which needs eXecute permission. When switching mode the current mode must be correctly used for checking against the cached entries - but I don't see this is any different.

The only effect I see is that the cached jump table entries need to cache the Read permission for each mode, as oppose to the eXecute permission. The mode used for the check depends on the sum and mprv bits above, so the correct effective mode must be used - and the logic to return the effective mode for loads already exists.

Am I missing something - or does this address your concerns @sorear?

aswaterman commented 1 year ago

I think that if the Zc specifiation is updated, then this difference must be clear.

Yeah, if the proposed change is made, the spec needs to be clear on this point.

Also I would assume that the mstatus.mprv bit would affect the permission checks of the table fetch, right?

The mode used for the check depends on the sum and mprv bits above, so the correct effective mode must be used - and the logic to return the effective mode for loads already exists.

It isn't obvious to me that this should be the case. MPRV and SUM are defined to affect loads and stores. The read performed by the table jump is not a load, even if it is perms-checked as a read. (There exist other accesses that are perms-checked as reads but aren't considered loads, like page-table walks, so this isn't a new concept.)

The assumption is that most microarchitectures will perform table reads via their instruction-fetch units. Right now, the MPRV and SUM signals need not factor into an instruction-fetch unit's permissions checks. Forcing that to be the case only for the sake of consistency with loads might not be the best decision.

(Certainly, software should not care what we do here: table-jump instructions shouldn't be executed within MPRV or SUM sequences.)

sorear commented 1 year ago

@aswaterman What about MXR? Can jt use a jump table in an execute-only page if MXR=1?

As long as fence.i is required after jump table writes and mstatus.{MXR,SUM,MPRV} have no effect on jt execution, I have no major objections to the proposal.

Post-Meltdown, I can see people being very concerned about any cache reads if the information required to authorize said reads is not available until a later pipeline stage.

@tariqkurd-repo I'm having trouble understanding your last message - IIUC, there is no such thing as a "cached jump table entry" in any of the expected implementations, merely cached bytes in some byte-oriented cache (usually the I-cache, but it could be the D-cache in some implementations). If the point you were making is covered by @aswaterman's comment no action is needed.

aswaterman commented 1 year ago

@aswaterman What about MXR? Can jt use a jump table in an execute-only page if MXR=1?

I think the story should be the same for MXR as for MPRV and SUM.

tariqkurd-repo commented 1 year ago

OK - so read permission only, with no effect from MXR, MPRV and SUM.

My other comment was relating to caching to jump table entries, to improve performance - which is architecturally invisible.

aswaterman commented 1 year ago

To be clear, @tariqkurd-repo, I'm not advocating for this change--rather, I'm suggesting how it should be effected if we choose to do so. I've yet to be convinced we need to take any action.

tariqkurd-repo commented 1 year ago

the table remains executable