riscv / riscv-control-transfer-records

This repo contains a RISC-V ISA extension (proposal) to allow recording of control transfer history to on-chip registers, to support usages associated with profiling and debug.
https://jira.riscv.org/browse/RVG-62
Creative Commons Attribution 4.0 International
14 stars 4 forks source link

John Hauser feedback 1 #11

Closed bcstrongx closed 5 months ago

bcstrongx commented 5 months ago

1. As the RISC-V ISA and all ratified extensions are currently defined, if S-mode isn't implemented, no S-level CSRs exist.

This draft extension requires an S-level CSR, sctrstatus, yet claims not to require that S-mode be implemented. The Architecture Review Committee discussed the matter and concluded it leans against having sctrstatus be the first (and only?) S-level CSR that may exist without S-mode.

The ARC asks the TG to consider whether Smctr/Ssctr should have a dependence on S-mode, much as it already requires XLEN >= 64 in order for the extension to be fully controllable through the 64-bit *ctrctl registers.

If the TG is sure Smctr/Ssctr is needed for 64-bit RISC-V processors without S-mode (though not 32-bit ones, evidently), then it appears it will be necessary to add an mctrstatus CSR that aliases sctrstatus.

2. The width of CSR sctrstatus isn't stated.

To avoid unnecessary complexity for a possible future RV32 version, I recommend 32 bits, unless the TG has reason to believe it needs to be more.

3. In the *ctrctl registers, the order of bits U, S, and M, and the order of STE and MTE, is backwards from other RV CSRs.

Assuming there is a desire to keep MTE and STE lined up with M and S, then for better RV consistency, presumably bits STE and S should remain where they are, bits U and M swapped, and MTE moved to bit 10.

4. Within CTRs, for both ctrsource and ctrtarget, we have:

It need not be able to hold any invalid addresses; implementations
may convert an invalid address into a valid address that the
register is capable of holding.

The RISC-V *tval registers serve a similar purpose on memory access traps, and the Privileged ISA says of mtval:

... it is a WARL register that must be able to hold all valid
virtual addresses and the value zero. It need not be capable of
holding all possible invalid addresses. Prior to writing mtval,
implementations may convert an invalid address into some other
invalid address that mtval is capable of holding.

Is the TG absolutely sure that ctrsource and ctrtarget don't need to support at least one invalid address?

5. Section 6.1.2, "External Traps", says:

External trap recording depends not only on the target mode, but
on any intervening modes, which are modes that are more privileged
than the source mode but less privileged than the target mode.  Not
only must the external trap enable bit for the target mode be set,
but the external trap enable bit(s) for any intervening modes must
also be set.

This seems wrong to me. Hence, I believe the document deserves an explanation for why it's actually desirable.

6. According to the chapter "Custom Extensions", the value of the Custom field in each *ctrctl register is reset to an implementation-specific "default value", which configures the extension for its standard- defined behavior. Values of the Custom fields other than this default value select custom behavior.

The ARC strongly believes that a value of zero in a Custom field should select standard-defined behavior. Software can then know that a value of zero gives standard behavior, rather than some implementation- specific value (the "default") that may or may not be known to programmers. If it is even necessary to initialize these fields at reset (existing RISC-V convention usually leaves initialization to initial boot code whenever possible), they will of course reset to zero.

bcstrongx commented 5 months ago

7. Some of the inhibit bits in *ctrctl are documented thus:

INDJUMPINH   Inhibit recording of indirect jumps.
DIRJUMPINH   Inhibit recording of direct jumps.

INDOJMPINH   Inhibit recording of other indirect jumps.
DIROJMPINH   Inhibit recording of other direct jumps.

To most readers, it's going to be inscrutable how, next to the category of "indirect jumps", there could be any "other indirect jumps" not covered by the first category. The same for "direct jumps" and "other direct jumps". Furthermore, the catch-all categories of "indirect jumps" and "direct jumps" would presumably subsume many of the other categories: "indirect calls", "direct calls", "co-routine swaps", etc.

So by plain reading of the document, it would seem that setting both bits INDJUMPINH and DIRJUMPINH should prevent all JAL and JALR instructions (including their compressed forms, C.J etc.) from being recorded. In this case, we can infer that the other inhibit bits for jumps (e.g. for subroutine calls and returns) become irrelevant and ignored.

The same issue affects the description of the TYPE subfield of ctrdata. Some of the documented values are:

1010 - Indirect jump
1011 - Direct jump

1110 - Other indirect jump
1111 - Other direct jump

It's impossible to divine what exactly distinguishes binary code 1010 from 1110, and 1011 from 1111.

After referring to the E-trace document, I believe what Smctr/Ssctr says is not what is actually intended. If my understanding is correct (and please correct me if I'm wrong), the labels for the jump categories for ctrdata.TYPE would more accurately be:

1010 - Other indirect jump without linkage
1011 - Direct jump without linkage

1110 - Other indirect jump with linkage
1111 - Other direct jump with linkage

By linkage here I mean that the jump instruction has a destination register that is not x0. (So a jump "without linkage" has no destination register other than the zero-only register x0.)

Assuming that the inhibit bits in *ctrctl are supposed to match the types recorded in ctrdata.TYPE (and I believe they should), then they would be:

bit 42   Inhibit recording of other indirect jumps without linkage.
bit 43   Inhibit recording of direct jumps without linkage.

bit 46   Inhibit recording of other indirect jumps with linkage.
bit 47   Inhibit recording of other direct jumps with linkage.

One way or another, the labelling for these categories needs to be fixed in the document. This implies also that some of the official subfield names (INDJUMPINH etc.) ought to change, since they currently mimic the misleading labels.

The Architecture Review Committee agreed it would be better for the Smctr/Ssctr document to provide its own sufficiently detailed descriptions of each control-transfer category and not just defer to the E-trace document for this information. This may entail some cut- and-paste from the E-trace document. Some statement that the two are intended to be aligned would be appropriate.

However, to the extent that the E-trace document is the original source for some of the mislabelling I identified, it still needs to get corrected in the Smctr/Ssctr document in the process.

bcstrongx commented 5 months ago

8. The DEPTH field in mctrctl is specified as WARL, and the document says, "It is implementation-specific which DEPTH values are supported."

At the same time, we have:

An implementation may opt to hardcode some or all of the bits in
this field.

and:

All implemented fields are writable, though DEPTH may contain some
read-only bits.

All this talk about read-only bits in DEPTH calls into question whether an implementation has complete freedom to decide what DEPTH values are supported or is limited to choosing which DEPTH bits are read-only and which are fully writable. The latter version is obviously more constraining.

I think it should be sufficient to label DEPTH as WARL and stop after saying "It is implementation-specific which DEPTH values are supported." The remarks about read-only bits in DEPTH only confuse the matter and should be deleted.

9. The "Entry Registers" chapter says:

The standard indirect register access rules specified by Smcsrind/
Sscsrind apply for CTR.  S-mode is able to access CTR entries using
the siselect/sireg* interface, with the same behavior described
for M-mode above.  Similarly, VS-mode is able to access CTR entries
using siselect (really vsiselect) and sireg* (really vsireg*).

The way this paragraph is written (along with the rest of the document), it appears the reader is expected to understand that the register state accessed through sireg when siselect is in the range 0x200-0x2FF, or through vsireg when vsiselect is in the same range, is exactly the same state as accessed through mireg* when miselect has the same register number.

Actually, the intention for Smcsrind/Sscsrind is that each level, M, S, and VS, has a separate space of indirectly accessed registers. There should not be an inherent assumption that the different privilege levels all access shared state. In fact, the original use for the indirect register mechanism was the AIA, where the state accessed at each level is explicitly not the same. (Interrupt files and interrupt priority arrays at M level are distinct from those at S level, and distinct again from interrupt files at VS level.)

Because it's not an appropriate assumption, it should be more clearly specified that the S-level CTR entries are actually aliases of the M-level CTR entries, and likewise at VS level.

10. The chapter "State Enable Access Control" says:

If the H extension is implemented and mstateen0.CTR=1, the
hstateen0.CTR bit controls access to supervisor CTR state
(sctrcontrol, sctrstatus, and sireg* when siselect is in
0x200..0x2FF) when V=1.

Should be "when vsiselect is in 0x200..0x2FF".

11. In Section 6.1.1, "Virtualization Mode Transitions", we have:

mctrctl.M, sctrctl.{S,U}, and vsctrctl.{S,U} are used to determine
whether the source and target modes are enabled;

Because U-mode can never be the source or target of a virtualization mode transition, you could leave out bit U of HS-level sctrctl.

12. Also in Section 6.1.1:

sctrctl.LCOFIFRZ and sctrctl.BPFRZ determine whether CTR is frozen

Do you mean "is (already) frozen" or rather "becomes frozen"?

13. Section 6.4, "RAS (Return Address Stack) Emulation Mode", says:

Function returns pop the most recent call, by invalidating entry 0
(setting ctrsource.V=0) and rotating the CTR buffer, ....

Don't rotate the CTR buffer, decrement sctrstatus.WRPTR.

bcstrongx commented 5 months ago

Resolution:

  1. Will discuss in TG whether requiring S-mode to be implemented is acceptable. Otherwise will add mctrstatus CSR.
  2. Will fix
  3. Will fix
  4. Agreed that CTR entry registers do not need to hold invalid addresses. Because registers like epc and tval hold invalid addresses, software bugs that result in accesses to invalid addresses can be debugged without adding storage cost to CTR.
  5. Will add a non-normative statement to the spec explaining the value of requiring intervening modes to be enabled for external traps.
  6. Agreed that custom bits should require that 0 implies standard/conforming behavior, but there should be no requirement on the reset value of these (or any) bits. Will update the spec accordingly.
  7. Will add table with map of transfer type encodings to transfer type names, and a table that maps transfer type names to opcodes. Also will add use of "with/without linkage" terminology and alter the field names accordingly. 8 - 13. Will fix
bcstrongx commented 5 months ago

TG agreed that making S*ctr dependent on Supervisor mode is acceptable. Also reached out to priv arch list, no concerns expressed.