riscv / riscv-debug-spec

Working Draft of the RISC-V Debug Specification Standard
https://jira.riscv.org/browse/RVG-94
Other
454 stars 92 forks source link

`aarsize` field value for accessing `CSR_FRM` & `CSR_FFLAGS` registers #888

Open en-sc opened 1 year ago

en-sc commented 1 year ago

What is the correct value of aarsize when accessing CSR_FRM or CSR_FFLAGS ? As stated in the spec:

If aarsize specifies a size larger than the register’s actual size, then the access must fail.

For example, OpenOCD currently assumes size of these registers to be equal to XLEN. So, if the target has XLEN == 64 and only supports accesses to CSRs via abstract commands, it is not quite clear if these registers can be read.

pdonahue-ventana commented 1 year ago

This is probably a question for the unprivileged architecture. fcsr is specified as being 32b regardless of XLEN but there's no specified width for frm and fflags. I would assume that they're also 32b since it wouldn't make sense for a subset of a 32b register to be larger than 32b.

/cc @aswaterman

aswaterman commented 1 year ago

Seems reasonable to me.

en-sc commented 1 year ago

This is probably a question for the unprivileged architecture

It seems that according to the unprivileged spec accesses through CSRR* instructions are well-defined. Only the additional approach (abstract access command) to access these registers raises such question. It is unclear to me what is the benefit of exposing these registers via abstract access command. Maybe it should be stated, that the result of abstract access to these register address is undefined?

pdonahue-ventana commented 1 year ago

I agree that an external debugger should just use fcsr. While software might prefer to use frm and fflags so it can save a couple of instructions when manipulating those fields within fcsr, an external debugger still has to transfer 32b in or out for any of these CSRs and it's trivial for the debugger to do the appropriate field manipulation.

However, I think that it's better to keep things uniform and not introduce special cases for those two CSRs. That also eliminates the need for us to go through and identify what CSRs are useful and what CSRs are not useful (especially since the answer may change over time).

aswaterman commented 1 year ago

keep things uniform and not introduce special cases

Agreed this should be the guiding principle.

en-sc commented 1 year ago

I agree that an external debugger should just use fcsr

So lets assume a user of an external debugger wants to get the value of frm. Am I correct that you suggest the debugger to read this value through fcsr? Shouldn't this be at the very least hinted by something in the spec?

pdonahue-ventana commented 1 year ago

If I were personally writing a debugger, I would cache the register values locally (and clear the cache whenever the hart resumes). If the user asked for frm then I would read fcsr and populate the cache entries for fcsr and frm and fflags. Then if the user later asked for fflags, I would save time by reading the cache instead of doing another abstract command.

If an implementation allows abstract reads while the hart is not halted then this whole caching mechanism wouldn't work and there wouldn't be any benefit to my scheme. And other debugger writers might choose to either not do any caching or they may not care to do the optimization to save at most 2 abstract reads. The architecture allows people to do it either way.

timsifive commented 1 year ago

I think in the Debug Spec we can say that if a CSR does not have a documented size, then abstract access commands work for them like csrr and csrw do. That is they won't fail, and bits end up where they do.

pdonahue-ventana commented 1 year ago

I'm OK with that.

algrobman commented 9 months ago

why the hell only these few CSRs being defined as 32 bit? Why does this even matter, if only way to access the CSR are csrr* instructions, which defined as XLEN wide operations? Another thing, why the fields in some CSRs move based on XLEN : misa is an example (MXL) it would be easier to keep it in bits 31:30 so the register could be read with 32 or 64 bit debug accesses in RV64 .. In general, why some scarcely populated CSRs have bits everywhere, while it could be easier for SW if the implemented bits are located in LSBs .

pdonahue-ventana commented 9 months ago

These seem like questions for the unprivileged and/or privileged ISA committees. The debug spec has no power to change the definition of CSRs that are defined in the F extension (frm, fflags) or the privileged spec (misa).

en-sc commented 9 months ago

why the hell only these few CSRs being defined as 32 bit?

This is not exactly the case. My understanding is, fflags and frm are just defined as aliases to the corresponding fields of fcsr in the unprivileged spec. Unprivileged spec also defines instructions to accesses CSRs, and all the reads do zero-extend the value to XLEN bits. Therefore the size of fflags and frm is irrelevant and in not defined in the unprivileged spec.

Why does this even matter, if only way to access the CSR are csrr* instructions, which defined as XLEN wide operations?

This does not matter, if the csrr* instructions is the only way to access these registers. So the size of frm and fcsr is not defined. However, debug spec defines another method -- abstract access register command, that do behave differently in regard that there is no zero-extension going on. This is why the appropriate aarsize field value for these registers needs to be defined in the debug spec.

algrobman commented 9 months ago

Actually I meant all CSRs, defined as 32 bit registers : fcsr, dcsr,mvendorid, [ms]counteren, mcountinhibit, etc. Now seems because of register abstract command arrsize rules, DM needs a CSR decoder to fail these CSR accesses if accessed with aarsize=3 even for 64 bits CPU. Furthermore, debugger SW needs to know what CSRs are 32 bits too. Why? Just waste of gates and intellectual resources to implement all this .. All these "32 bit" CSRs is just RISCV spec inconsistency, without any good reason, (I guess it's just specs writers miss)
BTW, I think arrsize field is redundant , we use it only to check validity of the command, but rest of the data transfer logic is independent of it , since registers width is defined in implementation. In worse case data1 register will be unnecessarily updated, but debugger can still save time accessing only data0 if it needs only 32 LSBs . I heard about arrsize check usage to detect XLEN, but I believe , there should be better ways to accomplish this. (For ex, check number of dataN regs, or read misa)

en-sc commented 9 months ago

I heard about arrsize check usage to detect XLEN, but I believe , there should be better ways to accomplish this. (For ex, check number of dataN regs, or read misa)

Consider RV32...D -- the number of dataN would not help you to discover XLEN in this case if abstract access to FPU registers is supported.

As for reading misa, you see there is a couple problems:

  1. Reading misa can return zero (not implemented).
  2. Prior to RISCV Privileged spec v1.11, misa was a WIRI register:
    Some read-only and read/write registers have read-only fields reserved for future use. These reserved
    read-only fields should be ignored on a read. Writes to these fields have no effect, unless the whole
    CSR is read-only, in which case writes might raise an illegal instruction exception. These fields are
    labeled WIRI in the register descriptions.

    So you should have ignored misa[30:31] if you are reading a 64-bit wide misa.

algrobman commented 9 months ago

RV32ID CPU will need data0,1 registers only. So debugger can understand that XLEN=32, then it reads misa and figures out that it needs to use data0/1 to access FPRs and only data0 for all other registers. Regarding misa MXL field location - never could understand why they defined its different location for RV64 vs RV32 ...

abstactcs DM register has bunch of reserved bits - some of them could provide XLEN size , instead of this try and error procedure with writing "illegal" aarsize to command register.

pdonahue-ventana commented 9 months ago

RV32ID CPU will need data0,1 registers only.

I agree.

So debugger can understand that XLEN=32,

How?

then it reads misa and figures out that it needs to use data0/1 to access FPRs and only data0 for all other registers.

Not if misa is not implemented, as @en-sc pointed out.

abstactcs DM register has bunch of reserved bits - some of them could provide XLEN size

That would mean that the DM would need to know XLEN for each hart that it's attached to. That might be easy in your implementation where the DM is tightly coupled with the hart(s)[^1] but it would be a problem in large, heterogenous systems because the DM might not know much about the harts. Maybe this isn't a problem in practice since very few implementations support optional abstract accesses due to their inherent limitations (lack of support for V registers, abstract access's lack of support for fence.i after the debugger writes an ebreak to memory, these problems with aarsize and CSR accesses, etc.). However, I still think that it wouldn't be clean to require the DM to know about XLEN.

[^1]: I infer that you're tightly coupling this since you were talking about having your DM do CSR decode. A more general solution would have the DM send the command to the hart and the hart would reply to the DM with an error, which it would then put into cmderr.

algrobman commented 9 months ago

1) 2 data registers and memory abstract command mean 32 bit CPU. 2) 2 data registers and no memory abstract commands mean possibility of 64 bit CPU, failing 64 bit access to a GPR means 32 bit CPU.

3) For multihart system, selected hart could provide its current XLEN value to the DM register which I think, more cleaner solution, than rely on errors.

All multiple RISCV CPU versions, I was working recent years, implement abstract commands and not program buffer, which requires much more extensive verification, than abstract commands.

timsifive commented 9 months ago

@algrobman, you make some good arguments why aarsize is unnecessary, and we could instead have a different mechanism to determine XLEN. Unfortunately the spec is a few approvals short of being frozen and we can't make any changes right now.

I'm not sure what the right solution would be, but something more closely matching csrr and csrw semantics would make sense.