CX custom instructions: need a way to indicate which rd/rs1/rs2 fields do or do not specify a register

grayresearch commented 6 months ago

Background

In the present spec, cx_reg instructions specify rd, rs1, rs2 fields to access dest reg and source regs. cx_imm specifies rd and rs1. cx_flex specifies rs1 and rs2 and explicitly does not update rd.

Since the CXU may be configured to receive not just the cf_id encoded in the custom instruction, but also the entire raw 32-bit custom instruction, a CX may, on an instruction-by-instruction basis, use the source register operands it receives, or ignore them and use the various 5-bit register ID fields to specify additional per CX instruction fields, and/or to select CX private register file entries, RAM entries, or channels.

However, CX's simple fixed encodings (invariant across CXs) can cause hard register dataflow dependencies (write-read dependencies and write-write output dependencies) that may impair instruction issue scheduling. For example, a cx_reg custom instruction cx_reg func,rd,rs1,rs2 may ignore the second source operand, X[rs2], and use the rs2 bits to select a CX-private register. Unfortunately the CPU does not know that the instruction's second source operand is ignored, so it must delay issue of the instruction until the latest value of X[rs2] becomes available. This impairs performance.

(One note in present spec reads: "One disadvantage of this approach: when the selected CXU routinely discards the R[rs1] or R[rs2] operands, use of the flex-type custom function instruction can create a useless false dependency on the rs1 and rs2 registers, which may uselessly delay issue of the CF instruction in an out-of-order CPU core.")
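To make the cost of the false dependency concrete, here is a minimal Python sketch of a scoreboard-style issue check. It is illustrative only: the function name `issue_cycle`, the `ready_at` map, and the cycle counts are assumptions made for this example, not anything from the CX spec.

```python
def issue_cycle(ready_at, sources, now=0):
    """Earliest cycle an instruction can issue, given the cycle at which
    the latest value of each source register becomes available."""
    return max([now] + [ready_at.get(r, 0) for r in sources])

# Hypothetical state: x5 is ready now, x7 is waiting on a long-latency result.
ready_at = {5: 0, 7: 40}

# cx_reg func,rd,x5,x7 where the CX ignores X[rs2] and uses the rs2 bits
# to select a CX-private register.
conservative = issue_cycle(ready_at, sources=[5, 7])  # CPU must assume X[rs2] is read
informed = issue_cycle(ready_at, sources=[5])         # CPU knows X[rs2] is ignored

print(conservative, informed)
```

In this toy model the conservative CPU waits for x7 even though the CX never reads it, which is exactly the delay the spec note describes.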

We therefore need a way to specify, per CX, per custom function instruction, per register operand, whether that instruction-specified CPU register value is used or ignored by the CX instruction.

One approach is to use some of the instruction opcode bits to indicate this. Another is to configure this, per custom instruction, per CX, using a CSR, perhaps using reserved fields in mcx_selector. Let's consider both and compare.

Fixed indication of use/non-use of a register specifier using CF_ID

Consider the spec'd custom-0 R-type instructions, which encode a 10-bit cf_id and rd, rs1, rs2 fields. Currently this accommodates a CX with 1024 custom instructions, all with a 2-source, 1-dest schema, and with hard write-read or write-write dependencies on all register specifiers (unless rd, rs1, or rs2 equals 0 (x0)). Of course, when rd/rs1/rs2 is encoded as 0, the CX custom instruction cannot use that field for other custom purposes.
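For reference, the fields in question can be pulled out of a raw 32-bit instruction word as in the sketch below. The R-type bit positions and the custom-0 major opcode are standard RISC-V; the mapping of cf_id onto {funct7, funct3} is an assumption made here for illustration and should be checked against the CX spec. The helper names are hypothetical.

```python
CUSTOM_0 = 0b0001011  # RISC-V custom-0 major opcode

def encode_r_type(opcode, rd, funct3, rs1, rs2, funct7):
    """Assemble a 32-bit R-type instruction word from its fields."""
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)

def decode_r_type(insn):
    """Split a 32-bit R-type instruction into its standard RISC-V fields."""
    return {
        "opcode": insn & 0x7F,
        "rd": (insn >> 7) & 0x1F,
        "funct3": (insn >> 12) & 0x7,
        "rs1": (insn >> 15) & 0x1F,
        "rs2": (insn >> 20) & 0x1F,
        "funct7": (insn >> 25) & 0x7F,
    }

def cf_id_of(fields):
    # ASSUMPTION: the 10-bit cf_id is the concatenation {funct7, funct3}.
    return (fields["funct7"] << 3) | fields["funct3"]

# Example: cf_id 130 with rd=x10, rs1=x11, and rs2 bits = 3.
insn = encode_r_type(CUSTOM_0, rd=10, funct3=130 & 0x7, rs1=11, rs2=3,
                     funct7=130 >> 3)
fields = decode_r_type(insn)
print(hex(insn), fields, cf_id_of(fields))
```

A CXU that receives the raw instruction can read the rs2 bits (here 3) for its own purposes, but the CPU still sees a source operand X[3] unless told otherwise.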

Instead, we might use cf_id value ranges to determine whether an instruction's given register field ID is used. For example:

  cf_id   | Type  | Dest  | Source 1 | Source 2
  0 - 127 | RXDSS | X[rd] | X[rs1]   | X[rs2]
128 - 255 | RXDS  | X[rd] | X[rs1]   | 0
256 - 383 | RXD   | X[rd] | 0        | 0
384 - 511 | RXSS  | -     | X[rs1]   | X[rs2]
512 - 639 | RXS   | -     | X[rs1]   | 0
640 - 767 | RX    | -     | 0        | 0

This provides 128 unique CF_IDs for each type (shape) of CX custom instruction. More CF_IDs can be made available across each register usage schema, if this kind of cf_id encoding is extended across custom-[0123] instead of just custom-0.
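As a sketch of how a CPU decoder might consume such a table, the Python below maps a cf_id to its type and returns only the integer registers that need hazard tracking. The ranges and type names come from the table above; the function names and the choice to treat x0 as creating no dependency are assumptions of this example.

```python
# (first cf_id, last cf_id, type, uses rd, uses rs1, uses rs2)
INT_SCHEMAS = [
    (0,   127, "RXDSS", True,  True,  True),
    (128, 255, "RXDS",  True,  True,  False),
    (256, 383, "RXD",   True,  False, False),
    (384, 511, "RXSS",  False, True,  True),
    (512, 639, "RXS",   False, True,  False),
    (640, 767, "RX",    False, False, False),
]

def classify(cf_id):
    """Return (type, uses_rd, uses_rs1, uses_rs2), or None if unassigned."""
    for lo, hi, name, uses_rd, uses_rs1, uses_rs2 in INT_SCHEMAS:
        if lo <= cf_id <= hi:
            return name, uses_rd, uses_rs1, uses_rs2
    return None  # 768..1023 is not assigned by this table

def tracked_registers(cf_id, rd, rs1, rs2):
    """Return (dest, sources): the X registers the CPU must track for hazards."""
    schema = classify(cf_id)
    if schema is None:
        raise ValueError(f"cf_id {cf_id} has no integer register schema")
    _, uses_rd, uses_rs1, uses_rs2 = schema
    dest = rd if uses_rd and rd != 0 else None
    sources = [r for r, used in ((rs1, uses_rs1), (rs2, uses_rs2))
               if used and r != 0]
    return dest, sources

print(classify(130))
print(tracked_registers(130, rd=10, rs1=11, rs2=3))
print(tracked_registers(700, rd=10, rs1=11, rs2=3))
```

With cf_id 130 (RXDS) the rs2 field is free for CX-private use and creates no dependency; with cf_id 700 (RX) all three 5-bit fields are free.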

Also note, with some CF_ID value range "left over", we might also accommodate floating-point register based CX custom instructions. For example,

  cf_id    | Type  | Dest  | Source 1 | Source 2
768 -  831 | RFDSS | F[rd] | F[rs1]   | F[rs2]
832 -  895 | RFDS  | F[rd] | F[rs1]   | 0
896 -  959 | RFD   | F[rd] | 0        | 0
960 - 1023 | RFSS  | -     | F[rs1]   | F[rs2]
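Taken together, the integer and floating-point ranges fill the 10-bit cf_id space exactly (6 x 128 + 4 x 64 = 1024). The short Python check below confirms there are no gaps or overlaps; the range values are copied from the two tables, while `covers_exactly` and `regfile` are hypothetical helper names.

```python
RANGES = [
    (0,   127,  "RXDSS"), (128, 255,  "RXDS"), (256, 383, "RXD"),
    (384, 511,  "RXSS"),  (512, 639,  "RXS"),  (640, 767, "RX"),
    (768, 831,  "RFDSS"), (832, 895,  "RFDS"), (896, 959, "RFD"),
    (960, 1023, "RFSS"),
]

def covers_exactly(ranges, bits=10):
    """True if the ranges tile 0 .. 2**bits - 1 with no gaps or overlaps."""
    expected = 0
    for lo, hi, _ in sorted(ranges):
        if lo != expected or hi < lo:
            return False
        expected = hi + 1
    return expected == 1 << bits

def regfile(type_name):
    """Register file implied by the type name: 'F' for RF*, else 'X'."""
    return "F" if type_name.startswith("RF") else "X"

print(covers_exactly(RANGES))
print({name: regfile(name) for _, _, name in RANGES})
```

One consequence is that adding any further type under this scheme would mean shrinking an existing range or extending the encoding across more custom opcodes.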

Dynamic indication of use/non-use of custom instructions' register specifier fields using new mcx_selector fields

Another approach is to use some of the reserved mcx_selector bits to select, per CX, different aspects of CX custom instruction encoding/decoding.

For example, presently mcx_selector has 12 reserved bits. We could use all of these reserved bits to provide, for example, 3 bits of encoding for each of the 4 custom opcodes. This might encode that instructions in the custom-0 opcode block do not use rs2, or that instructions in the custom-2 opcode block do not use rd. (The latter is the same as the current custom-2 cx_flex.) While this is more dynamically flexible (per CX selector) than the fixed CF_ID encoding scheme presented above, and it provides wider CF_ID ranges with a given register use/non-use pattern, it does not accommodate as many different pattern variants (the tables above define 6 to 10, with room to add more).
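One possible shape for such a field is sketched below in Python. Everything about the layout is hypothetical: the issue does not say where the 12 bits sit in mcx_selector or what each of the 3 bits means. Here the field is treated as a standalone 12-bit value with one 3-bit group per custom opcode, and a set bit means "this register field is ignored".

```python
# HYPOTHETICAL layout: bits [3*i+2 : 3*i] describe custom-i, i = 0..3.
# Within a group: bit 0 = rd ignored, bit 1 = rs1 ignored, bit 2 = rs2 ignored.

def pack_reg_use(per_opcode):
    """per_opcode: four (uses_rd, uses_rs1, uses_rs2) tuples, custom-0..3."""
    field = 0
    for i, (uses_rd, uses_rs1, uses_rs2) in enumerate(per_opcode):
        group = (int(not uses_rd)
                 | (int(not uses_rs1) << 1)
                 | (int(not uses_rs2) << 2))
        field |= group << (3 * i)
    return field

def unpack_reg_use(field):
    """Inverse of pack_reg_use: returns four (uses_rd, uses_rs1, uses_rs2)."""
    result = []
    for i in range(4):
        group = (field >> (3 * i)) & 0b111
        result.append((not group & 1, not group & 2, not group & 4))
    return result

# custom-0 ignores rs2; custom-2 ignores rd (like today's cx_flex).
config = [(True, True, False), (True, True, True),
          (False, True, True), (True, True, True)]
field = pack_reg_use(config)
print(f"{field:012b}", unpack_reg_use(field))
```

Under this layout only four patterns are active at once, one per custom opcode, which illustrates the trade-off against the fixed cf_id ranges: wider cf_id ranges per pattern, but fewer distinct patterns per CX.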

Dynamic encoding makes the decoding of custom instruction operands, as well as software debugging and modeling, dependent upon the current mcx_selector value, which may present additional challenges.

grayresearch commented 6 months ago

The "Fixed indication of use/non-use of a register specifier using CF_ID" proposal above is now spec'd as a non-normative proposal in this commit https://github.com/grayresearch/CX/commit/bd83b091f0025b47c918123aeaece5b978bd9837 in the 39-opt-reg-specs branch: https://github.com/grayresearch/CX/compare/39-opt-reg-specs and in section 2.3.1 of https://raw.githubusercontent.com/grayresearch/CX/39-opt-reg-specs/spec/spec.pdf. It's not perfect, and surely it will be redesigned in the TG, but it does capture an approach for fixed-format CX custom instructions to signal use/non-use of register specifier fields.

grayresearch commented 6 months ago

This spec update also includes a non-normative proposal for CX custom instructions that access the floating-point registers, as described in the first post in this issue.
