ThinkOpenly / sail

Sail architecture definition language
Other
11 stars 12 forks source link

[JSON] `slli` has incorrect information #18

Open ThinkOpenly opened 7 months ago

ThinkOpenly commented 7 months ago

The RISC-V Instruction Set Manual Volume I: Unprivileged ISA Document Version 20191213 shows SLLI opcode fields as: image

Yet, the JSON produced today has:

      "fields": [
        { "field": "0b010000", "size": 6 },
        { "field": "shamt", "size": 6 },
        { "field": "rs1", "size": 5 },
        { "field": "0b101", "size": 3 },
        { "field": "rd", "size": 5 },
        { "field": "0b0010011", "size": 7 }
      ],

Note that the first field should be 0b000000, and the 4th field should be 0b001.

vishisht-dubey commented 6 months ago

Hiii @ThinkOpenly can i work on this?

ThinkOpenly commented 6 months ago

@vishisht-dubey: @snapdgn posted a comment in Slack that he might have an idea about this. If you are interested, could you ask on #team-sailing-downstream Slack channel on the RVI Slack workspace?

vishisht-dubey commented 6 months ago

Hiii @ThinkOpenly can i work on this?

@vishisht-dubey: @snapdgn posted a comment in Slack that he might have an idea about this. If you are interested, could you ask on #team-sailing-downstream Slack channel on the RVI Slack workspace?

yeah sure, thanks

snapdgn commented 6 months ago

I did a little digging on this today. Upon further investigation, it appears that the issue we've encountered extends beyond just a couple of instructions (Some of which were already discussed in slack). In fact, a significant portion of instructions, such as the entire RTYPE format faces this problem. (There can be many more). image

from out extracted json, we can see (for the ADD instruction: rest of them suffers with the same problem)

"fields": [
    { "field": "0b0100000", "size": 7 },
    { "field": "rs2", "size": 5 },
    { "field": "rs1", "size": 5 },
    { "field": "0b101", "size": 3 },
    { "field": "rd", "size": 5 },
    { "field": "0b0110011", "size": 7 }
  ]

Notice the funct7 and funct3 fields.

snapdgn commented 6 months ago

I believe, the core issue lies with the way the mapping encdec's are being parsed now.

Taking an example of the "Shift Immediate", it is represented in the sail-riscv as :

mapping clause encdec = SHIFTIOP(shamt, rs1, rd, RISCV_SLLI) <-> 0b000000 @ shamt @ rs1 @ 0b001 @ rd @ 0b0010011 if sizeof(xlen) == 64 | shamt[5] == bitzero
mapping clause encdec = SHIFTIOP(shamt, rs1, rd, RISCV_SRLI) <-> 0b000000 @ shamt @ rs1 @ 0b101 @ rd @ 0b0010011 if sizeof(xlen) == 64 | shamt[5] == bitzero
mapping clause encdec = SHIFTIOP(shamt, rs1, rd, RISCV_SRAI) <-> 0b010000 @ shamt @ rs1 @ 0b101 @ rd @ 0b0010011 if sizeof(xlen) == 64 | shamt[5] == bitzero

which is then parsed and stored in the encodings hashtable like so:

SHIFTIOP:0b010000, shamt, rs1, 0b101, rd, 0b0010011
SHIFTIOP:0b000000, shamt, rs1, 0b101, rd, 0b0010011
SHIFTIOP:0b000000, shamt, rs1, 0b001, rd, 0b0010011

And while printing the json, they get randomly associated with either SLLI, SRLI or SRAI as they share a common key & there is no marker to distinguish between the different types.

I think we need some sort of extra marker in the encdecs or some extra information in the encodings Hashtable for proper association.

ThinkOpenly commented 6 months ago

Hmm. Thanks for explaining the problem so clearly. So, we have overloaded encdec identifiers, discriminated by enums of various types. This presents a bit of a challenge.

One thought is to determine the types of the parameters to the encdec mapping's left-side, then look for whichever one is an enum and use that as the actual identifier of the encdec. Although, for this to work, all of the related hashtables would have to use that as the key, and we don't even have that information when we first encounter the union clause ast. So, this approach is not ideal.

Perhaps we need to adjust the encodings table to allow for some indirection. Instead of each value in the table being a list, maybe it is instead a list of lists. (Brainstorming here.)

In the example above, SHIFTIOP(shamt, rs1, rd, RISCV_SLLI) could determine that RISCV_SLLI is an enum value, and use that as a secondary key. Then, the "SHIFTIOP" entry in the hashtable would look more like:

SHIFTIOP: [ [ RISCV_SLLI, 0b010000, shamt, rs1, 0b101, rd, 0b0010011 ],
            [ RISCV_SRLI, 0b000000, shamt, rs1, 0b101, rd, 0b0010011 ],
            [ RISCV_SRAI, 0b000000, shamt, rs1, 0b001, rd, 0b0010011 ] ]

"normal" instructions could use a null string or the primary key as the secondary key:

RISCV_JALR: [ [ RISCV_JALR, imm, rs1, 0b000, rd, 0b1100111 ] ]

Although, when we are emitting JSON, we have the primary key and the mnemonic, but we don't get the full set of mnemonics until after we're done parsing. Maybe we need to explode the mnemonics earlier? The mnemonic is the ultimate unique key.

snapdgn commented 6 months ago

The second option appears quite reasonable to me. I'll go ahead and experiment on it.