Open ThinkOpenly opened 7 months ago
Hiii @ThinkOpenly can i work on this?
@vishisht-dubey: @snapdgn posted a comment in Slack that he might have an idea about this. If you are interested, could you ask on #team-sailing-downstream Slack channel on the RVI Slack workspace?
Hiii @ThinkOpenly can i work on this?
@vishisht-dubey: @snapdgn posted a comment in Slack that he might have an idea about this. If you are interested, could you ask on #team-sailing-downstream Slack channel on the RVI Slack workspace?
yeah sure, thanks
I did a little digging on this today. Upon further investigation, it appears that the issue we've encountered extends beyond just a couple of instructions (Some of which were already discussed in slack). In fact, a significant portion of instructions, such as the entire RTYPE format faces this problem. (There can be many more).
from out extracted json, we can see (for the ADD instruction: rest of them suffers with the same problem)
"fields": [
{ "field": "0b0100000", "size": 7 },
{ "field": "rs2", "size": 5 },
{ "field": "rs1", "size": 5 },
{ "field": "0b101", "size": 3 },
{ "field": "rd", "size": 5 },
{ "field": "0b0110011", "size": 7 }
]
Notice the funct7
and funct3
fields.
I believe, the core issue lies with the way the mapping encdec
's are being parsed now.
Taking an example of the "Shift Immediate", it is represented in the sail-riscv
as :
mapping clause encdec = SHIFTIOP(shamt, rs1, rd, RISCV_SLLI) <-> 0b000000 @ shamt @ rs1 @ 0b001 @ rd @ 0b0010011 if sizeof(xlen) == 64 | shamt[5] == bitzero
mapping clause encdec = SHIFTIOP(shamt, rs1, rd, RISCV_SRLI) <-> 0b000000 @ shamt @ rs1 @ 0b101 @ rd @ 0b0010011 if sizeof(xlen) == 64 | shamt[5] == bitzero
mapping clause encdec = SHIFTIOP(shamt, rs1, rd, RISCV_SRAI) <-> 0b010000 @ shamt @ rs1 @ 0b101 @ rd @ 0b0010011 if sizeof(xlen) == 64 | shamt[5] == bitzero
which is then parsed and stored in the encodings
hashtable like so:
SHIFTIOP:0b010000, shamt, rs1, 0b101, rd, 0b0010011
SHIFTIOP:0b000000, shamt, rs1, 0b101, rd, 0b0010011
SHIFTIOP:0b000000, shamt, rs1, 0b001, rd, 0b0010011
And while printing the json, they get randomly associated with either SLLI
, SRLI
or SRAI
as they share a common key & there is no marker to distinguish between the different types.
I think we need some sort of extra marker in the encdecs
or some extra information in the encodings
Hashtable for proper association.
Hmm. Thanks for explaining the problem so clearly. So, we have overloaded encdec
identifiers, discriminated by enums of various types. This presents a bit of a challenge.
One thought is to determine the types of the parameters to the encdec mapping's left-side, then look for whichever one is an enum and use that as the actual identifier of the encdec
. Although, for this to work, all of the related hashtables would have to use that as the key, and we don't even have that information when we first encounter the union clause ast
. So, this approach is not ideal.
Perhaps we need to adjust the encodings table to allow for some indirection. Instead of each value in the table being a list, maybe it is instead a list of lists. (Brainstorming here.)
In the example above, SHIFTIOP(shamt, rs1, rd, RISCV_SLLI)
could determine that RISCV_SLLI
is an enum value, and use that as a secondary key. Then, the "SHIFTIOP" entry in the hashtable would look more like:
SHIFTIOP: [ [ RISCV_SLLI, 0b010000, shamt, rs1, 0b101, rd, 0b0010011 ],
[ RISCV_SRLI, 0b000000, shamt, rs1, 0b101, rd, 0b0010011 ],
[ RISCV_SRAI, 0b000000, shamt, rs1, 0b001, rd, 0b0010011 ] ]
"normal" instructions could use a null string or the primary key as the secondary key:
RISCV_JALR: [ [ RISCV_JALR, imm, rs1, 0b000, rd, 0b1100111 ] ]
Although, when we are emitting JSON, we have the primary key and the mnemonic, but we don't get the full set of mnemonics until after we're done parsing. Maybe we need to explode the mnemonics earlier? The mnemonic is the ultimate unique key.
The second option appears quite reasonable to me. I'll go ahead and experiment on it.
The RISC-V Instruction Set Manual Volume I: Unprivileged ISA Document Version 20191213 shows SLLI opcode fields as:
Yet, the JSON produced today has:
Note that the first field should be
0b000000
, and the 4th field should be0b001
.