Open fpedd opened 3 years ago
Please check whether this instruction is actually executed; I am pretty confident it is (otherwise ETISS would complain, as you already noted). You can do that, e.g., by using the PrintInstruction plugin, or by placing a breakpoint somewhere here: https://github.com/tum-ei-eda/etiss/blob/91fa3b3c7029241173380f99f8547b7ebef8b5cd/ArchImpl/RISCV/RISCVArch.cpp#L12140 and running ETISS with a debugger.
The instruction tree printing stuff has some issues, but usually these don't mean the decoder is not working. @rafzi might know more as to why the instruction tree prints do not work as expected.
Some more information:
Compiling the following main.c with an rv32gcv toolchain:
#include <stdlib.h>
#include <stdio.h>
int main()
{
asm("addi a1, a1, 1");
asm("c.addi a1, 1");
printf("hello world!\n");
}
and dumping the binary with riscv32-unknown-elf-objdump -h -S riscv_example.elf > riscv_example.lst
gives:
0000008c <main>:
#include <stdlib.h>
#include <stdio.h>
int main()
{
8c: 1141 addi sp,sp,-16
8e: c606 sw ra,12(sp)
90: c422 sw s0,8(sp)
92: 0800 addi s0,sp,16
asm("addi a1, a1, 1");
94: 0585 addi a1,a1,1
asm("c.addi a1, 1");
96: 0585 addi a1,a1,1
printf("hello world!\n");
98: 67b1 lui a5,0xc
9a: e6878513 addi a0,a5,-408 # be68 <__DTOR_END__+0x1a>
9e: 135010ef jal ra,19d2 <puts>
a2: 4781 li a5,0
}
a4: 853e mv a0,a5
a6: 40b2 lw ra,12(sp)
a8: 4422 lw s0,8(sp)
aa: 0141 addi sp,sp,16
ac: 8082 ret
One can see that most of the instructions are 16-bit compressed instructions (for some reason the disassembly does not show them with their compressed mnemonics). Because the assembler converts "normal" instructions to their compressed equivalents where possible (only when compressed support is enabled, of course), the inline-assembly addi also gets converted to its compressed equivalent (address 0x94).
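The 16-bit vs. 32-bit distinction in the listing above can be checked directly from the encoding: in the base RISC-V encoding scheme, an instruction whose two lowest bits are not 0b11 is a 16-bit compressed instruction. A minimal sketch, assuming only 16-bit and 32-bit encodings occur (the function name rv_insn_length is made up for illustration):

```c
#include <stdint.h>

/* Hypothetical helper: returns the instruction length in bytes, judged
 * from the two lowest bits of the first halfword. Anything other than
 * 0b11 marks a 16-bit compressed instruction. Encodings longer than
 * 32 bits are deliberately ignored in this sketch. */
unsigned rv_insn_length(uint16_t first_halfword)
{
    return (first_halfword & 0x3u) == 0x3u ? 4u : 2u;
}
```

Applied to the listing, 0x0585 (at 0x94 and 0x96) yields 2 bytes, while the low halfword 0x8513 of 0xe6878513 (at 0x9a) yields 4 bytes.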
Running this with the PrintInstruction plugin enabled gives:
...
0x000000000000008c: c.addi # 0x0x1141 [UNKNOWN PARAMETERS]
0x000000000000008e: c.swsp # 0x0xc606 [UNKNOWN PARAMETERS]
0x0000000000000090: c.swsp # 0x0xc422 [UNKNOWN PARAMETERS]
0x0000000000000092: c.addi4spn # 0x0x0800 [UNKNOWN PARAMETERS]
0x0000000000000094: c.addi # 0x0x0585 [UNKNOWN PARAMETERS]
0x0000000000000096: c.addi # 0x0x0585 [UNKNOWN PARAMETERS]
0x0000000000000098: c.lui # 0x0x67b1 [UNKNOWN PARAMETERS]
0x000000000000009a: addi # 0x0xe6878513 [UNKNOWN PARAMETERS]
0x000000000000009e: jal # 0x0x135010ef [UNKNOWN PARAMETERS]
...
I also set a breakpoint with the target gdb at one of the inline addi instructions and checked the dereferenced instruction pointer, which supports the claim that a "compressed add immediate" is indeed executed:
(gdb) x $pc
0x94 <main+8>: 0x05850585
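The 32-bit word printed by gdb covers two consecutive 16-bit instructions. Since RISC-V instruction memory is little-endian, the low halfword is the instruction at 0x94 and the high halfword is the one at 0x96. A small sketch of that split (the helper names are made up for illustration):

```c
#include <stdint.h>

/* Hypothetical helpers: split a little-endian 32-bit memory word into the
 * halfword at the lower address and the halfword at the higher address. */
uint16_t low_halfword(uint32_t word)
{
    return (uint16_t)(word & 0xFFFFu);
}

uint16_t high_halfword(uint32_t word)
{
    return (uint16_t)(word >> 16);
}
```

For 0x05850585 both halves are 0x0585, which matches the two identical entries at 0x94 and 0x96 in the objdump listing.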
With the CoreDSL for the c.addi instruction:
C.ADDI {
encoding:b000 | imm[5:5]s | rs1[4:0] | imm[4:0]s | b01;
args_disass: "{name(rs1)}, {imm:#05x}";
X[rs1] <= X[rs1]'s + imm;
}
the 0x0585
-> 0b 0000 0101 1000 0101
-> 0b 000 0 01011 00001 01
matches the c.addi instruction with register a1 (-> x11 -> 0b01011) and 1 as the immediate value.
So I am fairly certain that a "compressed add immediate" is executed.
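The manual bit split above can be reproduced in a few lines of C. This is only a sketch that follows the field layout of the CoreDSL encoding quoted above; the struct and function names are made up for illustration and are not part of ETISS:

```c
#include <stdint.h>

/* Hypothetical container for the fields of a C.ADDI-shaped halfword. */
typedef struct {
    unsigned quadrant; /* bits [1:0], b01 for C.ADDI          */
    unsigned funct3;   /* bits [15:13], b000 for C.ADDI       */
    unsigned rs1;      /* bits [11:7], register index         */
    int      imm;      /* imm[5] = bit 12, imm[4:0] = [6:2],
                          sign-extended from 6 bits           */
} c_addi_fields;

c_addi_fields decode_c_addi(uint16_t insn)
{
    c_addi_fields f;
    /* Reassemble the 6-bit immediate from its two pieces. */
    unsigned imm6 = (((insn >> 12) & 0x1u) << 5) | ((insn >> 2) & 0x1Fu);

    f.quadrant = insn & 0x3u;
    f.funct3   = (insn >> 13) & 0x7u;
    f.rs1      = (insn >> 7) & 0x1Fu;
    /* Sign-extend: values with bit 5 set are negative. */
    f.imm      = (imm6 & 0x20u) ? (int)imm6 - 64 : (int)imm6;
    return f;
}
```

For 0x0585 this gives quadrant 1, funct3 0, rs1 11 (a1) and imm 1. For 0x1141 from the function prologue it gives rs1 2 (sp) and imm -16, consistent with the addi sp,sp,-16 shown by objdump.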
Coming back to the instruction tree and the encoding from above (b000 | imm[5:5]s | rs1[4:0] | imm[4:0]s | b01), the relevant part of the printed tree is:
@0x0 Node[1:0]
@0x0 Node[15:13]
@0x0 Uninitialized Node
@0x1 Instruction: c.fld
@0x2 Instruction: c.lw
@0x3 Instruction: c.flw
@0x5 Instruction: c.fsd
@0x6 Instruction: c.sw
@0x7 Instruction: c.fsw
@0x1 Node[15:13]
@0x0 Uninitialized Node <-----
The c.addi node (Node[1:0] = 0x1, then Node[15:13] = 0x0), however, seems to be uninitialized.
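To see which path an instruction word takes through the printed tree, one can extract the bit ranges named in the node labels. The sketch below assumes, based only on the printout and not on the ETISS source, that each Node[hi:lo] selects its child by the value of those bits; bit_range is a made-up helper name:

```c
#include <stdint.h>

/* Hypothetical helper: returns the value of bits [hi:lo] of insn,
 * e.g. the child index below a node labelled Node[hi:lo].
 * Requires hi >= lo and hi <= 31. */
uint32_t bit_range(uint32_t insn, unsigned hi, unsigned lo)
{
    uint32_t width = hi - lo + 1u;
    /* Avoid an undefined 32-bit shift when the whole word is selected. */
    uint32_t mask = width >= 32u ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (insn >> lo) & mask;
}
```

For 0x0585, bit_range(insn, 1, 0) is 0x1 and bit_range(insn, 15, 13) is 0x0, which is exactly the node marked with the arrow.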
When uncommenting https://github.com/tum-ei-eda/etiss/blob/4c3631391c81ef49d292e030c09ffd085cb3c70c/ArchImpl/RISCV/RISCVArchSpecificImp.h#L386 the tree structure of the instruction set/instruction decoder gets printed.
However, some nodes in the compressed instruction set tree are printed as "uninitialized" (arrows <----- inserted by me).
A node gets printed as uninitialized when this condition evaluates to false: https://github.com/tum-ei-eda/etiss/blob/4c3631391c81ef49d292e030c09ffd085cb3c70c/src/Instruction.cpp#L739
The second uninitialized node corresponds to the c.addi instruction. Checking the binary I am compiling (using an rv32gc compiler), this instruction gets used multiple times. I would thus expect running the binary to produce some sort of error. However, the binary using these "uninitialized instructions" runs without any issues.
What is happening here? Why are those nodes printed as uninitialized? Why does the binary run anyway? Any help is appreciated! :)
PS: I am mainly asking because I am working on something else, where some instructions/nodes of a RISC-V instruction set extension are also printed as "uninitialized". However, those uninitialized instructions cause some trouble and I am trying to understand why and where the underlying issue is.