RoaLogic / RV12

RISC-V CPU Core
Other
280 stars 50 forks source link

Potential bug with consecutive LOAD instructions #27

Closed PauliusMorku closed 2 years ago

PauliusMorku commented 3 years ago

Hello, I am currently working with an unconventional formal verification approach and used your open source RV12 RISC-V processor as a use case. It appears that my approach discovered some potentially severe bug within the pipeline hazard handling logic. Stalling logic misbehaves if two consecutive LOAD instructions are issued and there is a dependency on the contents of the first LOAD. This happens because else if (ex_opcode == OPC_LOAD && !ex_bubble) is not always checked if else if (id_opcode == OPC_LOAD && !id_bubble) is triggered (which can happen when two consecutive LOAD instructions are issued). Here is the code I'm refering to (riscv_id.sv): image Fixed code can be found in my forked repo: https://github.com/PauliusMorku/RV12

Best regards, Paulius Morkunas

rherveille commented 3 years ago

Do you have a testcase that triggers the bug?

Richard

Richard Herveille

Managing Director

Phone +31 (45) 405 5681

Cell +31 (6) 5207 2230

@.***

On 10/05/2021, 15:43, "PauliusMorku" @.***> wrote:

Hello, I am currently working with an unconventional formal verification approach and used your open source RV12 RISC-V processor as a use case. It appears that my approach discovered some potentially severe bug within the pipeline hazard handling logic. Stalling logic misbehaves if two consecutive LOAD instructions are issued and there is a dependency on the contents of the first LOAD. This happens because else if (ex_opcode == OPC_LOAD && !ex_bubble) is not always checked if else if (id_opcode == OPC_LOAD && !id_bubble) is triggered (which can happen when two consecutive LOAD instructions are issued). Here is the code I'm refering to (riscv_id.sv):

Fixed code can be found in my forked repo: https://github.com/PauliusMorku/RV12

Best regards, Paulius Morkunas

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.

PauliusMorku commented 3 years ago

Since I am using formal approach it is a little bit tricky to extract exact trigger testcase. However, try the following sequence of instructions, it should uncover the bug:

addi a0,zero,128 // Store any arbitrary value in memory sw a0,0(zero) // Store any arbitrary value in memory lw a1,0(zero) // Load stored value to a1 lw a2,0(zero) // Perform any unrelated load that does not trigger any hazard // Use value from a1 that was recently loaded (according to my observations a1 is not resolved correctly due to failing stall logic) addi a3,a1,1 // Use value in a1 (any instruction that uses RS1 should be affected)

Regards, Paulius

PauliusMorku commented 2 years ago

Hello again, I wrote a test program that demonstrates the bug and is compatible with the existing testbench. I also found that the testbench had a bug with RAM region definition that caused all existing RISC-V load/store tests to fail (this also needs to be fixed in order to run my test). This can be easily done by replacing line 206 in testbench_top.sv assign pma_adr[3] = 1 << 31; with assign pma_adr[3] = 30'h3fffffff;

The following parameters were used for the simulation:

DCACHE_SIZE=0
ICACHE_SIZE=0
MEM_LATENCY=0
WRITEBUFFER_SIZE=0
XLEN=32
MULT_LATENCY=0
HAS_U=0
HAS_S=0
HAS_H=0
HAS_RVA=0
HAS_RVC=0
HAS_RVM=0

Assembly for the test:

.globl _start
_start:
li t0,0x80400000
li t1,2
sw t1,0(t0)
lw t2,0(t0)
lw t3,0(t0)
addi t4,t2,-1 # should be euqal to 1
li t5,0x80001000
sw t4,0(t5)

Compiled IHEX that can be fetched to the testbench:

:0200000480007A
:10000000B70240801303200023A0620083A30200F4
:1000100003AE0200938EF3FF371F00802320DF0121
:040000058000000077
:00000001FF

The resulting status PASSED/FAILED indicates if the double load bug is fixed. Modified RV12 code with a fix can be found in my forked repo: https://github.com/PauliusMorku/RV12

Regards, Paulius Morkunas

rherveille commented 2 years ago

Very timely. I am working on the CPU.

Let me add this.

BTW, can you point me to the fix? I can’t see it in your repo.

Thanks,

Richard

Richard Herveille

Managing Director

Phone +31 (45) 405 5681

Cell +31 (6) 5207 2230

@.***

On 11/10/2021, 18:56, "PauliusMorku" @.***> wrote:

Hello again, I wrote a test program that demonstrates the bug and is compatible with the existing testbench. I also found that the testbench had a bug with RAM region definition that caused all existing RISC-V tests to fail (this also needs to be fixed in order to run my test). This can be easily done by replacing line 206 in testbench_top.sv assign pma_adr[3] = 1 << 31; with assign pma_adr[3] = 30'h3fffffff;

The following parameters were used for the simulation: DCACHE_SIZE=0 ICACHE_SIZE=0 MEM_LATENCY=0 WRITEBUFFER_SIZE=0 XLEN=32 MULT_LATENCY=0 HAS_U=0 HAS_S=0 HAS_H=0 HAS_RVA=0 HAS_RVC=0 HAS_RVM=0 Assembly for the test: .globl _start _start: li t0,0x80400000 li t1,2 sw t1,0(t0) lw t2,0(t0) lw t3,0(t0) addi t4,t2,-1 # should be euqal to 1 li t5,0x80001000 sw t4,0(t5) Compiled IHEX that can be fetched to the testbench: :0200000480007A :10000000B70240801303200023A0620083A30200F4 :1000100003AE0200938EF3FF371F00802320DF0121 :040000058000000077 :00000001FF The resulting status PASSED/FAILED indicates if the double load bug is fixed. Modified RV12 code with a fix can be found in my forked repo: https://github.com/PauliusMorku/RV12

Regards, Paulius Morkunas

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android.

PauliusMorku commented 2 years ago

You should find a definition define FIXED_VERSION in RV12/rtl/verilog/core/riscv_id.sv. If you comment the definition out then it uses original buggy code and if you uncomment it then the fixed code is used. The whole fix itself is contained in riscv_id.sv too, just follow the usage of this define within the file. Regards, Paulius

rherveille commented 2 years ago

This should be fixed now

Richard

Richard Herveille

Managing Director

Phone +31 (45) 405 5681

Cell +31 (6) 5207 2230

@.***

On 12/10/2021, 09:41, "PauliusMorku" @.***> wrote:

You should find a definition `define FIXED_VERSION in RV12/rtl/verilog/core/riscv_id.sv. If you comment the definition out then it uses original buggy code and if you uncomment it then fixed code is used. The whole fix itself is contained in riscv_id.sv too, just follow the usage of this define within the file. Regards, Paulius

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android.

rherveille commented 2 years ago

Fixed in dev branch