Open Yoga57894 opened 3 months ago
That's interesting.
Can you add the ExecAll flag to --debug-flags to see which instruction caused that?
Hi hnpl,
I added ExecAll, and it shows that the instruction at 0xdee is c_bnez.
Here are the details:
12205748: system.cpu: A0 T0 : 0xde0 @matrix_test+56 : lh a5, 0(a4) : MemRead : D=0x0000000000000092 A=0x1032ac FetchSeq=111352 CPSeq=49830 flags=(IsInteger|IsLoad)
12205748: system.cpu: A0 T0 : 0xde4 @matrix_test+60 : c_add a5, s2 : IntAlu : D=0x00000000000000a3 FetchSeq=111353 CPSeq=49831 flags=(IsInteger)
12205748: system.cpu: A0 T0 : 0xde6 @matrix_test+62 : sh a5, 0(a4) : MemWrite : D=0x00000000000000a3 A=0x1032ac FetchSeq=111354 CPSeq=49832 flags=(IsInteger|IsStore)
12205748: system.cpu: A0 T0 : 0xdea @matrix_test+66 : c_addi a3, -1 : IntAlu : D=0x0000000000000000 FetchSeq=111355 CPSeq=49833 flags=(IsInteger)
12205748: system.cpu: A0 T0 : 0xdec @matrix_test+68 : c_addiw a2, 1 : IntAlu : D=0x0000000000000009 FetchSeq=111356 CPSeq=49834 flags=(IsInteger)
**12205748: system.cpu: A0 T0 : 0xdee @matrix_test+70 : c_bnez a3, -18 : IntAlu : FetchSeq=111357 CPSeq=49835 flags=(IsInteger|IsControl|IsDirectControl|IsCondControl)**
12210756: system.cpu: A0 T0 : 0xdf0 @matrix_test+72 : c_addiw a1, 1 : IntAlu : D=0x0000000000000001 FetchSeq=111456 CPSeq=49836 flags=(IsInteger)
12210756: system.cpu: A0 T0 : 0xdf2 @matrix_test+74 : c_add a0, s4 : IntAlu : D=0x0000000000000009 FetchSeq=111457 CPSeq=49837 flags=(IsInteger)
12210756: system.cpu: A0 T0 : 0xdf4 @matrix_test+76 : bne a1, s4, -28 : IntAlu : FetchSeq=111458 CPSeq=49838 flags=(IsInteger|IsControl|IsDirectControl|IsCondControl)
12214512: system.cpu: A0 T0 : 0xdd8 @matrix_test+48 : c_mv a2, a0 : IntAlu : D=0x0000000000000009 FetchSeq=111463 CPSeq=49839 flags=(IsInteger)
12214512: system.cpu: A0 T0 : 0xdda @matrix_test+50 : c_mv a3, s5 : IntAlu : D=0x0000000000000009 FetchSeq=111464 CPSeq=49840 flags=(IsInteger)
Describe the bug
I noticed that when running RISC-V CoreMark on gem5 24.0.0 with an O3CPU, the BTB hit rate decreases. I think this happens because when a compressed branch instruction (e.g., c_bnez) goes through the execute() call at iew.cc:1229, pcstate._compressed changes. As a result, the BTB assumes the instruction size is 4 instead of 2, so the check npc() != pc() + size() evaluates to true and the branch is considered actually taken, which changes the BTB behavior.
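To make the suspected mechanism concrete, here is a minimal toy model of the taken check described above. It is not gem5 code: `ToyPCState` and its fields are hypothetical names for illustration, and the only facts taken from the trace are the addresses 0xdee and 0xdf0 and that c_bnez is a 2-byte compressed instruction.

```python
# Toy model only; the class and field names are hypothetical, not gem5's.
from dataclasses import dataclass


@dataclass
class ToyPCState:
    pc: int            # address of the branch instruction
    npc: int           # address of the next instruction actually executed
    compressed: bool   # True for a 2-byte RVC instruction

    def size(self) -> int:
        # Compressed RISC-V instructions are 2 bytes, others are 4.
        return 2 if self.compressed else 4

    def branching(self) -> bool:
        # Treated as taken when npc is not the fall-through address.
        return self.npc != self.pc + self.size()


# c_bnez at 0xdee falls through to 0xdf0, i.e. it is not taken.
correct = ToyPCState(pc=0xDEE, npc=0xDF0, compressed=True)
# Same instruction, assuming the compressed flag has been lost.
stale = ToyPCState(pc=0xDEE, npc=0xDF0, compressed=False)

print(correct.branching())  # False: 0xdee + 2 == 0xdf0
print(stale.branching())    # True:  0xdee + 4 != 0xdf0
```

If the flag really is cleared as suspected, the second case would match the trace: a fall-through branch is classified as taken and can trigger a BTB update.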
Affects version
V24.0.0.0
gem5 Modifications
No Modification
To Reproduce
Terminal Output
(At tick 12205748 the branch is predicted taken but should actually be not taken, so the misprediction causes a squash. At tick 12210443 the log reports predict taken and a BTB update, even though 0xdee -> target 0xdf0 is the not-taken path and a not-taken branch should not update the BTB.)