gem5 / gem5

The official repository for the gem5 computer-system architecture simulator.
http://www.gem5.org
BSD 3-Clause "New" or "Revised" License

RISC-V compressed branch instruction size error causes wrong BTB update #1405

Open Yoga57894 opened 3 months ago

Yoga57894 commented 3 months ago

Describe the bug

When running RISC-V CoreMark on gem5 24.0.0 with the O3CPU, I noticed that the BTB hit rate decreases. I think this happens because pcstate._compressed changes when a compressed branch instruction (e.g., c_bnez) goes through the execute() call at iew.cc:1229. As a result, the instruction size is assumed to be 4 instead of 2, so the taken check npc() != pc() + size() evaluates to true even when the branch falls through. The branch is then considered actually taken, and the BTB is updated when it should not be.
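To make the size dependence concrete, here is a minimal Python sketch of the taken check described above. It is a toy model, not gem5 source code: the function name `looks_taken` and the constants are hypothetical, and the addresses are taken from the log in this report.

```python
# Toy model of the "actually taken" check; this is NOT gem5 source code.
# All names here are hypothetical and chosen only for illustration.

def looks_taken(pc: int, npc: int, size: int) -> bool:
    """Treat a branch as taken when the next PC is not the fall-through PC."""
    return npc != pc + size

PC = 0xdee            # address of the c_bnez seen in the log
FALL_THROUGH = 0xdf0  # next PC when the 2-byte branch is not taken

# With the correct compressed size (2), the branch is seen as not taken.
correct = looks_taken(PC, FALL_THROUGH, size=2)

# With the size wrongly assumed to be 4, the same branch looks taken.
wrong = looks_taken(PC, FALL_THROUGH, size=4)

print(correct, wrong)
```

Under these assumptions, the same not-taken branch flips from not taken to taken purely because of the size value used in the comparison.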

Affects version

V24.0.0.0

gem5 Modifications

No modifications

To Reproduce

scons build/RISCV/gem5.opt

build/RISCV/gem5.opt --debug-flags=Decode,Branch,Tage configs/deprecated/example/se.py --cpu-type=O3CPU --cmd=coremark --bp-type=TAGE --caches --mem-type=SimpleMemory 

Terminal Output

12205748: system.cpu.branchPred: Lookup branch: dee; predict:1
12205748: system.cpu.branchPred: [tid:0, sn:111357] Branch predictor predicted 1 for PC:0xdee DirectCond
12205748: system.cpu.branchPred: [tid:0, sn:111357] PC:0xdee BTB:hit
12205748: system.cpu.branchPred: predict(tid:0, sn:111357, PC:0xdee, DirectCond) -> taken:1, target:(0xddc=>0xde0).(0=>1)  provider:BTB 
12205748: system.cpu.branchPred.tage: Updating global histories with branch:dee; taken?:1, path Hist: fff
...
12210443: system.cpu.branchPred: [tid:0] Squash from commit start from sequence number 111357, setting target to (0xdf0=>0xdf4).(0=>1)
12210443: system.cpu.branchPred: [tid:0] [squash sn:111455] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111455, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 18
12210443: system.cpu.branchPred: [tid:0] [squash sn:111448] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111448, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 17
12210443: system.cpu.branchPred: [tid:0] [squash sn:111441] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111441, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 16
12210443: system.cpu.branchPred: [tid:0] [squash sn:111434] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111434, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 15
12210443: system.cpu.branchPred: [tid:0] [squash sn:111427] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111427, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 14
12210443: system.cpu.branchPred: [tid:0] [squash sn:111420] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111420, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 13
12210443: system.cpu.branchPred: [tid:0] [squash sn:111413] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111413, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 12
12210443: system.cpu.branchPred: [tid:0] [squash sn:111406] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111406, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 11
12210443: system.cpu.branchPred: [tid:0] [squash sn:111399] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111399, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 10
12210443: system.cpu.branchPred: [tid:0] [squash sn:111392] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111392, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 9
12210443: system.cpu.branchPred: [tid:0] [squash sn:111385] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111385, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 8
12210443: system.cpu.branchPred: [tid:0] [squash sn:111378] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111378, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 7
12210443: system.cpu.branchPred: [tid:0] [squash sn:111371] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111371, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 6
12210443: system.cpu.branchPred: [tid:0] [squash sn:111364] Incorrect: DirectCond
12210443: system.cpu.branchPred: Deleting branch info: dee
12210443: system.cpu.branchPred: [tid:0, squash sn:111357] Removing history for sn:111364, PC:0xdee
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] pred_hist.size(): 5
12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] Mispredicted: DirectCond, PC:0xdee
12210443: system.cpu.branchPred.tage: Restoring branch info: dee; taken? 1; PathHistory:ffff, pointer:2085169
12210443: system.cpu.branchPred: [tid:0] BTB Update called for [sn:111357] PC 0xdee -> T: 0xdf0

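(At tick 12205748 the branch is predicted taken, but it should actually be not taken, so the misprediction causes a squash. At tick 12210443 the log still reports the branch as taken and calls a BTB update with 0xdee -> 0xdf0. The target 0xdf0 is the fall-through address, i.e., the not-taken path, and a not-taken branch should not update the BTB.)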
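One way to look for other occurrences in a long trace is to scan the Branch debug output for BTB updates whose target is just past the branch. The sketch below is a hypothetical helper, not part of gem5. It assumes the "BTB Update called for ..." line format shown in the output above, and it only flags candidates, since a genuinely taken forward branch can also land 2 or 4 bytes ahead.

```python
import re

# Matches lines such as:
#   "... BTB Update called for [sn:111357] PC 0xdee -> T: 0xdf0"
# The format is assumed from the debug output in this report.
BTB_UPDATE = re.compile(
    r"BTB Update called for \[sn:(\d+)\] PC (0x[0-9a-f]+) -> T: (0x[0-9a-f]+)"
)

def suspicious_btb_updates(lines):
    """Yield (sn, pc, target) for BTB updates whose target is PC+2 or PC+4."""
    for line in lines:
        m = BTB_UPDATE.search(line)
        if m is None:
            continue
        sn = int(m.group(1))
        pc = int(m.group(2), 16)
        target = int(m.group(3), 16)
        # A target right after the branch looks like a fall-through address.
        if target - pc in (2, 4):
            yield sn, pc, target

# Two lines copied from the terminal output above.
sample = [
    "12210443: system.cpu.branchPred: [tid:0] [squash sn:111357] Mispredicted: DirectCond, PC:0xdee",
    "12210443: system.cpu.branchPred: [tid:0] BTB Update called for [sn:111357] PC 0xdee -> T: 0xdf0",
]
hits = list(suspicious_btb_updates(sample))
print([(sn, hex(pc), hex(target)) for sn, pc, target in hits])
```

To run it on a full trace, pass an open file object instead of `sample`, since the function accepts any iterable of lines.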

hnpl commented 3 months ago

That's interesting.

Can you add the ExecAll flag to --debug-flags to see which instruction caused that?

Yoga57894 commented 3 months ago

Hi hnpl,
I added ExecAll, and it shows that the instruction at 0xdee is c_bnez. Here are the details:

12205748: system.cpu: A0 T0 : 0xde0 @matrix_test+56    : lh a5, 0(a4)               : MemRead :  D=0x0000000000000092 A=0x1032ac  FetchSeq=111352  CPSeq=49830  flags=(IsInteger|IsLoad)
12205748: system.cpu: A0 T0 : 0xde4 @matrix_test+60    : c_add a5, s2               : IntAlu :  D=0x00000000000000a3  FetchSeq=111353  CPSeq=49831  flags=(IsInteger)
12205748: system.cpu: A0 T0 : 0xde6 @matrix_test+62    : sh a5, 0(a4)               : MemWrite :  D=0x00000000000000a3 A=0x1032ac  FetchSeq=111354  CPSeq=49832  flags=(IsInteger|IsStore)
12205748: system.cpu: A0 T0 : 0xdea @matrix_test+66    : c_addi a3, -1              : IntAlu :  D=0x0000000000000000  FetchSeq=111355  CPSeq=49833  flags=(IsInteger)
12205748: system.cpu: A0 T0 : 0xdec @matrix_test+68    : c_addiw a2, 1              : IntAlu :  D=0x0000000000000009  FetchSeq=111356  CPSeq=49834  flags=(IsInteger)

**12205748: system.cpu: A0 T0 : 0xdee @matrix_test+70    : c_bnez a3, -18             : IntAlu :   FetchSeq=111357  CPSeq=49835  flags=(IsInteger|IsControl|IsDirectControl|IsCondControl)**

12210756: system.cpu: A0 T0 : 0xdf0 @matrix_test+72    : c_addiw a1, 1              : IntAlu :  D=0x0000000000000001  FetchSeq=111456  CPSeq=49836  flags=(IsInteger)
12210756: system.cpu: A0 T0 : 0xdf2 @matrix_test+74    : c_add a0, s4               : IntAlu :  D=0x0000000000000009  FetchSeq=111457  CPSeq=49837  flags=(IsInteger)
12210756: system.cpu: A0 T0 : 0xdf4 @matrix_test+76    : bne a1, s4, -28            : IntAlu :   FetchSeq=111458  CPSeq=49838  flags=(IsInteger|IsControl|IsDirectControl|IsCondControl)
12214512: system.cpu: A0 T0 : 0xdd8 @matrix_test+48    : c_mv a2, a0                : IntAlu :  D=0x0000000000000009  FetchSeq=111463  CPSeq=49839  flags=(IsInteger)
12214512: system.cpu: A0 T0 : 0xdda @matrix_test+50    : c_mv a3, s5                : IntAlu :  D=0x0000000000000009  FetchSeq=111464  CPSeq=49840  flags=(IsInteger)
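As a cross-check between the trace above and the branch predictor log, the two possible next PCs of the c_bnez at 0xdee can be computed by hand. The sketch below assumes that the offset printed in the trace (-18) is in bytes and relative to the branch's own PC, and that the instruction is 2 bytes long because it is compressed. The helper name `branch_addresses` is hypothetical.

```python
def branch_addresses(pc: int, offset: int, size: int):
    """Return (taken_target, fall_through) for a PC-relative branch."""
    return pc + offset, pc + size

# c_bnez a3, -18 at 0xdee, assumed to be a 2-byte compressed instruction.
taken_target, fall_through = branch_addresses(0xdee, -18, 2)

print(hex(taken_target), hex(fall_through))
```

Under these assumptions the taken target is 0xddc, which matches the predicted target in the earlier log, and the fall-through is 0xdf0, which is the address passed to the BTB update.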