chipsalliance / Cores-VeeR-EH1

VeeR EH1 core
Apache License 2.0

GHR refresh #107

Open Zissi-Lei opened 2 years ago

Zissi-Lei commented 2 years ago

Hi, in the file "ifu_bp_ctl.sv", there is GHR shift logic at line 1032:

    assign merged_ghr[`RV_BHT_GHR_RANGE] = ( ({`RV_BHT_GHR_SIZE{num_valids[3:0] >= 4'h4}} & {`RV_BHT_GHR_PAD,  final_h}) | // 000H
                                             ({`RV_BHT_GHR_SIZE{num_valids[3:0] == 4'h3}} & {`RV_BHT_GHR_PAD2, final_h}) | // P00H
    `ifdef RV_BHT_GHR_SIZE_2
                                             ({`RV_BHT_GHR_SIZE{num_valids[3:0] == 4'h2}} & {                            1'b0, final_h}) | // PP0H
    `else
                                             ({`RV_BHT_GHR_SIZE{num_valids[3:0] == 4'h2}} & {fghr[`RV_BHT_GHR_SIZE-3:0], 1'b0, final_h}) | // PP0H
    `endif
                                             ({`RV_BHT_GHR_SIZE{num_valids[3:0] == 4'h1}} & {fghr[`RV_BHT_GHR_SIZE-2:0], final_h}) | // PPPH
                                             ({`RV_BHT_GHR_SIZE{num_valids[3:0] == 4'h0}} & {fghr[`RV_BHT_GHR_RANGE]}) ); // PPPP

I see that when num_valids[3:0] ≤ 4'h2, you just shift the GHR left without retaining the MSBs. But when num_valids[3:0] ≥ 4'h3, you choose to retain the MSBs of the GHR rather than just shifting it left as before. Are there other considerations behind this policy? I'm very confused about this logic. Thanks for your time!

aprnath commented 2 years ago

From the designer:

Part 1: “I see that when num_valids[3:0] ≤ 4'h2, you just shift the GHR left without retaining the MSBs”

This is only true for small BHTs that don’t have more bits in the GHR. The code shows this in the conditional:

    assign merged_ghr[`RV_BHT_GHR_RANGE] = ( ({`RV_BHT_GHR_SIZE{num_valids[3:0] >= 4'h4}} & {`RV_BHT_GHR_PAD,  final_h}) | // 000H
                                             ({`RV_BHT_GHR_SIZE{num_valids[3:0] == 4'h3}} & {`RV_BHT_GHR_PAD2, final_h}) | // P00H
    `ifdef RV_BHT_GHR_SIZE_2
                                             ({`RV_BHT_GHR_SIZE{num_valids[3:0] == 4'h2}} & {                            1'b0, final_h}) | // PP0H
    `else
                                             ({`RV_BHT_GHR_SIZE{num_valids[3:0] == 4'h2}} & {fghr[`RV_BHT_GHR_SIZE-3:0], 1'b0, final_h}) | // PP0H
    `endif
                                             ({`RV_BHT_GHR_SIZE{num_valids[3:0] == 4'h1}} & {fghr[`RV_BHT_GHR_SIZE-2:0], final_h}) | // PPPH
                                             ({`RV_BHT_GHR_SIZE{num_valids[3:0] == 4'h0}} & {fghr[`RV_BHT_GHR_RANGE]}) ); // PPPP

Also, for num_valids < 2 (the PPPH and PPPP cases), the upper fghr bits are clearly retained.
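To make the cases easier to compare, here is a small behavioral model in Python rather than RTL. It is only a sketch under stated assumptions: the definitions of RV_BHT_GHR_PAD and RV_BHT_GHR_PAD2 are not shown in this thread, so the model assumes they keep the upper fghr bits in place and zero-fill the bits below them, which is what the discussion describes. The function name `merged_ghr` and the `retain_msbs` switch are hypothetical and exist only for illustration; `retain_msbs=False` models the "full shift" alternative.

```python
def merged_ghr(fghr: int, num_valids: int, final_h: int,
               size: int = 5, retain_msbs: bool = True) -> int:
    """Behavioral sketch of the merged GHR selection (not the RTL itself).

    fghr        -- current fetch GHR, `size` bits wide
    num_valids  -- number of valid branches in the fetch group
    final_h     -- direction bit of the last branch (the H in the comments)
    retain_msbs -- ASSUMPTION: True models PAD/PAD2 as "keep the upper fghr
                   bits in place"; False models a plain left shift.
    """
    mask = (1 << size) - 1
    fghr &= mask
    h = final_h & 1

    if num_valids == 0:
        return fghr                      # PPPP: history unchanged

    n = min(num_valids, 4)               # >= 4 is handled like 4

    if n <= 2 or not retain_msbs:
        # PPPH / PP0H (or full shift): shift left by n, zero-fill, append H
        return ((fghr << n) | h) & mask

    # n == 3 or n == 4 with retained MSBs: clear the low n bits, keep the
    # upper bits where they are, and place H in bit 0
    return (((fghr >> n) << n) | h) & mask


if __name__ == "__main__":
    history = 0b11010
    for n in range(5):
        kept = merged_ghr(history, n, 1)
        shifted = merged_ghr(history, n, 1, retain_msbs=False)
        print(f"num_valids={n}: retain={kept:05b} full_shift={shifted:05b}")
```

With a 5-bit history, the two policies agree for num_valids ≤ 2 and only diverge for 3 or more valid branches, which is the difference the question asks about.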

Part 2: “But when num_valids[3:0] ≥ 4'h3, you choose to retain the MSBs of GHR, rather than just to shift it left like before”

Retaining the upper bits performs better on our benchmarks; it comes down to the accuracy of the predictor when there are many valid branches in the fetch group. If you would prefer to do a full shift, you can modify the RV_BHT_GHR_PAD and RV_BHT_GHR_PAD2 defines.

(The likelihood of predicting 3 or more branches correctly is low (0.85^3 ≈ 0.61), so we preserve the upper bits. In practice it doesn’t really matter, since we copy the EXU true GHR when we mispredict.)
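The .85^3 estimate can be checked with a few lines of Python. This sketch assumes that 0.85 is a per-branch prediction accuracy and that branches in a fetch group are predicted independently; both are readings of the comment above, not figures confirmed elsewhere in the thread. The helper name `prob_all_correct` is hypothetical.

```python
def prob_all_correct(per_branch_accuracy: float, num_branches: int) -> float:
    """Probability that every one of num_branches predictions is correct,
    assuming independent predictions with the same accuracy."""
    return per_branch_accuracy ** num_branches


if __name__ == "__main__":
    for k in range(1, 5):
        print(f"{k} branch(es): {prob_all_correct(0.85, k):.3f}")
```

Under these assumptions the chance of getting three branches right is roughly 0.61 and drops further for four, which is why shifting in that many speculative bits is considered less valuable than keeping the older history.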

Hope this helps.