chipsalliance / Cores-VeeR-EL2

VeeR EL2 Core
https://chipsalliance.github.io/Cores-VeeR-EL2/html/
Apache License 2.0

Simulation hits max cycle count for dhry, cmark_iccm, cmark_dccm #162

Closed GeorgeWu1204 closed 7 months ago

GeorgeWu1204 commented 7 months ago

Hello, when I simulate the design with the existing benchmarks, the ones related to the ICCM and DCCM consistently fail. In these cases console.log is empty and the simulation reports that it hit the maximum cycle count. Could you provide insight into the potential causes of this issue? Other benchmarks, such as "hello_world" and "cmark", pass the simulation. Thank you so much. [image]

algrobman commented 7 months ago

Check whether the CPU is doing something useful or is stuck in an exception: look at the instruction execution trace in exec*.log.
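One way to act on this advice is to look for the point where the trace degenerates into all-zero opcodes. The sketch below is only an illustration: it assumes the opcodes have already been extracted from exec*.log into a list of integers (the log layout is not described in this thread, so parsing is left out), and `first_zero_run` and `min_run` are hypothetical names.

```python
from typing import Optional, Sequence


def first_zero_run(opcodes: Sequence[int], min_run: int = 4) -> Optional[int]:
    """Return the index where a run of at least `min_run` zero opcodes begins.

    `opcodes` is assumed to be the opcode column of the trace, already parsed
    into integers. Returns None if no such run exists.
    """
    run_start = None
    for i, op in enumerate(opcodes):
        if op == 0:
            if run_start is None:
                run_start = i  # a new run of zeros starts here
            if i - run_start + 1 >= min_run:
                return run_start
        else:
            run_start = None  # a non-zero opcode breaks the run
    return None
```

The instructions just before the returned index are usually the interesting ones, since they show how the program counter ended up in empty memory.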

GeorgeWu1204 commented 7 months ago

Thanks for the reply. I checked exec.log and noticed that all the instructions after #124 are zeros. Could you please give me some suggestions? Many thanks. [image]

algrobman commented 7 months ago

Your trace shows that the CPU loads 0 into ra (the return address) from the stack at instruction #122, instead of the 0x800006dc written by instruction #109. It then returns from the function to address 0 (instruction #124), where there is no code, only zeros, which the CPU treats as an unimplemented opcode. So the CPU takes an exception and jumps to the address set in the mtvec CSR, which is also 0 because the program does not set mtvec. The CPU is therefore stuck at address 0, taking exceptions forever.
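The loop described above can be illustrated with a deliberately tiny model. This is not how the core is implemented: `run_toy_core` is a hypothetical helper in which any non-zero word counts as a legal 4-byte instruction, and a zero or missing word traps to `mtvec`.

```python
from typing import Dict, List


def run_toy_core(memory: Dict[int, int], start_pc: int,
                 mtvec: int = 0, max_steps: int = 8) -> List[int]:
    """Return the sequence of PCs visited by a toy fetch loop.

    A non-zero word at the PC is treated as a legal instruction and the PC
    advances by 4. A zero (or absent) word is treated as an illegal opcode
    and the PC is redirected to `mtvec`, mimicking an exception.
    """
    pc = start_pc
    visited = []
    for _ in range(max_steps):
        visited.append(pc)
        if memory.get(pc, 0) != 0:
            pc += 4          # legal instruction: fall through
        else:
            pc = mtvec       # illegal opcode: trap to the handler address
    return visited
```

With empty memory at address 0 and `mtvec` left at 0, every step traps back to 0, so the model never makes progress, which matches the behaviour seen in the trace.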

BTW what are the start/end addresses of the DCCM?

GeorgeWu1204 commented 7 months ago

Thank you so much for mentioning that. The DCCM start and end addresses are f0040000 and f0043e30.

Does that mean there is a problem with the benchmark compilation? The command I ran is make -f $RV_ROOT/tools/Makefile verilator TEST=dhry

The Verilator version is v5.020 and the riscv64-unknown-elf-gcc version is 10.2.0.

algrobman commented 7 months ago

You need to make sure that your program's data, stack, and other data sections fit within the physical size of the DCCM. This is not related to the Verilator version but to the way you build your design (how much memory you select).

GeorgeWu1204 commented 7 months ago

Thanks for the reply. The dccm_size is set to 512KB, and the rest of the DCCM settings are as shown: [image]

The DCCM pre-load region is from f0040000 to f0043e30, which I believe is the right region. The dhry.map is: [image]

Unfortunately, the problem has not been resolved yet. How can I verify whether the data has been correctly loaded at f0040000? Or is there another way to debug this? Thank you so much for your help; it is greatly appreciated.

algrobman commented 7 months ago

Your screenshot shows a 64 KB DCCM, which should be sufficient for Dhrystone. It looks like the CPU is built with external DCCM/ICCM RAMs instantiated in tb_top, so run with waves and see whether, and why, you get a zero value from address 0xf0043cfc when instruction #122 executes.

BTW, did you run the test out of the box without any modifications as README suggests?
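As a rough sanity check of the 64 KB point above, the addresses quoted in this thread can be tested against the DCCM window. This is a sketch under one assumption: the DCCM base is taken to be 0xf0040000, the start of the pre-load region reported earlier. `dccm_bounds` and `in_dccm` are hypothetical helper names.

```python
from typing import Tuple

DCCM_BASE = 0xF0040000   # assumption: start of the pre-load region from the thread
DCCM_SIZE_KB = 64        # size read from the configuration screenshot


def dccm_bounds(base: int, size_kb: int) -> Tuple[int, int]:
    """Return the first and last byte address of the DCCM window."""
    return base, base + size_kb * 1024 - 1


def in_dccm(addr: int, base: int = DCCM_BASE, size_kb: int = DCCM_SIZE_KB) -> bool:
    """True if `addr` falls inside the DCCM window."""
    lo, hi = dccm_bounds(base, size_kb)
    return lo <= addr <= hi


if __name__ == "__main__":
    # Stack slot read by instruction #122 and end of the pre-load region.
    for addr in (0xF0043CFC, 0xF0043E30):
        print(hex(addr), in_dccm(addr))
```

Under that assumption both addresses lie inside the window, which is consistent with the suggestion that memory size is not the problem and that the RAM connection should be inspected in the waves.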

GeorgeWu1204 commented 7 months ago

Yes, I did not modify anything; I simply ran the command make -f $RV_ROOT/tools/Makefile TEST=dhry. I will look at the waves. Thanks for the suggestions.

GeorgeWu1204 commented 7 months ago

Sorry to interrupt again, but while continuing to track down the error I noticed that the generated program.hex might be incomplete, as shown in the screenshot. [image] This could explain why memory is zero at the addresses after instruction #124. Could this issue be related to a problem with picolibc?
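One way to check whether a hex image is complete is to list the address ranges it covers and compare them with the sections in the .map file. The sketch below assumes program.hex uses the byte-oriented Verilog hex layout (the kind `objcopy -O verilog` emits: `@ADDRESS` markers followed by space-separated byte values); if the file uses a different word size the arithmetic would need adjusting. `hex_segments` is a hypothetical helper.

```python
from typing import List, Tuple


def hex_segments(text: str) -> List[Tuple[int, int]]:
    """Return (start, end_exclusive) byte ranges covered by Verilog hex text.

    Assumes each data token is one byte and each `@` token sets the address
    of the next byte. Adjacent segments are reported separately.
    """
    segments = []
    start = None   # first address of the current segment
    addr = None    # address of the next byte to be placed
    for token in text.split():
        if token.startswith("@"):
            if start is not None and addr > start:
                segments.append((start, addr))
            addr = start = int(token[1:], 16)
        else:
            if addr is None:
                addr = start = 0  # data before any marker starts at address 0
            int(token, 16)        # raises ValueError on a malformed byte token
            addr += 1
    if start is not None and addr > start:
        segments.append((start, addr))
    return segments


if __name__ == "__main__":
    sample = "@80000000\nB7 02 00 80\n@F0040000\n01 02\n"
    for lo, hi in hex_segments(sample):
        print(f"{lo:08x}..{hi - 1:08x} ({hi - lo} bytes)")
```

A segment that ends well short of the section size listed in the .map file would point to a truncated image; segments that match would point back to the testbench or RAM wiring.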

algrobman commented 7 months ago

Hi, I think they misconnected the DCCM/ICCM RAMs when they moved them from the design to the testbench. Update testbench/tb_top.sv:

Leave only these defines:

endtask

`define DRAM(bk) Gen_dccm_enable.dccm_loop[bk].ram.ram_core
`define IRAM(bk) Gen_iccm_enable.iccm_loop[bk].iccm_bank.ram_core

task slam_dccm_ram(input [31:0] addr, input[38:0] data);

and replace the original code with the following:

//////////////////////////////////////////////////////
// DCCM
//
if (pt.DCCM_ENABLE == 1) begin: Gen_dccm_enable
    `define EL2_LOCAL_DCCM_RAM_TEST_PORTS   .TEST1   (1'b0   ), \
                                            .RME     (1'b0   ), \
                                            .RM      (4'b0000), \
                                            .LS      (1'b0   ), \
                                            .DS      (1'b0   ), \
                                            .SD      (1'b0   ), \
                                            .TEST_RNM(1'b0   ), \
                                            .BC1     (1'b0   ), \
                                            .BC2     (1'b0   ), \

    localparam DCCM_INDEX_DEPTH = ((pt.DCCM_SIZE)*1024)/((pt.DCCM_BYTE_WIDTH)*(pt.DCCM_NUM_BANKS));  // Depth of memory bank
    // 8 Banks, 16KB each (2048 x 72)
    for (genvar i=0; i<pt.DCCM_NUM_BANKS; i++) begin: dccm_loop

            el2_ram #(DCCM_INDEX_DEPTH,39)  ram (
                                    // Primary ports
                                    .ME(el2_mem_export.dccm_clken[i]),
                                    .CLK(el2_mem_export.clk),
                                    .WE(el2_mem_export.dccm_wren_bank[i]),
                                    .ADR(el2_mem_export.dccm_addr_bank[i]),
                                    .D({el2_mem_export.dccm_wr_ecc_bank[i],el2_mem_export.dccm_wr_data_bank[i]} ),
                                    .Q({el2_mem_export.dccm_bank_ecc[i], el2_mem_export.dccm_bank_dout[i]}),
                                    .ROP ( ),
                                    // These are used by SoC
                                    `EL2_LOCAL_DCCM_RAM_TEST_PORTS
                                    .*
                                    );
    end : dccm_loop
end :Gen_dccm_enable

//////////////////////////////////////////////////////
// ICCM
//
if (pt.ICCM_ENABLE) begin : Gen_iccm_enable
for (genvar i=0; i<pt.ICCM_NUM_BANKS; i++) begin: iccm_loop
    el2_ram #(.depth(1<<pt.ICCM_INDEX_BITS), .width(39)) iccm_bank (
                                     // Primary ports
                                     .ME(el2_mem_export.iccm_clken[i]),
                                     .CLK(el2_mem_export.clk),
                                     .WE(el2_mem_export.iccm_wren_bank[i]),
                                     .ADR(el2_mem_export.iccm_addr_bank[i]),
                                     .D({el2_mem_export.iccm_bank_wr_ecc[i],el2_mem_export.iccm_bank_wr_data[i]}),
                                     .Q({el2_mem_export.iccm_bank_ecc[i], el2_mem_export.iccm_bank_dout[i]}),
                                     .ROP ( ),
                                     // These are used by SoC
                                     .TEST1    (1'b0   ),
                                     .RME      (1'b0   ),
                                     .RM       (4'b0000),
                                     .LS       (1'b0   ),
                                     .DS       (1'b0   ),
                                     .SD       (1'b0   ) ,
                                     .TEST_RNM (1'b0   ),
                                     .BC1      (1'b0   ),
                                     .BC2      (1'b0   )

                                      );

end : iccm_loop
end : Gen_iccm_enable
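
The DCCM_INDEX_DEPTH expression in the snippet above is plain integer arithmetic, and it can help to evaluate it for a given configuration before reading the waves. The sketch below mirrors that formula; the byte width and bank counts used in the example are assumptions for illustration, not values confirmed in this thread, and `dccm_index_depth` is a hypothetical name.

```python
def dccm_index_depth(size_kb: int, byte_width: int, num_banks: int) -> int:
    """Rows per DCCM bank: (size_kb * 1024) / (byte_width * num_banks)."""
    total_bytes = size_kb * 1024
    bytes_per_row = byte_width * num_banks  # one row across all banks
    if total_bytes % bytes_per_row != 0:
        raise ValueError("DCCM size is not a multiple of byte_width * num_banks")
    return total_bytes // bytes_per_row


if __name__ == "__main__":
    # Assumed example: 64 KB DCCM, 4-byte words, 4 banks.
    print(dccm_index_depth(64, 4, 4))
```

Each bank instantiated by the loop then holds that many 39-bit entries, matching the width parameter passed to el2_ram in the snippet.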

GeorgeWu1204 commented 7 months ago

Solved! Thank you so much! I really appreciate your help : )