openhwgroup / cva6

The CORE-V CVA6 is an Application class 6-stage RISC-V CPU capable of booting Linux
https://docs.openhwgroup.org/projects/cva6-user-manual/

Core does not boot and debugger does not work when adding an AXI master on the bus #1941

Closed Juan-Gg closed 5 months ago

Juan-Gg commented 5 months ago

I am trying to connect a custom accelerator to the cva6 AXI bus. This accelerator will have an AXI slave port for configuration and an AXI master port for reading and writing data directly from main memory. As a first step, I am attaching a simple AXI master to the bus that just writes to a fixed memory address, so I can later check that address’ contents with gdb. The accelerator code is as follows:

AXI master code:

``` verilog
// A test to integrate an AXI master in the CVA6 APU
// See axi_mem_if/src/axi2mem.sv for example use of AXI_BUS interface (as a slave that is)
module axi_master_test #(
    parameter int unsigned AXI_ID_WIDTH,
    parameter int unsigned AXI_ADDR_WIDTH,
    parameter int unsigned AXI_DATA_WIDTH,
    parameter int unsigned AXI_USER_WIDTH,
    parameter logic [31:0] ADDRESS,
    parameter logic [31:0] DATA
) (
    input logic clk_i,
    input logic rst_ni,
    AXI_BUS.Master axi_master_port
);

  // Default values. See "Table A10-1 Master interface write channel signals and default signal values"
  //
  // AXI write address channel
  assign axi_master_port.aw_id     = '0;
  // assign axi_master_port.aw_addr;
  assign axi_master_port.aw_len    = '0;     // Number of beats in burst - 1
  assign axi_master_port.aw_size   = 3'b010; // Number of bytes per beat, 3'b011 for 8 bytes -- Let 4 bytes for 32b access
  assign axi_master_port.aw_burst  = 2'b00;  // Burst type FIXED (2'b00)
  assign axi_master_port.aw_lock   = '0;
  assign axi_master_port.aw_cache  = '0;
  assign axi_master_port.aw_prot   = 3'b0;   // Unprivileged access
  assign axi_master_port.aw_qos    = '0;
  assign axi_master_port.aw_region = '0;
  assign axi_master_port.aw_atop   = '0;     // Configures atomic operations (AXI 5)
  assign axi_master_port.aw_user   = '0;
  // assign axi_master_port.aw_valid;
  // assign axi_master_port.aw_ready; // Input
  //
  // AXI write data channel
  // assign axi_master_port.w_data;
  assign axi_master_port.w_strb = '1;   // Strobe, byte enable
  assign axi_master_port.w_last = 1'b1; // Single beat
  assign axi_master_port.w_user = '0;
  // assign axi_master_port.w_valid;
  // assign axi_master_port.w_ready; // Input
  //
  // AXI write response channel
  // assign axi_master_port.b_id;    // Input
  // assign axi_master_port.b_resp;  // Input
  // assign axi_master_port.b_user;  // Input
  // assign axi_master_port.b_valid; // Input
  assign axi_master_port.b_ready = 1'b1; // No error checking
  //
  // AXI read address channel
  assign axi_master_port.ar_id     = '0;
  // assign axi_master_port.ar_addr;
  assign axi_master_port.ar_len    = '0;     // Number of beats in burst - 1
  assign axi_master_port.ar_size   = 3'b010; // Number of bytes per beat, let 4 bytes for 32b access
  assign axi_master_port.ar_burst  = 2'b00;  // Burst type FIXED
  assign axi_master_port.ar_lock   = '0;
  assign axi_master_port.ar_cache  = '0;
  assign axi_master_port.ar_prot   = 3'b0;   // Unprivileged access
  assign axi_master_port.ar_qos    = '0;
  assign axi_master_port.ar_region = '0;
  assign axi_master_port.ar_user   = '0;
  // assign axi_master_port.ar_valid;
  // assign axi_master_port.ar_ready; // Input
  //
  // AXI read data channel
  // assign axi_master_port.r_id;    // Input
  // assign axi_master_port.r_data;  // Input
  // assign axi_master_port.r_resp;  // Input
  // assign axi_master_port.r_last;  // Input
  // assign axi_master_port.r_user;  // Input
  // assign axi_master_port.r_valid; // Input
  // assign axi_master_port.r_ready;

  // -----------------------------------

  assign axi_master_port.w_data  = DATA;
  assign axi_master_port.aw_addr = ADDRESS;

  logic [9:0] timer_aw_q, timer_aw_d;
  logic [9:0] timer_w_q, timer_w_d;

  always_ff @(posedge clk_i) begin
    if (!rst_ni) begin
      timer_w_q  <= 2;
      timer_aw_q <= 2;
    end else begin
      timer_w_q  <= timer_w_d;
      timer_aw_q <= timer_aw_d;
    end
  end

  always_comb begin
    // Defaults
    axi_master_port.aw_valid = 1'b0;
    timer_w_d = timer_w_q + 1;
    axi_master_port.w_valid = 1'b0;
    timer_aw_d = timer_aw_q + 1;

    // Write address
    if (timer_aw_q == 0) begin // Wait for transaction
      axi_master_port.aw_valid = 1'b1;
      if (!axi_master_port.aw_ready) // Wait for ready
        timer_aw_d = timer_aw_q;
    end

    // Write data
    if (timer_w_q == 0) begin // Wait for transaction
      axi_master_port.w_valid = 1'b1;
      if (!axi_master_port.w_ready) // Wait for ready
        timer_w_d = timer_w_q;
    end
  end

endmodule
```

I simulated this using an AXI crossbar with the same configuration and memory map as in cva6, using an axi2mem module and a simulated RAM as a stand-in for DRAM. I had to simulate with --no-timing in verilator, otherwise I got a nice segmentation fault. I’m instantiating the accelerator in ariane_xilinx.sv as follows:

axi_master_test #(
  .AXI_ID_WIDTH   ( AxiIdWidthSlaves ),
  .AXI_ADDR_WIDTH ( AxiAddrWidth    ),
  .AXI_DATA_WIDTH ( AxiDataWidth    ),
  .AXI_USER_WIDTH ( AxiUserWidth    ),
  .ADDRESS('h9000_0000),
  .DATA('hABCD)
) axi_master_test_i (
  .clk_i  (clk  ),
  .rst_ni (ndmreset_n ),
  .axi_master_port (slave[2]) // Slave port in xbar
);

The only other changes I made were:

After the aforementioned changes, the core no longer boots (i.e., there is no UART activity, whereas before it printed “Hello World! init SPI …”). Running OpenOCD results in the following errors:

Info : clock speed 1000 kHz
Info : JTAG tap: riscv.cpu tap/device found: 0x43651093 (mfg: 0x049 (Xilinx), part: 0x3651, ver: 0x4)
Info : datacount=2 progbufsize=8
Error: unable to halt hart 0
Error:   dmcontrol=0x80000001
Error:   dmstatus =0x00000c82
Error: Fatal: Hart 0 failed to halt during examine()
Warn : target riscv.cpu examination failed
Info : starting gdb server for riscv.cpu on 3333
Info : Listening on port 3333 for gdb connections
init routine started
Error: Target not examined yet
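
The `dmstatus` value in the log above can be unpacked to see what the debug module is reporting. The sketch below assumes the register layout of the RISC-V External Debug Support specification, version 0.13 (version in bits 3:0, `authenticated` in bit 7, `allhalted` in bit 9, `allrunning` in bit 11, `allunavail` in bit 13). The helper name `dmstatus_decode` is made up for illustration.

```shell
# Decode a few dmstatus fields (bit positions assumed from RISC-V Debug Spec 0.13).
# dmstatus_decode is a hypothetical helper name, not part of OpenOCD.
dmstatus_decode() {
  v=$(( $1 ))
  echo "version=$(( v & 0xf ))"
  echo "authenticated=$(( (v >> 7) & 1 ))"
  echo "allhalted=$(( (v >> 9) & 1 ))"
  echo "allrunning=$(( (v >> 11) & 1 ))"
  echo "allunavail=$(( (v >> 13) & 1 ))"
}

# Value taken from the OpenOCD log above.
dmstatus_decode 0x00000c82
```

Under that layout, `0x00000c82` reads as a 0.13 debug module with the hart still reported as running and not halted. That is consistent with a hart that never acts on the halt request, for example because it is stalled on an outstanding bus transaction.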

Note: I am working with a kc705 board, and debugging as I describe in #1803, using JTAG through Xilinx’s BSCANE2. This should not have any influence on the issue at hand. I would appreciate any ideas as to what I may be doing wrong. Maybe I am preventing the CPU from using the bus somehow? I tried a couple of things, such as using ndmreset_n instead of rst_n as a reset signal (the former seems to be generated from the latter, some modules use one, some use the other…). Synthesis takes over an hour on my computer.

Note2: I just noticed that the CPU hangs when my AXI master reads from RAM; reading from other addresses seems to work fine.

Thanks.

jquevremont commented 5 months ago

I have not identified anyone in the team who can help in the short term. I suggest leaving the issue open for one week in case anyone has ideas.

Juan-Gg commented 5 months ago

It seems that I need to "re-generate the MIG to accommodate for the wider id filed" as mentioned in #568. The warning about this that they intended to add to the source code was never written.
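
A rough sketch of why the ID field gets wider when a master is added. It assumes the crossbar extends each transaction ID by `clog2(number of slave ports)` bits and that the master-side ID width is 4. Neither value is stated in this thread, so check both against your own `ariane_xilinx.sv`. The function names are illustrative only.

```shell
# clog2 N: smallest k such that 2^k >= N (same result as SystemVerilog $clog2 for N >= 1).
clog2() {
  n=$1; k=0; p=1
  while [ "$p" -lt "$n" ]; do
    p=$(( p * 2 ))
    k=$(( k + 1 ))
  done
  echo "$k"
}

# axi_id_width MASTER_ID_WIDTH NUM_SLAVE_PORTS
# Assumed rule: downstream ID width = master ID width + clog2(number of xbar slave ports).
axi_id_width() {
  echo $(( $1 + $(clog2 "$2") ))
}

echo "2 slave ports: $(axi_id_width 4 2) ID bits"
echo "3 slave ports: $(axi_id_width 4 3) ID bits"
```

Under these assumptions, going from two to three slave ports widens the downstream ID by one bit, so any IP generated for the old width (such as the MIG) no longer matches the bus.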

I will try this. If it works I'll close the issue and make a pull request to add said warning.

Edit: It did work. Just follow the instructions here and run the run.tcl of each modified Xilinx IP, e.g., in my case, after changing the ID width:

$ export XILINX_PART=xc7k325tffg900-2
$ export XILINX_BOARD=xilinx.com:kc705:part0:1.5
$ export BOARD=kc705
$ export CLK_PERIOD_NS=20
$ cd corev_apu/fpga/xilinx/xlnx_mig_7_ddr3/
$ vivado -mode batch -source tcl/run.tcl
$ cd ../xlnx_axi_clock_converter/
$ vivado -mode batch -source tcl/run.tcl
$ cd ../xlnx_axi_dwidth_converter
$ vivado -mode batch -source tcl/run.tcl
$ cd ../xlnx_axi_dwidth_converter_dm_master/
$ vivado -mode batch -source tcl/run.tcl
$ cd ../xlnx_axi_dwidth_converter_dm_slave/
$ vivado -mode batch -source tcl/run.tcl
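
The same sequence can be wrapped in a small loop. This is only a convenience sketch: the function name `regen_ips` and the `VIVADO` override variable are made up here, and the directory names are the ones listed above. It assumes the four environment variables have already been exported.

```shell
# regen_ips ROOT IP_DIR...
# Runs tcl/run.tcl in batch mode inside each IP directory under ROOT.
# VIVADO is a hypothetical override; set VIVADO=echo for a dry run.
regen_ips() {
  root=$1
  shift
  for ip in "$@"; do
    if [ ! -d "$root/$ip" ]; then
      echo "missing IP directory: $root/$ip" >&2
      return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$root/$ip" && ${VIVADO:-vivado} -mode batch -source tcl/run.tcl ) || return 1
  done
}

# Usage, from the top of the cva6 checkout:
# regen_ips corev_apu/fpga/xilinx \
#   xlnx_mig_7_ddr3 xlnx_axi_clock_converter xlnx_axi_dwidth_converter \
#   xlnx_axi_dwidth_converter_dm_master xlnx_axi_dwidth_converter_dm_slave
```

The loop stops at the first IP whose directory is missing or whose Vivado run fails, so a broken regeneration is not silently followed by the others.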

I made the pull request to add the warning.