Adds a native bootrom to the cluster instead of fetching the boot code from an external source. Also adds scratch registers to the peripherals, which can be used to hold the entry point of the binary.
The bootrom is currently very simple and just jumps to the start of the DRAM region:
_start:
lui t0, %hi(_l3_base) // Load the L3 base address
addi t0, t0, %lo(_l3_base)
jalr zero, t0, 0 // Jump to the L3 base address
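As a side note on the `%hi`/`%lo` pair above: `addi` sign-extends its 12-bit immediate, so the upper part has to be rounded up whenever bit 11 of the address is set. The following is a minimal Python sketch of that split, assuming RV32; the addresses in the usage note are purely illustrative and are not the actual value of `_l3_base`.

```python
def hi_lo(addr):
    """Split a 32-bit address the way %hi/%lo do for a lui/addi pair."""
    # %lo: low 12 bits, interpreted as a signed value (addi sign-extends).
    lo = ((addr & 0xFFF) ^ 0x800) - 0x800
    # %hi: upper 20 bits, compensated for a negative %lo.
    hi = ((addr - lo) >> 12) & 0xFFFFF
    return hi, lo


def lui_addi(hi, lo):
    """Value left in the register after `lui rd, hi` and `addi rd, rd, lo`."""
    return ((hi << 12) + lo) & 0xFFFFFFFF
```

For an illustrative address such as `0x80000FFC`, `hi_lo` returns `(0x80001, -4)`, and `lui_addi` reconstructs the original address from that pair.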
The desired bootrom would look more like this:
_snitch_park:
# Set trap vector
la t0, _snitch_resume
csrw mtvec, t0
# Enable software and cluster interrupts
csrsi mstatus, MSTATUS_MIE # CSR set (uimm)
lui t0, 0x80 # (1 << 19) cluster interrupts
addi t0, t0, 8 # (1 << 3) software interrupts
csrw mie, t0
wfi
_snitch_resume:
auipc t0, 0
li t1, -3736 # PC-relative offset to scratch register 1 (at offset 0x188)
add t0, t0, t1
lw t0, 0(t0)
jalr ra, 0(t0)
j _snitch_park
Here the cores are parked and, after receiving an interrupt, they start fetching from the address stored in one of the scratch registers (written by a host core, for example).
This was also my first approach, but getting it to run in the testbench was not trivial at that time. It requires an external interrupt and writing the entry point to one of the configuration registers, which is possible with an AXI driver connected to the narrow AXI interface. However, Verilator 4 does not support timing constructs, so AXI drivers cannot be used at all, which is why I reverted to a very simple bootrom.
We have now upgraded to Verilator 5, which would theoretically allow this. The question is whether it is needed, or whether the intended use case is that users write their own custom bootrom to replace the default one anyway. Afaik, this is also the approach we took for previous tapeouts (e.g. Occamy).
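The immediates in the snippet can be sanity-checked with a bit of arithmetic. The sketch below is in Python and rests on assumptions that are not stated in this PR: every instruction is 4 bytes (no compressed encodings, no linker relaxation), `la` expands to two instructions, `_snitch_park` sits at the very start of the bootrom, and the bootrom is mapped `0x1000` above the peripheral base. That last value is only what the constants `-3736` and `0x188` imply together, so treat it as a guess.

```python
INSN_BYTES = 4  # assumption: no compressed (RVC) instructions

# Instructions in _snitch_park before the _snitch_resume label:
# la (auipc + addi), csrw, csrsi, lui, addi, csrw, wfi
PARK_INSNS = 2 + 1 + 1 + 1 + 1 + 1 + 1


def mie_mask():
    """Value built by `lui t0, 0x80` followed by `addi t0, t0, 8`."""
    return (0x80 << 12) + 8


def resume_offset(bootrom_offset, scratch_offset):
    """Displacement from the auipc in _snitch_resume to the scratch register.

    Both arguments are offsets from the peripheral base (assumed layout).
    """
    auipc_offset = PARK_INSNS * INSN_BYTES  # auipc is the first insn of _snitch_resume
    return scratch_offset - (bootrom_offset + auipc_offset)
```

Under these assumptions, `mie_mask()` equals `(1 << 19) | (1 << 3)` and `resume_offset(0x1000, 0x188)` reproduces the `-3736` used in the `li`. If the bootrom or the scratch register moves, the constant has to be recomputed.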
TODO
rtl target and prerequisites