SpinalHDL / NaxRiscv


how to run a nax_core with an AXI4 interface #81

Open duanjiulon opened 5 months ago

duanjiulon commented 5 months ago

1. Hello, prior to this I successfully ran a nax_soc using LiteX according to your tutorial, but its nax_core interface is AXI4-Lite. I want to use AXI4, and although I passed axi to the --bus-standard parameter, it still doesn't work: the generated core interface is still AXI4-Lite. How can I generate a core with an AXI4 interface, to connect it to the SoC where Briey is located and replace the Vex with Nax?
2. Here is the full sbt output:

jlduan@dev-optiplex3060:~/work1/pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/ext/NaxRiscv$ sbt "runMain naxriscv.platform.LitexGen --with-jtag-tap --with-debug --scala-args=rvc=true --scala-file=/home/jlduan/work1/pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/configs/gen.scala --scala-args=rvc=true"
[info] welcome to sbt 1.6.0 (Debian Java 17.0.9)
[info] loading settings for project naxriscv-build from plugins.sbt ...
[info] loading project definition from /home/jlduan/work1/pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/ext/NaxRiscv/project
[info] loading settings for project root from build.sbt ...
[info] loading settings for project spinalhdl-build from plugin.sbt ...
[info] loading project definition from /home/jlduan/work1/pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/ext/NaxRiscv/ext/SpinalHDL/project
[info] loading settings for project all from build.sbt ...
[info] set current project to NaxRiscv (in build file:/home/jlduan/work1/pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/ext/NaxRiscv/)
[info] running (fork) naxriscv.platform.LitexGen --with-jtag-tap --with-debug --scala-args=rvc=true --scala-file=/home/jlduan/work1/pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/configs/gen.scala --scala-args=rvc=true
[info] [Runtime] SpinalHDL dev git head : 700e223f5dc8c2258117fae140ab8d6eb8446d97
[info] [Runtime] JVM max memory : 3956.0MiB
[info] [Runtime] Current date : 2024.02.23 14:49:17
[info] [Progress] at 0.000 : Elaborate components
[info] memoryRegions: Seq[naxriscv.platform.LitexMemoryRegion] = ArrayBuffer()
[info] import scala.collection.mutable.ArrayBuffer
[info] import naxriscv.utilities.Plugin
[info] import naxriscv.platform.LitexMemoryRegion
[info] import spinal.lib.bus.misc.SizeMapping
[info] plugins: scala.collection.mutable.ArrayBuffer[naxriscv.utilities.Plugin] = ArrayBuffer(DocPlugin, MmuPlugin, FetchPlugin, PcPlugin, FetchCachePlugin, AlignerPlugin, FrontendPlugin, DecompressorPlugin, DecoderPlugin, integer_RfTranslationPlugin, RfDependencyPlugin, integer_RfAllocationPlugin, DispatchPlugin, BranchContextPlugin, HistoryPlugin, DecoderPredictionPlugin, BtbPlugin, GSharePlugin, Lsu2Plugin, DataCachePlugin, RobPlugin, CommitPlugin, integer_RegFilePlugin, CommitDebugFilterPlugin, CsrRamPlugin, PrivilegedPlugin, PerformanceCounterPlugin, ALU0_ExecutionUnitBase, ALU0_IntFormatPlugin, ALU0_SrcPlugin, ALU0_IntAluPlugin, AL...
The main thread is stuck at :
[info] java.base@17.0.9/jdk.internal.misc.Unsafe.park(Native Method)
[info] java.base@17.0.9/java.util.concurrent.locks.LockSupport.park(LockSupport.java:341)
[info] java.base@17.0.9/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:506)
[info] java.base@17.0.9/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3465)
[info] java.base@17.0.9/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3436)
[info] java.base@17.0.9/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1623)
[info] java.base@17.0.9/java.util.concurrent.CyclicBarrier.dowait(CyclicBarrier.java:236)
[info] java.base@17.0.9/java.util.concurrent.CyclicBarrier.await(CyclicBarrier.java:364)
[info] app//spinal.sim.JvmThread.park(SimManager.scala:30)
[info] app//naxriscv.platform.NaxRiscvLitex$$anon$1.<init>(Litex.scala:48)
[info] app//naxriscv.platform.NaxRiscvLitex.<init>(Litex.scala:47)
[info] app//naxriscv.platform.LitexGen$$anonfun$14.apply(Litex.scala:184)
[info] app//naxriscv.platform.LitexGen$$anonfun$14.apply(Litex.scala:157)
[info] app//spinal.sim.JvmThread.run(SimManager.scala:51)
[info] **
[info] [Warning] Elaboration failed (0 error).
[info] Spinal will restart with scala trace to help you to find the problem.
[info] **
[info] [Progress] at 4.306 : Elaborate components
[info] memoryRegions: Seq[naxriscv.platform.LitexMemoryRegion] = ArrayBuffer()
[info] import scala.collection.mutable.ArrayBuffer
[info] import naxriscv.utilities.Plugin
[info] import naxriscv.platform.LitexMemoryRegion
[info] import spinal.lib.bus.misc.SizeMapping
[info] plugins: scala.collection.mutable.ArrayBuffer[naxriscv.utilities.Plugin] = ArrayBuffer(DocPlugin, MmuPlugin, FetchPlugin, PcPlugin, FetchCachePlugin, AlignerPlugin, FrontendPlugin, DecompressorPlugin, DecoderPlugin, integer_RfTranslationPlugin, RfDependencyPlugin, integer_RfAllocationPlugin, DispatchPlugin, BranchContextPlugin, HistoryPlugin, DecoderPredictionPlugin, BtbPlugin, GSharePlugin, Lsu2Plugin, DataCachePlugin, RobPlugin, CommitPlugin, integer_RegFilePlugin, CommitDebugFilterPlugin, CsrRamPlugin, PrivilegedPlugin, PerformanceCounterPlugin, ALU0_ExecutionUnitBase, ALU0_IntFormatPlugin, ALU0_SrcPlugin, ALU0_IntAluPlugin, AL...
The main thread is stuck at :
[info] java.base@17.0.9/jdk.internal.misc.Unsafe.park(Native Method)
[info] java.base@17.0.9/java.util.concurrent.locks.LockSupport.park(LockSupport.java:341)
[info] java.base@17.0.9/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:506)
[info] java.base@17.0.9/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3465)
[info] java.base@17.0.9/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3436)
[info] java.base@17.0.9/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1623)
[info] java.base@17.0.9/java.util.concurrent.CyclicBarrier.dowait(CyclicBarrier.java:236)
[info] java.base@17.0.9/java.util.concurrent.CyclicBarrier.await(CyclicBarrier.java:364)
[info] app//spinal.sim.JvmThread.park(SimManager.scala:30)
[info] app//naxriscv.platform.NaxRiscvLitex$$anon$1.<init>(Litex.scala:48)
[info] app//naxriscv.platform.NaxRiscvLitex.<init>(Litex.scala:47)
[info] app//naxriscv.platform.LitexGen$$anonfun$14.apply(Litex.scala:184)
[info] app//naxriscv.platform.LitexGen$$anonfun$14.apply(Litex.scala:157)
[info] app//spinal.sim.JvmThread.run(SimManager.scala:51)
[error] Exception in thread "main" java.lang.AssertionError: assertion failed
[error] at scala.Predef$.assert(Predef.scala:156)
[error] at spinal.core.package$.assert(core.scala:497)
[error] at naxriscv.lsu.DataMemBus$$anon$13.<init>(DataCache.scala:408)
[error] at naxriscv.lsu.DataMemBus.toAxi4(DataCache.scala:407)
[error] at naxriscv.lsu.DataCacheAxi4$$anonfun$1$$anon$1.<init>(DataCacheAxi4.scala:16)
[error] at naxriscv.lsu.DataCacheAxi4$$anonfun$1.apply(DataCacheAxi4.scala:13)
[error] at naxriscv.lsu.DataCacheAxi4$$anonfun$1.apply(DataCacheAxi4.scala:13)
[error] at naxriscv.utilities.Plugin$$anon$1$$anonfun$late$1$$anonfun$5.apply(Framework.scala:58)
[error] at spinal.core.Area$class.rework(Area.scala:59)
[error] at naxriscv.utilities.Framework.rework(Framework.scala:81)
[error] at naxriscv.utilities.Plugin$$anon$1$$anonfun$late$1.apply(Framework.scala:56)
[error] at spinal.core.fiber.package$$anonfun$1.apply$mcV$sp(package.scala:16)
[error] at spinal.core.fiber.AsyncThread$$anonfun$1.apply$mcV$sp(AsyncThread.scala:59)
[error] at spinal.core.fiber.EngineContext$$anonfun$newJvmThread$1.apply$mcV$sp(AsyncCtrl.scala:39)
[error] at spinal.sim.JvmThread.run(SimManager.scala:51)
[error] Nonzero exit code returned from runner: 1
[error] (Compile / runMain) Nonzero exit code returned from runner: 1
[error] Total time: 9 s, completed 2024-02-23 14:49:22

Based on your previous answers to other users' questions, I have added a specific gen.scala file, but this error still occurs. How can I resolve it?

duanjiulon commented 5 months ago

For example, the command runMain naxriscv.platform.LitexGen --netlist-name MyNaxRiscvLitex --netlist-directory=/home/jlduan/work1/pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog --reset-vector=0 --xlen=32 --memory-region=0,65536,rw,p --with-jtag-tap --with-debug --scala-file=/home/jlduan/work1/pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/configs/gen.scala also shows the problem. Please help me, thanks!

Dolu1990 commented 5 months ago

Hi,

but its nax_core interface is AXI4-Lite

You mean its pBus is AXI4-Lite, right? The mBus is OK, right?

How can I generate a core with an AXI4 interface, to connect it to the SoC where Briey is located and replace the Vex with Nax?

That has never been done before. So, a single core without memory coherency, right? Just AXI4?

I would say the best approach would be to not use the LiteX-related port, but instead to go with the raw NaxRiscv and implement the bridge from the native NaxRiscv interfaces.
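
For what it's worth, here is a skeleton of that approach in SpinalHDL. Everything in it is hypothetical (the NaxRiscvNetlist name, its port list, the AXI4 parameters): the real native ports come from the Verilog emitted by the raw NaxRiscv generation, and the bridge body is left as a stub.

import spinal.core._
import spinal.lib._
import spinal.lib.bus.amba4.axi._

// Hypothetical BlackBox for the raw NaxRiscv netlist; the native fetch and
// load/store ports are elided because they depend on the generated config.
case class NaxRiscvNetlist() extends BlackBox {
  val io = new Bundle {
    val clk   = in Bool()
    val reset = in Bool()
    // ... native NaxRiscv memory ports go here ...
  }
  mapCurrentClockDomain(io.clk, io.reset)
}

// Top level exposing an AXI4 master; the bridge body is a placeholder.
case class NaxAxi4Top() extends Component {
  val io = new Bundle {
    val mbus = master(Axi4(Axi4Config(addressWidth = 32, dataWidth = 64, idWidth = 4)))
  }
  val nax = NaxRiscvNetlist()

  // Bridge logic translating the native transactions onto io.mbus would go
  // here. The channels are tied off so this skeleton elaborates on its own:
  io.mbus.aw.valid := False
  io.mbus.aw.payload.assignDontCare()
  io.mbus.w.valid  := False
  io.mbus.w.payload.assignDontCare()
  io.mbus.ar.valid := False
  io.mbus.ar.payload.assignDontCare()
  io.mbus.b.ready  := False
  io.mbus.r.ready  := False
}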

duanjiulon commented 5 months ago

I would say the best approach would be to not use the LiteX-related port, but instead to go with the raw NaxRiscv and implement the bridge from the native NaxRiscv interfaces.

Hello, following your suggestion, I have recently successfully replaced all Axi4Lite-related keywords in the NaxSoc.scala file with Axi4. Using the command from your example, I generated a Nax_core whose pBus is an AXI4 interface, which I plan to use for AXI4 DDR3 reads and writes.

  1. Is there anything I need to pay attention to? Is this core capable of performing this function? Is the core's pBus AXI4 interface now connected to the dBus?
  2. LiteX automatically generates this command:

cd /home/jlduan/work1/pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/ext/NaxRiscv && sbt "runMain naxriscv.platform.litex.NaxGen --netlist-name=NaxRiscv_cpu --netlist-directory=/home/jlduan/work1/pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog --reset-vector=0 --xlen=32 --cpu-count=1 --litedram-width=32 --memory-region=2147483648,2147483648,io,p --memory-region=0,32768,rxc,p --memory-region=268435456,32768,rwxc,p --memory-region=4026531840,65536,rw,p --with-jtag-tap --with-debug --scala-file=/home/jlduan/work1/pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/configs/gen.scala --scala-args=rvc=true"

I successfully generated the Verilog file, but I don't quite understand the roles played by the four memory regions; in other words, I don't know what their purpose is. Thanks for your reply! Best wishes!
Dolu1990 commented 4 months ago

So, the pBus is only used by the core for IO accesses (slow, in-order, word-based accesses), while the mBus is used for all the main memory accesses (fast, always 64 bytes, out of order). So you only need to connect the DDR to the mBus.

The memory regions mainly specify which kinds of access the core is allowed to perform on each address range.

Here are some examples:

--memory-region=2147483648,131072,io,p => allowed to access addresses [2147483648, 2147483648+131072] on the pBus, for IO accesses only (meaning slow, in-order, load+store; can't fetch instructions from there)
--memory-region=0,131072,rxc,p => addresses [0, 0+131072] are accessible via load (r) and fetch (x), no store, on the pBus (p), and are cacheable (c)
--memory-region=2147483648,8192,rw,p => can do load/store, not cacheable, on the pBus
--memory-region=1073741824,536870912,rwxc,m => can do load/store/fetch, cacheable, on the mBus

It seems the rw flags aren't looked at by the NaxRiscv generator; what matters is "io", "x", "c", and the bus selector "p"/"m".

See

case class LitexMemoryRegion(mapping : SizeMapping, mode : String, bus : String){
  def isIo = mode.contains("i") || mode.contains("o") // "io" region => uncached, in-order accesses
  def isExecutable = mode.contains("x")               // instruction fetch allowed
  def isCachable = mode.contains("c")                 // may be kept in the L1 caches
  def onPeripheral = bus match {                      // which bus serves the region
    case "m" => false
    case "p" => true
  }
  def onMemory = !onPeripheral
}
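
For illustration, a hypothetical usage sketch of the case class above, showing how the last example region (--memory-region=1073741824,536870912,rwxc,m) would be represented and queried; SizeMapping comes from spinal.lib.bus.misc:

import spinal.lib.bus.misc.SizeMapping

// 0x40000000 = 1073741824, 0x20000000 = 536870912
val ddr = LitexMemoryRegion(SizeMapping(0x40000000L, 0x20000000L), "rwxc", "m")
ddr.isIo         // false: not an "io" region
ddr.isExecutable // true: "x" => instructions may be fetched from it
ddr.isCachable   // true: "c" => may live in the L1 caches
ddr.onMemory     // true: "m" => served by the mBus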
duanjiulon commented 4 months ago

--memory-region=2147483648,131072,io,p => allowed to access addresses [2147483648, 2147483648+131072] on the pBus, for IO accesses only (meaning slow, in-order, load+store; can't fetch instructions from there)
--memory-region=0,131072,rxc,p => addresses [0, 0+131072] are accessible via load (r) and fetch (x), no store, on the pBus (p), and are cacheable (c)
--memory-region=2147483648,8192,rw,p => can do load/store, not cacheable, on the pBus
--memory-region=1073741824,536870912,rwxc,m => can do load/store/fetch, cacheable, on the mBus

Hi, I have generated a new core according to the example you gave, but when I embedded this core in my own SoC demo (replacing vex_core with nax_core) while keeping the other bus interfaces unchanged, a problem occurred. When debugging with JTAG and GDB, I found:

Loading section ._vector, size 0x118 lma 0x40000000
Loading section .memory, size 0x1702 lma 0x40000118
Loading section .text.startup, size 0x50c lma 0x4000181a
Loading section .rodata, size 0x4b4 lma 0x40001d28
Loading section .data, size 0xc lma 0x400021dc
Start address 0x40000000, load size 8678
Transfer rate: 39 KB/sec, 1735 bytes/write

I then write data starting from 0x40000000 through the terminal, for example:

mww 0x40000000 0x12345678 24

I was able to successfully write the memory data, but when I analyzed the address with an online logic analyzer, I found that the captured address was actually 0x0000000.

May I ask whether this is due to an address offset within the core? Why is DDR reading and writing successful, but the data captured is inconsistent? Thank you for your reply!

Dolu1990 commented 4 months ago

Hi,

I was able to successfully write the memory data, but when I analyzed the address with an online logic analyzer, I found that the captured address was actually 0x0000000.

I think that is by design: the mBus only supports a single memory region, and removes that offset from the address.

May I ask whether this is due to an address offset within the core?

Inside the Nax SoC

Why is DDR reading and writing successful, but the data captured is inconsistent?

Which data ?

duanjiulon commented 4 months ago

Hi, in other words, I use JTAG debugging and GDB to perform read and write operations inside the core. For example, the command mww 0x40000000 0x12345678 32 writes that word to 32 consecutive memory locations starting from 0x40000000. When I read back the data in this range, I read the value successfully. However, when I use an online logic analyzer to capture the address on the mBus, the address range I capture starts from 0x0000000. So can I conclude that there is a memory offset here? Thank you for your guidance and reply!


Dolu1990 commented 4 months ago

Hi,

the address range I capture starts from 0x0000000. So can I conclude that there is a memory offset here?

Yes, that is by design.
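
To make that concrete, a minimal sketch of the remap; the regionBase value is an assumption matching the DDR region base used in this thread:

// The single mBus region has its base subtracted before the address reaches
// the bus, so CPU address 0x40000000 shows up as 0x0 on the mBus.
val regionBase = 0x40000000L
def mbusAddress(cpuAddress: Long): Long = cpuAddress - regionBase
assert(mbusAddress(0x40000000L) == 0x0L) // matches the logic analyzer capture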

duanjiulon commented 4 months ago

Yes, that is by design.

Hi, I have successfully read and written the DDR with your help, but now I would like to run a Dhrystone from DDR and print the results through the serial port. If I want to connect an APB bridge to the AXI interface and then connect an APB UART to the bridge's slave port, should I use the pBus AXI or the mBus AXI interface?

  1. What would you suggest for the configuration parameters and memory allocation of the generated core? I hope you can provide guidance. Looking forward to your reply! Thank you very much!
Dolu1990 commented 4 months ago

Hi,

pbus => for peripherals (apb)
mbus => main memory => axi only
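
Since you mention Briey: a minimal sketch of that pBus-side hookup, reusing Briey's building blocks from spinal.lib. The Axi4Config values and the 0x10000 mapping are assumptions, not values taken from the generated core:

import spinal.core._
import spinal.lib._
import spinal.lib.bus.amba4.axi._
import spinal.lib.bus.amba3.apb._
import spinal.lib.com.uart._

class PbusPeripherals extends Component {
  val io = new Bundle {
    val pbus = slave(Axi4(Axi4Config(addressWidth = 20, dataWidth = 32, idWidth = 4)))
    val uart = master(Uart())
  }

  // AXI4 -> APB3 bridge, the same building block Briey uses
  val apbBridge = Axi4SharedToApb3Bridge(
    addressWidth = 20,
    dataWidth    = 32,
    idWidth      = 4
  )
  io.pbus.toShared() >> apbBridge.io.axi

  // APB UART hanging off the bridge's APB master port
  val uartCtrl = Apb3UartCtrl(UartCtrlMemoryMappedConfig(
    uartCtrlConfig = UartCtrlGenerics(dataWidthMax = 8),
    txFifoDepth    = 16,
    rxFifoDepth    = 16
  ))
  uartCtrl.io.uart <> io.uart

  Apb3Decoder(
    master = apbBridge.io.apb,
    slaves = List(uartCtrl.io.apb -> (0x10000, 4 kB))
  )
}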

configuration parameters

The important things are --scala-args='rvc=true,rvf=true,rvd=true,alu-count=2,decode-count=2'

memory allocation

What is that ?

duanjiulon commented 4 months ago

the important things are --scala-args='rvc=true,rvf=true,rvd=true,alu-count=2,decode-count=2'

Ahhhh, thanks for your reply. Why are two ALU units needed? And what I mean is: do the four memory-region parameters need to be changed?

Dolu1990 commented 4 months ago

Why are two ALU units needed?

To get more performance: the default is 1 ALU execution unit; 2 => more performance ^^

What I mean is: do the four memory-region parameters need to be changed?

One io,p region for peripherals via the pBus, and one rwxc,m region for high-performance memory access and execution via the mBus; that's the minimum required, I think.
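
For example (the addresses and sizes here are placeholders, just to illustrate the minimum two regions together with the performance-related scala-args):

--memory-region=4026531840,65536,io,p => peripherals at 0xF0000000, via the pBus
--memory-region=1073741824,536870912,rwxc,m => 512 MiB of DDR at 0x40000000, via the mBus
--scala-args='rvc=true,alu-count=2,decode-count=2'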

duanjiulon commented 3 months ago

One io,p region for peripherals via the pBus, and one rwxc,m region for high-performance memory access and execution via the mBus; that's the minimum required, I think.

Hi, dear Dolu, I'm planning to use LiteX + Nax to run Dhrystone soon. Can I use the Dhrystone you provide in NaxRiscv directly, or what should I pay attention to in the compiled ELF file?