enjoy-digital / litex

Build your hardware, easily!

Add support for the TileLink bus that is used in OpenTitan (and potentially the "comportability specification") #1544

Open mithro opened 1 year ago

mithro commented 1 year ago

I'm logging this bug because it is likely that Antmicro will be working on adding support for the integration of OpenTitan peripherals into LiteX and will probably want to also extend LiteX to support the TileLink protocol (in a similar manner to the recent AXI and other bus support).

It seems OpenTitan peripherals specifically use the "TileLink Uncached Lightweight (TL-UL)" variant. (The OpenTitan peripherals are supposed to follow the "comportability specification" found at https://docs.opentitan.org/doc/rm/comportability_specification/)

OpenTitan provides documentation for this at https://docs.opentitan.org/hw/ip/tlul/doc/ and says:

TL-UL is a lightweight bus that combines the point-to-point split-transaction features of the powerful TileLink (or AMBA AXI) 5-channel bus without the high pin-count overhead. It is intended to be about on par with APB in pin count but with the transaction performance of AXI-4, modulo the following assumptions.

  • Only one request (read or write) per cycle
  • Only one response (read or write) per cycle
  • No burst transactions
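
To make the channel structure concrete, the TL-UL signals could be expressed in Migen roughly as follows. This is only a hypothetical sketch: the field names follow the TL-UL spec for a 32-bit address/data configuration, and no such class currently exists in LiteX.

```python
# Hypothetical sketch only: the TL-UL A (request) and D (response) channels as
# a Migen Record layout, with widths for a 32-bit address / 32-bit data
# configuration. This is not an existing LiteX class.
from migen.genlib.record import Record

def tl_ul_layout(address_width=32, data_width=32, source_width=8, sink_width=1):
    return [
        # A channel: host -> device requests (one request per cycle, no bursts).
        ("a_valid",   1),
        ("a_ready",   1),
        ("a_opcode",  3),              # Get / PutFullData / PutPartialData
        ("a_param",   3),
        ("a_size",    2),
        ("a_source",  source_width),
        ("a_address", address_width),
        ("a_mask",    data_width // 8),
        ("a_data",    data_width),
        # D channel: device -> host responses (one response per cycle).
        ("d_valid",   1),
        ("d_ready",   1),
        ("d_opcode",  3),              # AccessAck / AccessAckData
        ("d_param",   3),
        ("d_size",    2),
        ("d_source",  source_width),
        ("d_sink",    sink_width),
        ("d_data",    data_width),
        ("d_error",   1),
    ]

bus = Record(tl_ul_layout())
```
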
mithro commented 1 year ago

FYI - @kgugala @mgielda @pgielda

enjoy-digital commented 1 year ago

Great, you could probably first integrate the minimal required support for the design/functionality you are targeting and allow LiteX to bridge to it first manually (Instance in user design), then automatically (by integrating it in SoC.BusHandler.add_adapter). This would already allow creating SoCs with the main bus in Wishbone/AXI(-Lite) and bridging to TileLink.

If useful in the future, we could also allow the main bus of the SoC to be a TileLink bus. When/if doing so, the fact that LiteX is able to automatically bridge between the different buses is useful for developing/integrating one TileLink module at a time and validating progressively on real SoCs. This has been very useful when adding/testing AXI-Lite/AXI support (and still is, to work around bugs waiting for a fix :)).
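
As a rough illustration of the manual bridging step (Instance in user design) mentioned above, the sketch below assumes an OpenTitan TL-UL peripheral and a Wishbone-to-TL-UL bridge are available as external Verilog sources; all module and port names are illustrative, not real LiteX or OpenTitan identifiers.

```python
# Hypothetical sketch of the "manual bridging" approach: an OpenTitan TL-UL
# peripheral and a Wishbone-to-TL-UL bridge (both assumed to exist as external
# Verilog sources) are instantiated in the user design and exposed to the
# LiteX SoC as a plain Wishbone slave.
from migen import Module, Instance, Signal, ClockSignal, ResetSignal
from litex.soc.interconnect import wishbone

class TLULPeripheral(Module):
    def __init__(self):
        self.bus = wishbone.Interface(data_width=32)  # seen by the LiteX SoC

        # TL-UL wires between the bridge and the peripheral (remaining A/D
        # channel signals elided for brevity).
        a_valid, a_ready = Signal(), Signal()
        d_valid, d_ready = Signal(), Signal()

        # Hypothetical Wishbone -> TL-UL bridge written in Verilog.
        self.specials += Instance("wb_to_tlul_bridge",
            i_clk_i      = ClockSignal("sys"),
            i_rst_ni     = ~ResetSignal("sys"),
            i_wb_adr     = self.bus.adr,
            i_wb_dat_w   = self.bus.dat_w,
            o_wb_dat_r   = self.bus.dat_r,
            i_wb_we      = self.bus.we,
            i_wb_cyc     = self.bus.cyc,
            i_wb_stb     = self.bus.stb,
            o_wb_ack     = self.bus.ack,
            o_tl_a_valid = a_valid,
            i_tl_a_ready = a_ready,
            i_tl_d_valid = d_valid,
            o_tl_d_ready = d_ready,
            # ... remaining TL-UL ports ...
        )

        # The OpenTitan peripheral itself (port names also illustrative).
        self.specials += Instance("opentitan_peripheral",
            i_clk_i      = ClockSignal("sys"),
            i_rst_ni     = ~ResetSignal("sys"),
            i_tl_a_valid = a_valid,
            o_tl_a_ready = a_ready,
            o_tl_d_valid = d_valid,
            i_tl_d_ready = d_ready,
            # ... remaining TL-UL ports ...
        )

# In the target, something like:
#   self.submodules.tlul_periph = TLULPeripheral()
#   self.bus.add_slave("tlul_periph", self.tlul_periph.bus,
#                      region=SoCRegion(origin=0x9000_0000, size=0x1000))
```
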

OrkunAliOzkan commented 1 year ago

Hello,

I hope you are all doing well. I am a regular user of LiteX, and I must say I love the direction it is heading in!

I wanted to inquire about the current status of TileLink bus support and whether there have been any developments since Jan 3rd.

Could you kindly provide an update on TileLink bus support? Is it currently in development, on hold, or planned for a future release?

Thank you very much for your time and hard work on LiteX. All of your efforts are very appreciated!

Best Regards, Orkun

enjoy-digital commented 1 year ago

Hi @OrkunAliOzkan,

thanks for the feedback. I think there has been some initial development at Antmicro but this would need to be confirmed (@kgugala would probably know more).

Otherwise, not directly related to LiteX (but still a bit), @Dolu1990 is working on creating TileLink infrastructure for NaxRiscv and we'll probably want to integrate the recent NaxRiscv developments in LiteX in the near future. Not sure how NaxRiscv will be exposing the interfaces, but if it's in TileLink standard, we'll probably put more effort in LiteX to support it :)

OrkunAliOzkan commented 1 year ago

Hi @enjoy-digital

Thank you for the update on the TileLink bus feature in my previous inquiry. I appreciate your response. I have a follow-up question specifically regarding TileLink infrastructure support on the Rocket chip.

I understand different parties are working on developing different cores, but I was wondering if you could kindly clarify whether there are any plans or considerations for adding TileLink support to Rocket as well.

Best Regards, Orkun

gsomlo commented 1 year ago

rocket uses tilelink internally, to connect and route between the cpu cores, cache, dma logic, and the externally accessible axi ports.

i'm not entirely sure there's a good way to connect something like litex directly to the internal tilelink "fabric" of rocket, rather than using the available axi ports intended for that purpose...


OrkunAliOzkan commented 1 year ago

Hi @gsomlo,

Thank you for the clarification. I have a follow-up question on the topic of cache coherency in a LiteX SoC with a Rocket CPU.

I am trying to determine how cache coherency is managed in a LiteX SoC with a Rocket CPU and, by extension, how the L1 and LiteDRAM communicate.

Is there any relevant documentation or specification that goes into detail on the coherency protocol used in Rocket CPUs?

Also, why is there cache coherency between the Rocket L1 cache and the LiteX L2/LiteDRAM?

Thank you very much for your time and hard work porting the Rocket-core to LiteX.

Kind Regards, Orkun

gsomlo commented 1 year ago

rocket handles cache coherence internally, between the dedicated dma-slave and memory-master axi ports, and the cpu cores' internal L1.

this is the reason why litedram is connected directly to the axi-master mem port of rocket via a point to point axi link, and there is no litex-provided L2 cache.

i'm traveling and don't have easy access to documentation i could point to, but this is the short version, hope it helps.
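
For illustration, that point-to-point connection boils down to something like the following simplified sketch; the real wiring (which also handles width conversion and multiple memory buses) lives in LiteXSoC.add_sdram and litex/soc/cores/cpu/rocket/core.py.

```python
# Simplified sketch: the CPU's AXI memory bus is bridged straight to a LiteDRAM
# native port, with no LiteX-side L2 cache in between.
from litedram.frontend.axi import LiteDRAMAXI2Native

def connect_mem_axi_to_litedram(soc, sdram):
    mem_axi = soc.cpu.memory_buses[0]    # Rocket exposes its mem port as AXI
    port    = sdram.crossbar.get_port()  # native port on the LiteDRAM crossbar
    soc.submodules += LiteDRAMAXI2Native(mem_axi, port)
```
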


OrkunAliOzkan commented 1 year ago

Hi @gsomlo,

Thank you very much for taking the time to respond to my questions. I truly appreciate your help, and I hope you're having a pleasant trip. I have a couple of related questions about L2 caches on the LiteX SoC.

I believe I may have a misconception about the existence of an L2 cache on LiteX SoCs. Within the Wishbone block there is a cache definition. If an SoC is configured such that LiteDRAM and the L1 cache have a direct AXI connection, does it not have an L2 cache?

I also recall reading source code specifying that if the port widths of the CPU and LiteDRAM do not match, a Wishbone connection is made. Aside from the port-width conversion hardware, what additional hardware blocks are added? Would an L2 cache be one of them?

I understand that you're currently travelling, so please don't feel any pressure to respond immediately. Your insights have been invaluable, and I'm grateful for any further assistance you can provide, whenever it's convenient for you.

Kind Regards, Orkun

OrkunAliOzkan commented 1 year ago

hope it helps

This did, thank you very much. Setting aside my misconception about an L2 cache existing on the Rocket-cored SoC, and my lack of understanding of the coherency protocol used in Rocket, I now have a much better understanding of how the SoC works. Thank you!

gsomlo commented 1 year ago

litex allows for an optional cache when (and where) litedram is attached to a bus (called "L2" by convention afaict); that cache isn't coherent with anything internal to the cpu core(s), so using it with rocket didn't make sense.
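
For reference, here is a sketch (not verbatim from any target) of how that optional cache is requested from a target script, assuming the l2_cache_size argument of add_sdram in current LiteX; a size of 0 omits it.

```python
# Sketch only; check your LiteX version's add_sdram signature.
from litedram.modules import MT41K256M16  # example DRAM module

# Inside a target's SoC __init__, after the DRAM PHY has been created:
self.add_sdram("sdram",
    phy           = self.ddrphy,
    module        = MT41K256M16(sys_clk_freq, "1:4"),
    l2_cache_size = 8192,  # bytes; set to 0 to drop the (non-coherent) "L2"
)
```
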


OrkunAliOzkan commented 1 year ago

Hi @gsomlo,

Thank you very much. This chat has helped ramp up my understanding quite well. I had one last question regarding how the SoC and the CPU interface.

I want to understand how the CPU and the SoC interact with each other, since the CPU is strictly Verilog code and the SoC is written in Python/Migen. I imagine there must be some type of wrapper file.

Do you know where in the source code the wrapper file for the rocket-core is?

Kind Regards, Orkun

OrkunAliOzkan commented 1 year ago

https://github.com/enjoy-digital/litex/wiki/Reuse-a-(System)Verilog,-VHDL,-(n)Migen,-Spinal-HDL,-Chisel-core

gsomlo commented 1 year ago

Do you know where in the source code the wrapper file for the rocket-core is?

https://github.com/enjoy-digital/litex/blob/master/litex/soc/cores/cpu/rocket/core.py
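
For orientation, such a wrapper follows the general LiteX CPU-wrapper pattern, which in very condensed and purely illustrative form looks something like this (names are not verbatim from core.py):

```python
# Illustrative sketch: a Python class declares the CPU's buses and parameters,
# maps them onto the ports of the pre-generated Verilog top level via an
# Instance, and LiteX then hooks the declared buses into the SoC.
from migen import Instance, Signal, ClockSignal, ResetSignal
from litex.soc.interconnect import axi
from litex.soc.cores.cpu import CPU

class ExampleVerilogCPU(CPU):
    name       = "example_cpu"
    data_width = 64

    def __init__(self, platform, variant="standard"):
        self.platform     = platform
        self.reset        = Signal()
        self.interrupt    = Signal(8)
        self.mmio_axi     = axi.AXIInterface(data_width=64, address_width=32)
        self.periph_buses = [self.mmio_axi]  # connected to the SoC bus by LiteX
        self.memory_buses = []               # e.g. a dedicated AXI port to LiteDRAM

        # Connections to the Verilog core's ports (most elided here).
        self.cpu_params = dict(
            i_clock = ClockSignal("sys"),
            i_reset = ResetSignal("sys") | self.reset,
            # ... AXI, interrupt and debug ports ...
        )

    def do_finalize(self):
        self.specials += Instance("example_cpu_top", **self.cpu_params)
        # platform.add_source(...) would pull in the generated Verilog here.
```
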


OrkunAliOzkan commented 1 year ago

Hi @gsomlo,

I had a query regarding the lack of an L2 cache in the Rocket core module.

Rocket offers L2 support through the SiFive InclusiveCache to produce a shared L2 cache (see the documentation here). To my understanding, the Rocket core used in LiteX does not have an L2 cache. Was this a design decision, potentially due to the port widths preventing a direct AXI connection between the Rocket memory interface and the LiteX DRAM interface, or just a feature not yet added to the Rocket module used in LiteX?

Thank you for your patience with my questions. Kind Regards,

Orkun

gsomlo commented 1 year ago

On Mon, Jul 24, 2023 at 09:07:41AM -0700, Orkun Ozkan wrote:

I had a query regarding the lack of an L2 cache in the Rocket core module.

Rocket offers L2 support through the SiFive InclusiveCache to produce a shared L2 cache

The documentation you link to is related to Chipyard (which I'm not familiar with, but if I had to guess it sits at the same level of abstraction as LiteX itself -- i.e., it's a "harness" that ties together multiple IP blocks from various sources, maybe more sifive-specific sources in the case of chipyard).

If I had to guess, they probably hook the L2 directly into the TileLink fabric that's internal to the rocket chip, otherwise I'm not sure how they can accomplish coherence with the per-core L1 caches and with the DMA interface. Not sure how that could work with LiteX, unless we 1. implement tilelink support, and 2. break the rocket chip's "encapsulation" to connect to its insides in a way similar to what I suspect Chipyard is doing (if I'm blatantly wrong about any of these guesses, I'd appreciate someone correcting me, btw!)

OrkunAliOzkan commented 1 year ago

@gsomlo

Could you not simply add the inclusive cache generator to the Config.scala file in the update script for pythondata-cpu-rocket? Would it not be a drop-in replacement for the broadcast-hub BankedL2Key that you instead generate through WithCoherentBusTopology? I understand that after all of the Rocket memory there is a LiteX memory port to interface with LiteDRAM; would that connect to the inclusive cache?

gsomlo commented 1 year ago

On Thu, Aug 03, 2023 at 04:04:58AM -0700, Orkun Ozkan wrote:

Could you not simply just [...]

:)

So I did a bit of digging, and Chipyard includes https://github.com/chipsalliance/rocket-chip-inclusive-cache as a submodule, and uses some Scala/Chisel/SBT magic to insert the L2 cache in between the TileLink fabric and the externally visible Mem AXI port of the RocketChip "assembly".

We'd have to replicate just that bit of build environment magic to get the Rocket generator to elaborate verilog that also includes the rocket-chip-inclusive-cache bits at just the right place, with just the right settings.

This would be an exercise in learning enough Chisel/Scala/SBT/mill/whatever-else-is-cool-this-week to figure out how Chipyard is pasting the inclusive-cache into the insides of RocketChip, in front of the externally exposed Mem AXI port (most likely starting with Chipyard's AbstractConfig class, where the magic seems to happen), and then replicating that for pythondata-cpu-rocket's build script, which currently simply builds the RocketChip sources as-is, using make, as a black box.

Given that we're running things at 50-100 MHz, I'm not sure how much of a "performance boost" we'd get from adding an L2 cache, and whether that would justify all of this effort (not to mention adding L2 will make Rocket harder to fit on some of the lower-end FPGAs, like e.g. ECP5).

OrkunAliOzkan commented 1 year ago

Hi @gsomlo,

Through adding an L2 cache to Rocket-core and building the SoC with ~/litex_environment/litex-boards/litex_boards/targets/digilent_nexys_video.py --build --cpu-type rocket --cpu-variant full --cpu-num-cores 1 --cpu-mem-width 2 --sys-clk-freq 50e6 --with-ethernet --with-sdcard --with-sata --sata-gen 1 --doc --csr-json ./csr.json, the SoC fails at the beginning of the write memtest after switching SDRAM to hardware control. Would you have any insight into next steps I could take?