enjoy-digital / litedram

Small footprint and configurable DRAM core

Core Question #219

Open vasimr opened 3 years ago

vasimr commented 3 years ago

Hi,

Can the DRAM controller core be generated without a CPU inside? I would like to have an external CPU that performs the initialization and calibration.

Thanks in advance.

enjoy-digital commented 3 years ago

Hi @vasimr,

Yes, it's possible. For a similar example, see https://github.com/antonblanchard/microwatt/tree/master/litedram/gen-src. Microwatt integrates LiteDRAM in the same way: the core is generated with litedram_gen without a CPU ("cpu": "None"), and Microwatt then performs the initialization through the exposed Wishbone interface.
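As a rough illustration of the CPU-less setup described above, the sketch below builds a minimal configuration in Python and serializes it as JSON, which a YAML loader should also accept. Only "cpu": "None" and the use of a Wishbone port come from this thread. Every other key and value (memtype, sdram_module, sdram_phy, sys_clk_freq, the port name wishbone_0) is an assumption modeled on the style of LiteDRAM's example configuration files, and the set is probably incomplete, so check it against the examples shipped with your LiteDRAM version.

```python
import json

# Hypothetical standalone-core configuration.
# Only "cpu": "None" is taken from the thread; all other keys and values are
# assumptions and must be verified against LiteDRAM's own example files.
config = {
    "cpu": "None",                  # no embedded CPU: init/calibration done externally
    "memtype": "DDR3",              # assumed DRAM type
    "sdram_module": "MT41K128M16",  # assumed module name
    "sdram_phy": "A7DDRPHY",        # assumed FPGA PHY
    "sys_clk_freq": 100e6,          # assumed system clock frequency in Hz
    "user_ports": {
        # Hypothetical port name; exposes a Wishbone user port.
        "wishbone_0": {"type": "wishbone"},
    },
}


def render_config(cfg):
    """Return the configuration as JSON text (JSON is also valid YAML 1.2)."""
    return json.dumps(cfg, indent=4)


if __name__ == "__main__":
    print(render_config(config))
```

The printed text can be saved to a file and passed to the generator in place of a hand-written configuration, once the assumed keys have been checked.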

vasimr commented 3 years ago

Perfect, thanks! That worked perfectly! One question regarding using an external interface. Does the time between writes matter for initialization and calibration? Or can it be done at the leisure of the external CPU?

I have a few additional quick questions, if you don't mind (I can email them instead or create another issue if you prefer):
1) Is there a way to set the RAM module parameters via the WB bus instead of having them pre-defined? (This could be helpful for plug-and-play DIMM compatibility.)
2) Are there any additional clocking / PLL / MMCM constraints that need to be considered when using a standalone controller?
3) Is there a way to generate two controllers (for a dual-channel configuration) and have them share the same interface clock? (Generating two Verilog files is fine, but I would like their native / Wishbone / AXI interfaces to use the same clock to avoid additional clock bridging.)
4) Are 64/128-bit AXI / Wishbone buses supported?
5) What's the performance of the multi-port bridge? Does it handle bursting (e.g. for AXI)? Will it have a substantial performance hit, or would it be better to build one myself on top of the native interface?

enjoy-digital commented 3 years ago

The only requirement for the initialization/calibration is to do it before using the memory :)

For the other questions:
1) Not yet. We have plans for that, but for now the parameters are not dynamic; they are defined at build time.
2) You need to be sure to constrain the input clock correctly. The toolchain will then be able to propagate/derive the other constraints. For the electrical constraints, please look at the ones defined in the litex-boards platforms for similar boards/configurations: https://github.com/litex-hub/litex-boards/tree/master/litex_boards/platforms
3) Yes, it should be fine. If not, please report it and we'll try to fix it together. (I'm also planning to test similar configurations in the coming weeks.)
4) AXI only supports the native controller's data width for now (there are plans to support data-width adaptation), but for Wishbone this should already be possible, possibly using LiteX's WishboneConverter (https://github.com/enjoy-digital/litex/blob/master/litex/soc/interconnect/wishbone.py#L303).
5) The arbiter is currently simple (round-robin on each bank) but should be adequate for most use cases. We have plans to improve it with https://github.com/enjoy-digital/litedram/issues/209, which should be implemented in the coming weeks. So you could start with it and fall back to individual ports if the performance is not satisfactory for your use case.
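To make "round-robin on each bank" concrete, here is a small behavioral model in Python. It is not LiteDRAM's implementation (which is written as hardware description); the class name, the `grant` method, and the port/bank counts are all hypothetical. It only shows the general policy: after a port is granted, the search for the next grant starts at the following port, so no requesting port is starved.

```python
class RoundRobinArbiter:
    """Behavioral model of a round-robin arbiter (illustrative only)."""

    def __init__(self, n_ports):
        self.n_ports = n_ports
        # Start so that port 0 has the highest priority on the first grant.
        self.last = n_ports - 1

    def grant(self, requests):
        """Return the index of the granted port, or None if nobody requests.

        `requests` is a list of booleans, one per port.
        """
        for offset in range(1, self.n_ports + 1):
            port = (self.last + offset) % self.n_ports
            if requests[port]:
                self.last = port  # next search starts after this port
                return port
        return None


# Assumed sizes for illustration: one independent arbiter per bank.
N_PORTS = 3
N_BANKS = 8
bank_arbiters = [RoundRobinArbiter(N_PORTS) for _ in range(N_BANKS)]
```

With ports 0 and 1 both requesting the same bank continuously, successive calls to `grant` alternate between them, which is the fairness property a more specialized arbiter would trade away for priority or latency guarantees.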

vasimr commented 3 years ago

Thanks for the answers!

  2. I'm not sure what I should be looking at there. They mostly look like pin IO constraints. Is that all that's needed for the tool to infer the correct settings?
  3. What would be the best way to go about implementing a shared clock? Should the PLL be removed from one of the modules, which is then fed the same signals from the other module? Would that be best done with only the user_clock (I believe it's connected to the 1st buffer) while still instantiating two PLLs? Should the clock be taken before or after the buffer? (Before could cause the two clocks to go out of sync; after could put too much load on the clock buffer.)
  4./5. I didn't notice that, but now I see that they are all the native width, so I will probably have to create a bridge / arbiter myself, which is fine. Thinking about it, I need something more niche than round-robin. I would start with the default one; however, for an x32 device, a 256-bit AXI / Wishbone bus is a bit too wide.
  6. Can the IO to sysclk ratio be changed? If I understand correctly, it's 4:1 by default; however, I think 2:1 would provide better results for my application (it allows for a higher bus frequency while dropping the DRAM IO frequency).
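The width and ratio figures in this comment can be checked with a little arithmetic. The sketch below assumes that the native port width equals the DQ width times two (double data rate) times the IO-to-sysclk ratio, which matches the 256-bit figure quoted above for an x32 device at 4:1. The function names are hypothetical and the 100 MHz system clock is only an example value.

```python
def native_data_width(dq_bits, ratio, ddr=True):
    """Bits transferred per system clock cycle on the native port.

    Assumes width = DQ bits * (2 if DDR) * IO-to-sysclk ratio.
    """
    return dq_bits * (2 if ddr else 1) * ratio


def dram_clk_hz(sys_clk_hz, ratio):
    """DRAM IO clock implied by a given system clock and ratio."""
    return sys_clk_hz * ratio


# x32 device at the default 4:1 ratio: 32 * 2 * 4 = 256 bits.
width_4_to_1 = native_data_width(32, 4)

# Same device at 2:1: 32 * 2 * 2 = 128 bits.
width_2_to_1 = native_data_width(32, 2)

# Example only: a 100 MHz system clock at 4:1 implies a 400 MHz DRAM clock.
example_dram_clk = dram_clk_hz(100e6, 4)
```

Under this assumption, moving from 4:1 to 2:1 halves the native width, so the same DRAM bandwidth requires the system clock to run twice as fast.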