enjoy-digital / litedram

Small footprint and configurable DRAM core
Other
365 stars 115 forks source link

DFI rate converter #258

Closed jedrzejboczar closed 2 years ago

jedrzejboczar commented 3 years ago

This PR change set is on top of https://github.com/enjoy-digital/litedram/pull/257, please use the following diff to see the relevant changes: https://github.com/antmicro/litedram/compare/jboc/lpddr5-split...jboc/dfi-converter This is marked as draft as it includes an example PHY wrapper (which is used by custom target in litex-boards) and requires minor fix in litex (see below). I also wonder what you think about this and if you have any suggestions. I would appreciate any feedback.

The converter is meant as a module that may be useful for wrapping PHYs to adjust MC-to-DRAM frequency relation. The DFIRateConverter module allows to wrap DFI with a new DFI at N times lower frequency with N times more phases (dfi_databits is also reduced N times so that overall data bits in a burst stays intact).

This PR includes a set of tests under test/test_dfi.py and the converter class is defined here: https://github.com/antmicro/litedram/blob/jboc/dfi-converter/litedram/phy/dfi.py#L105.

An example PHY wrapper is here: https://github.com/antmicro/litedram/blob/jboc/dfi-converter/litedram/phy/s7ddrphy.py#L504. It wraps S7DDRPHY making it a 1:8 PHY. The target file that I used for testing can be found here: https://github.com/antmicro/litex-boards/blob/jboc/dfi-rate-converter/litex_boards/targets/digilent_arty.py (it replaces PHY class, uses 1:8 for module ratio, has updated CRG with synchronized reset and some helpers for debugging). The PHY wrapper adjusts PhySettings and creates the original PHY in different clock domain. For this reason the original S7DDRPHY required modifications to be able to add proper CDC to CRGs (see this commit).

There is an issue with how BIOS firmware checks the DFI data with databits < 32, which I temporarily fixed using with this change. It is related to the issue I described earlier in https://github.com/enjoy-digital/litedram/issues/256.

Currently there is a limitation regarding rdphase/wrphase which cannot be changed at runtime. This is because of write_delay/read_delay (see the docstring), maybe it would be possible to support changes at runtime, though it would require investigating.

Here is the output on Arty at 50 MHz sys, 800 MT/s:

        __   _ __      _  __
       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
     /____/_/\__/\__/_/|_|
   Build your hardware, easily!

 (c) Copyright 2012-2021 Enjoy-Digital
 (c) Copyright 2007-2015 M-Labs

 BIOS built on Jun 22 2021 12:25:01
 BIOS CRC passed (2152352f)

 Migen git sha1: 3ffd64c
 LiteX git sha1: f15b3967

--=============== SoC ==================--
CPU:            VexRiscv @ 50MHz
BUS:            WISHBONE 32-bit @ 4GiB
CSR:            32-bit data
ROM:            128KiB
SRAM:           8KiB
L2:             0KiB
SDRAM:          262144KiB 16-bit @ 800MT/s (CL-6 CWL-5)

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Write latency calibration:
m0:0 m1:0
Read leveling:
  m0, b00: |00000000000000000000000000000000| delays: -
  m0, b01: |00000000000001111111111111100000| delays: 19+-06
  m0, b02: |00000000000000000000000000000111| delays: 30+-01
  m0, b03: |00000000000000000000000000000000| delays: -
  m0, b04: |00000000000000000000000000000000| delays: -
  m0, b05: |00000000000000000000000000000000| delays: -
  m0, b06: |00000000000000000000000000000000| delays: -
  m0, b07: |00000000000000000000000000000000| delays: -
  best: m0, b01 delays: 19+-06
  m1, b00: |00000000000000000000000000000000| delays: -
  m1, b01: |00000000000001111111111111000000| delays: 19+-06
  m1, b02: |00000000000000000000000000000011| delays: 30+-01
  m1, b03: |00000000000000000000000000000000| delays: -
  m1, b04: |00000000000000000000000000000000| delays: -
  m1, b05: |00000000000000000000000000000000| delays: -
  m1, b06: |00000000000000000000000000000000| delays: -
  m1, b07: |00000000000000000000000000000000| delays: -
  best: m1, b01 delays: 19+-06
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2.0MiB)...
  Write: 0x40000000-0x40200000 2.0MiB
   Read: 0x40000000-0x40200000 2.0MiB
Memtest OK
Memspeed at 0x40000000 (2.0MiB)...
  Write speed: 16.8MiB/s
   Read speed: 8.1MiB/s
enjoy-digital commented 3 years ago

Thanks @jedrzejboczar, I need a bit of time to review it, I'll try to provide some feedback soon. (Since we are in the middle of a validation phase with a client, we'll maybe wait a bit before merging).

jedrzejboczar commented 2 years ago

I have tested your fix from https://github.com/enjoy-digital/litedram/issues/256 on the design with 1:8 DDR3 on Arty and I noticed that we would need a small modification to make it work.

When adding the DFI rate converter on top of S7DDRPHY, we double the number of DFI phases and decrease DFI databits by half (DFI modification, new PhySettings). But in current setup DFII_PIX_DATA_BYTES (=SDRAM_PHY_DATABITS*SDRAM_PHY_XDR/8) will still maintain the same value, because PhySettings.databits doesn't change (only PhySettings.dfi_databits). Using real numbers, on default Arty we have DFII_PIX_DATA_BYTES=16*2/8=4, while with the wrapper it should be 2. Maybe instead of calculating DFII_PIX_DATA_BYTES in code we could generate it as a constant in sdram_phy.h using the value of PhySettings.dfi_databits/8?

But even setting this to 2 is not enough. This is because of this part of sdram_write_read_check_test_pattern. Because we still have 2 modules, this will try to index prs[p] from 0 to 3. As far as I understand there are these 2 comparisons due to the fact that on DDR memories we always have double the databits on DFI. A partial solution would be to rewrite it as:

for (int i = 0; i < DFII_PIX_DATA_BYTES/(SDRAM_PHY_DATABITS/8); ++i) {
    errors += popcount(prs[p][(i+1)*SDRAM_PHY_MODULES-1-module] ^ tst[(i+1)*SDRAM_PHY_MODULES-1-module]);
}

but then this will only allow for DFI rate converter ratio=2. With ratio=4 we would get DFII_PIX_DATA_BYTES/(SDRAM_PHY_DATABITS/8) = 1/2 = 0.

I tried to visualize the mapping of DFI data bytes for this example with the following:

normal: 4 phases, 16 databits, 32 dfi_databits
| p0                | p1                | p2                | p3                |
| XDR.0   | XDR.1   | XDR.0   | XDR.1   | XDR.0   | XDR.1   | XDR.0   | XDR.1   |
| m0 | m1 | m0 | m1 | m0 | m1 | m0 | m1 | m0 | m1 | m0 | m1 | m0 | m1 | m0 | m1 |
final data width = 4*4 bytes = 128 bits

DFI rate converter ratio=2: 8 phases, 16 databits, 16 dfi_databits
| p0      | p1      | p2      | p3      | p4      | p5      | p6      | p7      |
| m0 | m1 | m0 | m1 | m0 | m1 | m0 | m1 | m0 | m1 | m0 | m1 | m0 | m1 | m0 | m1 |
final data width = 8*2 bytes = 128 bits

DFI rate converter ratio=4: 16 phases, 16 databits, 8 dfi_databits
| p0  | p1  | p2  | p3  | p4  | p5  | p6  | p7  | p8  | p9  | p10 | p11 | p12 | p13 | p14 | p15 |
| m0  | m1  | m0  | m1  | m0  | m1  | m0  | m1  | m0  | m1  | m0  | m1  | m0  | m1  | m0  | m1  |
final data width = 16*1 bytes = 128 bits

Based on this the following code should be more general:

errors = 0;
for(p=0;p<SDRAM_PHY_PHASES;p++) {
    /* Read back test pattern */
    csr_rd_buf_uint8(sdram_dfii_pix_rddata_addr(p), tst, DFII_PIX_DATA_BYTES);
    /* Verify bytes matching current 'module' */
    for (int i = 0; i < DFII_PIX_DATA_BYTES; ++i) {
        int j = p * DFII_PIX_DATA_BYTES + i;
        if (j % SDRAM_PHY_MODULES == SDRAM_PHY_MODULES-1-module) {
            errors += popcount(prs[p][i] ^ tst[i]);
        }
    }
}

One thing I am not sure is why I had to use the (j % SDRAM_PHY_MODULES == SDRAM_PHY_MODULES-1-module) condition instead of (j % SDRAM_PHY_MODULES == module). It seems that modules order is reverted.

I've tested such solution (with #define DFII_PIX_DATA_BYTES SDRAM_PHY_DFI_DATABITS/8 and SDRAM_PHY_DFI_DATABITS being generated in sdram_phy.h) on the regular S7DDRPHY on Arty and with DFI converter ratio=2 and it worked. If you think it would be a good solution than I can prepare PR to LiteX with the changes.

jedrzejboczar commented 2 years ago

Closing this, as I opened a new PR in https://github.com/enjoy-digital/litedram/pull/269.