OVGN / OpenHBMC

Open-source high performance AXI4-based HyperRAM memory controller
Apache License 2.0

How does DRU work? #5

Closed OVGN closed 2 years ago

OVGN commented 2 years ago

Hello!

This thread is a discussion of how the DRU (data recovery unit) operates. Its contents will probably be added to the IP core documentation soon.

Feel free to ask questions.

OVGN commented 2 years ago

The main difficulty with the HyperBUS interface (when the DCARS feature is unavailable or unused) is sampling valid data during read transactions.

Solution 1 (easiest and worst)

The most common approach is to sample the data bus with the HyperBUS clock CK_P/CK_N. This method is quite simple, but it also gives the worst result. Unfortunately, tCKD varies too much, i.e. the delay from CK_P/CK_N to the data valid window is not stable at all. Of course, this will probably work at low frequencies, but there is no chance of reliable operation at high frequencies.

tCKD_is_not_stable

Solution 2 (DLL based)

The RWDS strobe is edge-aligned to the incoming data DQ[7:0], so RWDS can be delayed to the center of the data eye. This method is much better than the previous one, and high-frequency operation is easily achieved. Unfortunately, it has some flaws. It requires a periodic calibration procedure. In addition, RWDS is usually delayed by DLLs, which have a limited delay range, so at low HyperBUS clock frequencies it becomes impossible to shift RWDS all the way to the center of the data eye.
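As a rough illustration of the delay-range limit, here is a small sketch. It assumes an ideal edge-aligned RWDS and DDR data, so each byte lasts half a HyperBUS clock period and centering the strobe needs a delay of a quarter period. `max_delay_ns` is a hypothetical delay-line range chosen for the example, not a figure from any particular device.

```python
def required_rwds_delay_ns(f_ck_mhz: float) -> float:
    """Delay that moves an edge-aligned RWDS to the middle of a DDR data byte."""
    period_ns = 1000.0 / f_ck_mhz
    return period_ns / 4.0  # a byte lasts T/2, so its center is T/4 after the edge


def min_centerable_freq_mhz(max_delay_ns: float) -> float:
    """Lowest clock frequency at which a delay of T/4 still fits in the delay range."""
    return 1000.0 / (4.0 * max_delay_ns)


if __name__ == "__main__":
    max_delay_ns = 2.5  # hypothetical delay-line range, for illustration only
    for f_ck_mhz in (50.0, 100.0, 166.0):
        need = required_rwds_delay_ns(f_ck_mhz)
        print(f"{f_ck_mhz:6.1f} MHz: need {need:.2f} ns, reachable: {need <= max_delay_ns}")
```

The lower the clock frequency, the longer the required delay, which is why a fixed delay range sets a floor on the frequency at which the strobe can still be centered.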

Solution 3 (method used in OpenHBMC)

The idea behind OpenHBMC's DRU is a robust data recovery unit that is stable across PVT (process, voltage, temperature) and does not need any kind of calibration. The DRU is based on oversampling. The oversampling clock runs at three times the HyperBUS clock frequency. RWDS and DQ[7:0] are sampled on both edges of the oversampling clock, so there are six samples per HyperBUS clock period.
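To put numbers on the oversampling ratio, here is a minimal sketch. The only assumption beyond the text above is that the bus is DDR, i.e. two data bytes per HyperBUS clock period. The function name and the example frequency are hypothetical.

```python
def oversampling_figures(f_ck_mhz: float) -> dict:
    """Basic timing figures for 3x oversampling on both clock edges."""
    period_ns = 1000.0 / f_ck_mhz
    samples_per_period = 3 * 2  # 3x clock, sampled on rising and falling edges
    return {
        "oversampling_clock_mhz": 3.0 * f_ck_mhz,
        "samples_per_ck_period": samples_per_period,
        "samples_per_data_byte": samples_per_period // 2,  # DDR: two bytes per period
        "sample_spacing_ns": period_ns / samples_per_period,
    }


if __name__ == "__main__":
    for name, value in oversampling_figures(100.0).items():  # 100 MHz is an example value
        print(f"{name}: {value}")
```

So every data byte is seen by three consecutive samples, whatever the clock frequency.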

The DRU algorithm is quite simple:

  1. Detect any RWDS edge.
  2. Next DQ[7:0] sample is valid.
  3. Repeat...

dru_principle
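The steps above can be sketched as a behavioral model in Python. This is not the OpenHBMC RTL, only an illustration under stated assumptions: samples are ideal (no metastability), RWDS idles low before the first byte, and "next sample" means the sample immediately after the one where the RWDS change is first seen. `dru_recover` and `ideal_stream` are hypothetical names.

```python
from typing import Iterable, List, Optional, Tuple

Sample = Tuple[int, int]  # (rwds, dq) as captured on one oversampling clock edge


def dru_recover(samples: Iterable[Sample]) -> List[int]:
    """Recover data bytes from a stream of oversampled (rwds, dq) pairs."""
    recovered: List[int] = []
    prev_rwds: Optional[int] = None
    capture_next = False
    for rwds, dq in samples:
        if capture_next:
            recovered.append(dq)  # step 2: the sample after the edge is valid
            capture_next = False
        if prev_rwds is not None and rwds != prev_rwds:
            capture_next = True   # step 1: RWDS edge, rising or falling
        prev_rwds = rwds
    return recovered


def ideal_stream(data: List[int], idle: int = 2, per_byte: int = 3) -> List[Sample]:
    """Build an ideal sample stream: idle samples, then three samples per byte."""
    stream: List[Sample] = [(0, 0x00)] * idle  # RWDS assumed low while idle
    for i, byte in enumerate(data):
        rwds = 1 - (i % 2)  # RWDS toggles with every byte, starting high
        stream += [(rwds, byte)] * per_byte
    return stream


if __name__ == "__main__":
    data = [0xA1, 0xB2, 0xC3, 0xD4]
    print([hex(b) for b in dru_recover(ideal_stream(data))])
```

A real controller would also bound the capture by the expected word count, which this model leaves out.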

The DRU does not always sample data at the center of the data eye, but it is guaranteed to sample within a sampling window that is in fact much smaller than the HyperRAM data valid window:

dru_sample_window

The movement of the sampling arrows shows that RWDS/DQ[7:0] can be fully asynchronous to the oversampling clock. For any phase relation, the DRU captures data within the sampling window.

dru_sampling_animated
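The phase independence can be checked with a small sweep. The sketch below is an idealized model, not the project's testbench: it assumes a jitter-free RWDS that toggles exactly at each byte boundary, no skew between RWDS and DQ, and integer time units chosen so that one byte lasts 300 units and samples are 100 units apart. For every sampling phase it records how far after the data edge the capture lands.

```python
BIT = 300   # duration of one DDR byte, arbitrary integer time units
STEP = 100  # oversampling interval: six samples per HyperBUS period (2 * BIT)


def capture_offsets(phase: int, n_bytes: int = 8) -> list:
    """Capture time minus data edge time, for each byte, at one sampling phase."""
    offsets = []
    prev_rwds = 0        # RWDS assumed low before the first byte
    pending_edge = None  # time of the data edge whose byte is captured next
    t = phase
    while t < (n_bytes + 1) * BIT:
        if pending_edge is not None:
            offsets.append(t - pending_edge)  # sample after the edge is the capture
            pending_edge = None
        rwds = (t // BIT) % 2  # ideal RWDS, toggling at every byte boundary
        if rwds != prev_rwds:
            pending_edge = (t // BIT) * BIT  # edge seen: remember when it occurred
        prev_rwds = rwds
        t += STEP
    return offsets


if __name__ == "__main__":
    all_offsets = [o for phase in range(STEP) for o in capture_offsets(phase)]
    lo, hi = min(all_offsets), max(all_offsets)
    print(f"capture lands between {lo / BIT:.3f} and {hi / BIT:.3f} of the byte time")
```

Under these assumptions the capture point always falls in the middle third of the byte, i.e. a window one sample interval wide, regardless of phase.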

DRU operation has also been verified in a testbench, which will be published soon.

richardc87 commented 2 years ago

This is awesome, thanks for posting this.