OVGN / OpenHBMC

Open-source high performance AXI4-based HyperRAM memory controller
Apache License 2.0
56 stars 12 forks source link

ISERDESE2 reset problem #8

Open mohammadhgh opened 2 years ago

mohammadhgh commented 2 years ago

Hello

I use a 64 Mbit Hyper-RAM as Microblaze main memory and use a SREC boot-loader to load the program from a QSPI flash memory. Most of the time everything is OK but in some cases after loading the Hyper-RAM with the application file, reading first instruction from Hyper-RAM fails and program halts.

I captured the issue with ILA core and saw that the problem is because of wrong value (all ones) at the output of the RWDS ISERDESE2. This happens on the first time read after the power up. Sometimes, two clock cycles after reset deassertion, an output of all ones comes out of the RWDS hbmc_iobuf which causes the DRU unit to capture an extra 16-bit word before the actual burst transfer begins.

Screenshot from 2022-07-27 17-08-52

According to UG471,

After deassertion of reset, the output is not valid until after two CLKDIV cycles.

So I added an iserdes_q_invalid flag to ignore the ISERDESE2 output for two clock cycles of iserdes_clkdiv after arst falling edge.

diff --git a/OpenHBMC/hdl/hbmc_iobuf.v b/OpenHBMC/hdl/hbmc_iobuf.v
index 1df7ccf..77296db 100644
--- a/OpenHBMC/hdl/hbmc_iobuf.v
+++ b/OpenHBMC/hdl/hbmc_iobuf.v
@@ -55,6 +55,9 @@ module hbmc_iobuf #
     wire            iserdes_d;
     wire    [5:0]   iserdes_q;
     wire            iserdes_ddly;
+
+    reg             arst_shift_reg      [0:1];
+    wire            iserdes_q_invalid;

 /*----------------------------------------------------------------------------------------------------------------------------*/
@@ -221,6 +224,24 @@ module hbmc_iobuf #
         .SHIFTIN2       ( 1'b0              )
     );

+/*----------------------------------------------------------------------------------------------------------------------------*/
+
+    /* ISERDESE2 reset extender 
+     * According to UG471, ISERDESE2 output is invalid for two clock cycles after reset deassertion.
+     */
+
+    always @(posedge iserdes_clkdiv or posedge arst) begin
+        if (arst) begin
+            arst_shift_reg[0] <= 1'd1;
+            arst_shift_reg[1] <= 1'd1;
+        end else begin
+            arst_shift_reg[0] <= 1'd0;
+            arst_shift_reg[1] <= arst_shift_reg[0];
+        end
+    end
+
+    assign  iserdes_q_invalid = arst_shift_reg[1];
+
 /*----------------------------------------------------------------------------------------------------------------------------*/

     /* Register ISERDESE2 output */
@@ -228,7 +249,11 @@ module hbmc_iobuf #
         if (arst) begin
             iserdes_o <= {6{1'b0}};
         end else begin
-            iserdes_o <= iserdes_q;
+            if (iserdes_q_invalid) begin
+                iserdes_o <= {6{1'b0}};
+            end else begin
+                iserdes_o <= iserdes_q;
+            end
         end
     end

This seems to work and the boot problem is now resolved.

@OVGN To solve this I had to deep dive into the code and I have to say you have done a wonderful job, thank you very much for sharing it! :+1:

OVGN commented 2 years ago

Hello, @mohammadhgh!!

Thank you so much for your efforts in investigation of this issue! You did a great job too!

I researched this issue for several day, trying to catch this bug in simulation. Finally, I could do that. Results are quite interesting.

There is a recomendation from Cypress to have a weak pull-down at the RWDS line: https://community.infineon.com/t5/Knowledge-Base-Articles/Recommended-Configuration-for-RWDS-Pin-in-HyperBus-Memory-KBA224897/ta-p/252853

Actually, I have this weak pullup on my target board with HyperRAM memory: hyperRam_rdws_pulldown_scheme

Also in my testbench I simulate this pull-down over this code:

pulldown (weak0) (hbmc_rwds);

That's why I didn't face this issue not during simulation, nor at the hardware. On my simulation waves you can see that there is no wrong data at the ISERDES output like in your case: rwds_pull_down

I decided to comment this pull-down to let the RWDS float, and....got this: rwds_floating

If I change RWDS pull-down to pull-up I will get extra 16-bit word like you: rwds_pull_up

Yes, you are right, ISERDES outputs have invalid data within first two CLKDIV cycles after reset, but if we fix only this, we will not remove all X-states at the ISERDES outputs: rwds_floating_with_iserdes_reset_fix (Comparing with previous waveform, you see X here instead of 1's as I disabled RWDS pull-up/down) I believe that this probably is going to work somehow for you, hb_recov_data_vld looks good, but I'm worring about X-states at the ISERDES outputs after the end of the burst. This doesn't look reliable for me.

Though my design is working well with pull-down at the RWDS, anyway I think this is a HDL bug, as you can hardly find such pull-down recomendation for RWDS in any other HyperRAM part datasheet or even PCB guide from Cypress (Infenion): https://www.infineon.com/dgdl/Infineon-AN211622_HyperFlash_and_HyperRAM_Layout_Guide-ApplicationNotes-v03_00-EN.pdf?fileId=8ac78c8c7cdc391c017d0d240bc961f5

I'm going to make a more reliable fix. I think that if I find a way to drive ISERDES reset properly, I will be able to remove all X-states even if RDWS input do not have pull-down resistor.

@mohammadhgh, thanks again for finding this bug! I'm working on this issue and going to fix it soon.

Best regards, OVGN

AnttiLukats commented 1 year ago

Just a question is this BUG fixed finally or not?

OVGN commented 1 year ago

Just a question is this BUG fixed finally or not?

Hello! Yes, I have a fix, though I haven't committed it yet... Need a bit more time to tests. But this is a bug, I'm going to fix first.