Thanks for the report and investigation. This was introduced to avoid errors with upstream Yosys. I will have a closer look at the generated code with or without this commit.
I ran some more tests with the latest LiteX, and it looks like this commit could be just one element of a more complex bug. When I built the latest LiteX with and without the mentioned commit, `memtest` failed either way. Only when I reverted to the revisions mentioned in the issue and then reverted the commit did `memtest` work again. So I think the problem is more complex, and reverting the mentioned commit is not the ultimate fix for it.
Thanks @kowalewskijan for the feedback, I also did some tests on the KCU105 and came to the same conclusion. I'm going to investigate more.
@kowalewskijan: I'm no longer able to reproduce the issue on the KCU105. With upstream LiteX/LiteDRAM on the ZCU104, do you still get a Cmd/Clk scan reporting only zeroes? Can you also try lowering `sys_clk_freq` to 100MHz to see if it works?
OK, thanks for the test results. When testing on the KCU105, memtest was passing at 125MHz but the read leveling scan was not good on one module. I'll investigate this and will probably ask you to do some tests on the ZCU104 once I have improved things on the KCU105.
For now you can use a 100MHz clock on the ZCU104.
I am sorry, but I deleted the original post because I doubted that I had changed the clock correctly. I still get zeros using the latest code, but I think I tricked Vivado about the clock value instead of actually changing it. What I did was change `default_clk_period` in the platform like this:
default_clk_name = "clk125"
default_clk_period = 1e9/100e6
self.add_period_constraint(self.lookup_request("clk125", loose=True), 1e9/100e6)
Correct me if I'm wrong, but am I actually telling Vivado that the clock has a different value than it really has?
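For reference, the arithmetic behind this question can be sketched as below. It assumes, as the name `clk125` suggests, that the board oscillator runs at 125 MHz; the helper name `period_ns` is made up for illustration. A period constraint only describes a clock to the timing analyzer and does not reprogram the oscillator, so a mismatch like this changes what Vivado checks, not what the hardware does.

```python
def period_ns(freq_hz):
    """Clock period in nanoseconds for a frequency given in Hz."""
    return 1e9 / freq_hz

# Assumption: clk125 is a fixed 125 MHz oscillator on the board.
actual_ns = period_ns(125e6)
# Value passed to add_period_constraint in the snippet above.
constrained_ns = period_ns(100e6)

print(f"oscillator: {actual_ns:.3f} ns, constraint: {constrained_ns:.3f} ns")
```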
@kowalewskijan: in fact, to change the frequency you just need to set `sys_clk_freq` to `100e6` here: https://github.com/litex-hub/litex-boards/blob/master/litex_boards/targets/zcu104.py#L54
Thanks, I changed `sys_clk_freq` to `100e6`, but I also had to change the `pll.create_clkout(self.cd_clk500, 400e6, with_reset=False)` call from `500e6` to `400e6` to avoid this error:
File "litex/litex/soc/cores/clock.py", line 153, in compute_config
raise ValueError("No PLL config found")
ValueError: No PLL config found
This is my clock summary after making the changes mentioned above:
------------------------------------------------------------------------------------------------
| Clock Summary
| -------------
------------------------------------------------------------------------------------------------
Clock Waveform(ns) Period(ns) Frequency(MHz)
----- ------------ ---------- --------------
clk125_p {0.000 4.000} 8.000 125.000
main_clkout1 {0.000 1.241} 2.483 402.778
pll4x_clk {0.000 1.241} 2.483 402.778
pll4x_clk_DIV4_INV {4.966 9.931} 9.931 100.694
sys_clk {0.000 4.966} 9.931 100.694
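The `No PLL config found` error is consistent with a simple divider search. The sketch below is not LiteX's `compute_config`; it is a simplified brute-force search written for illustration, and the VCO range (600 to 1440 MHz), the 1% frequency margin, the divider and multiplier ranges, and the function name `find_pll_config` are all assumptions. With integer output dividers, producing both 400 MHz and 500 MHz from one VCO would need a VCO near 2000 MHz, which is outside the assumed range.

```python
def find_pll_config(clkin_hz, targets_hz, vco_range=(600e6, 1440e6), margin=1e-2):
    """Return (divclk_divide, mult, [output dividers]) or None if nothing fits.

    All ranges here are assumptions for illustration, not datasheet values.
    """
    vco_min, vco_max = vco_range
    for divclk in range(1, 11):            # assumed input divider range
        for mult in range(2, 65):          # assumed feedback multiplier range
            vco = clkin_hz * mult / divclk
            if not (vco_min <= vco <= vco_max):
                continue
            dividers = []
            for target in targets_hz:
                for d in range(1, 129):    # integer output dividers only
                    if abs(vco / d - target) <= margin * target:
                        dividers.append(d)
                        break
                else:
                    break                  # this target is unreachable from this VCO
            else:
                return divclk, mult, dividers  # every target found a divider
    return None

print(find_pll_config(125e6, [400e6, 500e6]))  # 4x sys_clk plus a 500 MHz output
print(find_pll_config(125e6, [400e6, 400e6]))  # 4x sys_clk plus a 400 MHz output
```

Under these assumptions the second search settles on 125 MHz × 29 / 3 / 3 ≈ 402.78 MHz, which is in line with the `pll4x_clk` entry in the clock summary above.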
Leveling and memtest still fail. Only when I changed the platform constraints mentioned in my previous post did I get a success. I will investigate more.
@enjoy-digital As noted in https://github.com/enjoy-digital/litedram/pull/204#issue-426572558 I encountered some problems while testing the DDR4 SPD parser.
I did some more tests today with @kowalewskijan and the results are a bit confusing. The first tests were built with the commit reverted. When changing nothing, only `tCCD`, or only `tFAW`, leveling failed, but when both timings were changed to the values from the SPD data, leveling succeeded (zcu104_logs.zip). Without reverting that commit, `tfaw_tccd` still failed.
We've also tested full module parameter generation from SPD for both `MTA4ATF51264HZ` and `KVR21SE15S84`, and for the first one leveling worked. However, when modifying the `MTA4ATF51264HZ` parameters directly in `modules.py` to match those from the SPD data, leveling failed. The only real difference in the Verilog was that one gateware had the SPD data stored in ROM and the other one didn't.
So all these results seem a bit random to me and I am not sure if this will help much. I've pushed the changes to our repos, maybe they will be helpful:
- https://github.com/antmicro/litex-boards/tree/zcu104-spd - ZCU104 command line arguments
- https://github.com/antmicro/litedram/tree/zcu104-spd - module parameters
I did some more extensive tests. Here are the results:
| Test case No. | RAM module | Description | Result |
|---|---|---|---|
| 1 | MTA4ATF51264HZ | increased tRC by 1 cycle, rate=1:4 | Passed |
| 2 | MTA4ATF51264HZ | increased tRC by 1 cycle, rate=1:2 | Failed (1 module failed - m3) |
| 3 | MTA4ATF51264HZ | increased tRC by 1 cycle, rate=1:1 | Failed (all modules failed) |
| 4 | MTA4ATF51264HZ | rate=1:4 | Failed (all modules failed) |
| 5 | MTA4ATF51264HZ | rate=1:2 | Failed (modules m2 and m3 failed) |
| 6 | MTA4ATF51264HZ | rate=1:1 | Passed |
| 7 | KVR21SE15S84 | increased tRC by 1 cycle, rate=1:4 | Failed (only m5 module passed) |
| 8 | KVR21SE15S84 | increased tRC by 1 cycle, rate=1:2 | Failed (only m5 module passed) |
| 9 | KVR21SE15S84 | increased tRC by 1 cycle, rate=1:1 | Failed (only m5 module passed) |
| 10 | KVR21SE15S84 | rate=1:4 | Failed (only m5 module passed) |
| 11 | KVR21SE15S84 | rate=1:2 | Failed (all modules failed) |
| 12 | KVR21SE15S84 | rate=1:1 | Failed (only m5 module passed) |
| 13 | KVR21SE15S84 | increased tRC by 5 cycles, rate=1:4 | Failed (only m5 module passed) |
| 14 | KVR21SE15S84 | fine_refresh_rate=2x, rate=1:4 | Failed (only m5 module passed) |
| 15 | KVR21SE15S84 | speedgrade=-1, rate=1:4 | Failed (only m5 module passed) |
I investigated the cycle calculation functions, the code around the timing controllers, and the BIOS software for leveling and memtest, and I ran some simulations, but found no hint of where the bug hides. I still believe it is associated with timings somehow. I got success with MTA4ATF51264HZ, as mentioned in the table, with this configuration
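For context on the "increased tRC by 1 cycle" rows, a nanosecond timing is usually turned into controller clock cycles by rounding up. The sketch below is a generic illustration, not LiteDRAM's actual calculation; the function name `ns_to_cycles` and the 46 ns tRC value are placeholders chosen for the example, not values read from either module's datasheet or SPD.

```python
import math

def ns_to_cycles(t_ns, clk_freq_hz):
    """Smallest whole number of clock cycles that covers t_ns."""
    return math.ceil(t_ns * clk_freq_hz / 1e9)

sys_clk_freq = 125e6   # controller clock used in this thread
trc_ns = 46.0          # placeholder tRC, for illustration only

base = ns_to_cycles(trc_ns, sys_clk_freq)
print(f"tRC = {trc_ns} ns -> {base} cycles, {base + 1} with one extra cycle")
```

Because of the rounding, a timing that sits just above a cycle boundary costs a full extra cycle, so small changes in the nanosecond value or in `sys_clk_freq` can shift the result by one.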
Thanks @kowalewskijan for the results. What's the actual speedgrade of the MTA4ATF51264HZ that is used, 2666? Have you also checked the speedgrade of the KVR21SE15S8?
For the rate, I don't think it's worth iterating on it since it should be 1:4 for DDR4.
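The relation between `sys_clk_freq`, the 1:4 rate and the DDR4 data rate can be sketched as follows. This is a back-of-the-envelope calculation, and the function name `data_rate_mts` is made up for illustration; it assumes the memory clock is `sys_clk_freq` times the rate and that data is transferred on both clock edges.

```python
def data_rate_mts(sys_clk_freq_hz, phy_rate=4):
    """Data rate in MT/s for a 1:phy_rate PHY with double data rate."""
    dram_clk_hz = sys_clk_freq_hz * phy_rate   # memory clock
    return 2 * dram_clk_hz / 1e6               # two transfers per memory clock

for f in (100e6, 125e6):
    print(f"{f / 1e6:.0f} MHz sys_clk -> {data_rate_mts(f):.0f} MT/s")
```

At 125 MHz this gives 1000 MT/s, which matches the `SDRAM: ... @ 1000MT/s` line in the BIOS output later in this thread.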
During calibration, we generate simple access patterns to the controller and do not really stress it, so even with small timing errors the calibration could pass (but memtest would fail). So there is probably something else. I'll also do more tests on UltraScale boards when I have more time.
Just in case, could you try commenting out this: https://github.com/enjoy-digital/litedram/blob/master/litedram/phy/usddrphy.py#L534-L535 and see if you get the same behavior?
@enjoy-digital I can confirm that in my setup the MTA4ATF51264HZ has a 2666 speedgrade and the KVR21SE15S8 has a 2133 speedgrade. I generated bitstreams with the lines you mentioned commented out. For both the Kingston and Micron RAMs, leveling and memtest failed for all modules.
As tested recently, and with the recent improvements, the ZCU104 is now calibrating correctly:
__ _ __ _ __
/ / (_) /____ | |/_/
/ /__/ / __/ -_)> <
/____/_/\__/\__/_/|_|
Build your hardware, easily!
(c) Copyright 2012-2020 Enjoy-Digital
(c) Copyright 2007-2015 M-Labs
BIOS built on Dec 14 2020 11:10:14
BIOS CRC passed (b1311357)
Migen git sha1: 11a297f
LiteX git sha1: 649edd18
--=============== SoC ==================--
CPU: VexRiscv @ 125MHz
BUS: WISHBONE 32-bit @ 4GiB
CSR: 32-bit data
ROM: 32KiB
SRAM: 8KiB
L2: 8KiB
SDRAM: 1048576KiB 64-bit @ 1000MT/s (CL-9 CWL-9)
--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Write leveling:
Cmd/Clk scan (0-334)
|000000 |0000 |0000 |0000| best: -1
Setting Cmd/Clk delay to -1 taps.
Data scan:
m0: |1110000000000000000011| delay: -
m1: |1110000000000000000011| delay: -
m2: |1111100000000000000000| delay: -
m3: |1111100000000000000000| delay: -
m4: |1111111000000000000000| delay: -
m5: |1111111100000000000000| delay: -
m6: |1111111111000000000000| delay: -
m7: |1111111100000000000000| delay: -
Write latency calibration:
m0:6 m1:6 m2:6 m3:6 m4:6 m5:6 m6:6 m7:6
Read leveling:
m0, b0: |00000000000000000000000000000000| delays: -
m0, b1: |00000000000000000000000000000000| delays: -
m0, b2: |11000000000000000000000000000000| delays: 09+-09
m0, b3: |00001111111111111111000000000000| delays: 180+-128
m0, b4: |00000000000000000000001111111111| delays: 428+-83
m0, b5: |00000000000000000000000000000000| delays: -
m0, b6: |00000000000000000000000000000000| delays: -
m0, b7: |00000000000000000000000000000000| delays: -
best: m0, b03 delays: 180+-127
m1, b0: |00000000000000000000000000000000| delays: -
m1, b1: |00000000000000000000000000000000| delays: -
m1, b2: |11000000000000000000000000000000| delays: 09+-09
m1, b3: |00001111111111111111000000000000| delays: 181+-127
m1, b4: |00000000000000000000001111111111| delays: 428+-84
m1, b5: |00000000000000000000000000000000| delays: -
m1, b6: |00000000000000000000000000000000| delays: -
m1, b7: |00000000000000000000000000000000| delays: -
best: m1, b03 delays: 182+-126
m2, b0: |00000000000000000000000000000000| delays: -
m2, b1: |00000000000000000000000000000000| delays: -
m2, b2: |00000000000000000000000000000000| delays: -
m2, b3: |01111111111111111000000000000000| delays: 132+-127
m2, b4: |00000000000000000001111111111111| delays: 403+-109
m2, b5: |00000000000000000000000000000000| delays: -
m2, b6: |00000000000000000000000000000000| delays: -
m2, b7: |00000000000000000000000000000000| delays: -
best: m2, b03 delays: 130+-127
m3, b0: |00000000000000000000000000000000| delays: -
m3, b1: |00000000000000000000000000000000| delays: -
m3, b2: |00000000000000000000000000000000| delays: -
m3, b3: |01111111111111111000000000000000| delays: 138+-129
m3, b4: |00000000000000000001111111111111| delays: 407+-105
m3, b5: |00000000000000000000000000000000| delays: -
m3, b6: |00000000000000000000000000000000| delays: -
m3, b7: |00000000000000000000000000000000| delays: -
best: m3, b03 delays: 139+-130
m4, b0: |00000000000000000000000000000000| delays: -
m4, b1: |00000000000000000000000000000000| delays: -
m4, b2: |00000000000000000000000000000000| delays: -
m4, b3: |11111111111110000000000000000000| delays: 101+-101
m4, b4: |00000000000000001111111111111111| delays: 369+-126
m4, b5: |00000000000000000000000000000000| delays: -
m4, b6: |00000000000000000000000000000000| delays: -
m4, b7: |00000000000000000000000000000000| delays: -
best: m4, b04 delays: 368+-126
m5, b0: |00000000000000000000000000000000| delays: -
m5, b1: |00000000000000000000000000000000| delays: -
m5, b2: |00000000000000000000000000000000| delays: -
m5, b3: |11111111111100000000000000000000| delays: 92+-92
m5, b4: |00000000000000011111111111111100| delays: 353+-125
m5, b5: |00000000000000000000000000000000| delays: -
m5, b6: |00000000000000000000000000000000| delays: -
m5, b7: |00000000000000000000000000000000| delays: -
best: m5, b04 delays: 351+-125
m6, b0: |00000000000000000000000000000000| delays: -
m6, b1: |00000000000000000000000000000000| delays: -
m6, b2: |00000000000000000000000000000000| delays: -
m6, b3: |11111111110000000000000000000000| delays: 78+-78
m6, b4: |00000000000011111111111111110000| delays: 317+-127
m6, b5: |00000000000000000000000000000001| delays: 495+-16
m6, b6: |00000000000000000000000000000000| delays: -
m6, b7: |00000000000000000000000000000000| delays: -
best: m6, b04 delays: 319+-128
m7, b0: |00000000000000000000000000000000| delays: -
m7, b1: |00000000000000000000000000000000| delays: -
m7, b2: |00000000000000000000000000000000| delays: -
m7, b3: |11111111100000000000000000000000| delays: 65+-65
m7, b4: |00000000000111111111111111100000| delays: 290+-133
m7, b5: |00000000000000000000000000000111| delays: 483+-28
m7, b6: |00000000000000000000000000000000| delays: -
m7, b7: |00000000000000000000000000000000| delays: -
best: m7, b04 delays: 292+-131
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2MiB)...
Write: 0x40000000-0x40200000 2MiB
Read: 0x40000000-0x40200000 2MiB
Memtest OK
Memspeed at 0x40000000 (2MiB)...
Write speed: 38MiB/s
Read speed: 33MiB/s
--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
Timeout
No boot medium found
--============= Console ================--
litex> sdram_test
Memtest at 0x40000000 (32MiB)...
Write: 0x40000000-0x42000000 32MiB
Read: 0x40000000-0x42000000 32MiB
Memtest OK
litex>
Some improvements can still be made to the Cmd/Clk scan, but this will be addressed separately.
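As a reading aid for the read leveling lines in the log above: as far as I understand the output, each `1` marks a delay step where the readback passed, so a long run of `1`s is a wide valid window. The sketch below finds the widest run in one scan string. It is a simplified illustration and not the BIOS leveling code; the function name `widest_window` is hypothetical.

```python
def widest_window(scan):
    """Return (start, length) of the longest run of '1' in scan, or None."""
    best = None
    start = None
    for i, c in enumerate(scan + "0"):   # trailing sentinel closes a run at the end
        if c == "1":
            if start is None:
                start = i                # a new run begins here
        elif start is not None:
            length = i - start
            if best is None or length > best[1]:
                best = (start, length)
            start = None
    return best

# Strings built to resemble a read leveling line and an all-zero scan.
print(widest_window("0" * 4 + "1" * 16 + "0" * 12))
print(widest_window("0" * 32))
```

An all-zero scan returns `None`, which corresponds to the case where no usable delay is found.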
I used the latest code to build LiteX for the ZCU104 board, but I got the following error during `memtest`:
Versions:
Build cmd:
python litex-boards/litex_boards/targets/zcu104.py --cpu-type vexriscv --build
I did some research and found out that when I reverted just this commit, `memtest` started to work.