Hi ^^ @trabucayre
Out of curiosity, what was the frequency of the scope_clk?
Hi, around 100 MHz (the gtkwave time scale is wrong :) )
@Dolu1990: This behaviour was in fact related to an issue in Quartus 24.1; Quartus 24.2 fixes it. @trabucayre is now looking at bridging the LiteX 32-bit AXI interface to the 256-bit AXI interface of the DRAM controller and will share updates here.
litex> mem_write 0x80000000 0xcafebabe
litex> mem_write 0x80000008 0xAABBCCDD
litex> mem_read 0x80000008
Memory dump:
0x80000008 be ba fe ca ....
litex> mem_read 0x80000010
Memory dump:
0x80000010 be ba fe ca ....
The write sequence looks fine:
But the read operation seems to always access the first word.
Same bitstream but with:
litex> mem_read 0x80000040
Memory dump:
0x80000040 62 5b 5b db b[[.
The address seems fine, but the issue is related to the 32b <-> 256b word select.
Thanks @trabucayre, the accesses on the 256-bit bus seem fine and the issue indeed seems related to the 32-bit data selection from the returned 256-bit word. It would be worth seeing if this behaves similarly in simulation (just return a dummy 256-bit value and check that the selection is correct).
@trabucayre: The default values of the adapter will probably have to be adjusted: https://github.com/enjoy-digital/litex_verilog_axi_test/blob/master/verilog_axi/axi/axi_adapter.py#L21-L23, especially: convert_narrow_burst
It could be useful to study the code here: https://github.com/alexforencich/verilog-axi/blob/25912d48fec2abbf3565bbefe402c1cff99fe470/rtl/axi_adapter_rd.v
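For illustration, here is a minimal sketch of how the adapter could be instantiated with narrow-burst conversion enabled. The import path, the AXIAdapter signature and the keyword names are assumptions based on the linked axi_adapter.py, not a verified API:

from litex.soc.interconnect.axi import AXIInterface
from verilog_axi.axi.axi_adapter import AXIAdapter  # assumed import path.

# platform: the LiteX Platform of the target SoC (assumed available here).
s_axi = AXIInterface(data_width=32,  address_width=32, id_width=8)  # LiteX side.
m_axi = AXIInterface(data_width=256, address_width=32, id_width=8)  # DRAM controller side.
adapter = AXIAdapter(platform, s_axi, m_axi,
    convert_burst        = True,
    convert_narrow_burst = True,  # likely False by default; worth trying True here.
)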
Same behavior. Here is a more detailed test to give more information:
litex> mem_write 0x80000000 0xdeadbeef
litex> mem_write 0x80000004 0xaabbccdd
litex> mem_write 0x80000008 0xcafebabe
litex> mem_write 0x8000000c 0x12345678
litex> mem_read 0x80000000
Memory dump:
0x80000000 ef be ad de ....
litex> mem_read 0x80000004
Memory dump:
0x80000004 dd cc bb aa ....
litex> mem_read 0x80000008
Memory dump:
0x80000008 ef be ad de ....
litex> mem_read 0x8000000c
Memory dump:
0x8000000c dd cc bb aa ....
litex> mem_read 0x80000008
Memory dump:
0x80000008 ef be ad de ....
litex> mem_read 0x8000000c
Memory dump:
0x8000000c dd cc bb aa ....
litex> mem_read 0x8000000c 32
Memory dump:
0x8000000c dd cc bb aa ef be ad de dd cc bb aa ef be ad de ................
0x8000001c dd cc bb aa 01 60 9b 6d 03 b0 ed b6 01 60 9b 6d .....`.m.....`.m
It looks like a mask/shift issue when addressing data above 64 bits within a 256-bit beat: 0x0 and 0x8 return the same value, and 0x4 and 0xc return the same value (but r.data is correctly filled).
I have already started studying the code, and with this behavior it should be easier to focus on a few specific parts.
Note: looking at the LPDDR latency in the read traces, it seems to be ~60 cycles, which at 100 MHz means 600 ns => that is a lot. For reference, on Arty A7 @ 100 MHz (with the LiteDRAM controller @ 100 MHz + 800 MT/s DDR3) it is between 15-25 cycles of latency.
At which frequency is the LPDDR controller running? Maybe you just got unlucky with your trace and captured it right at the moment it was doing a DDR refresh XD
@trabucayre: If you think this could be an issue in the Verilog AXI Adapter, it could also be worth doing 32 <-> 64 <-> 128 <-> 256 adaptations and see if it behaves differently. (So with 3 adapters).
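As a sketch of that suggestion (same assumed wrapper API as above, so purely illustrative), the 32 <-> 64 <-> 128 <-> 256 cascade could look like:

from litex.soc.interconnect.axi import AXIInterface
from verilog_axi.axi.axi_adapter import AXIAdapter  # assumed import path.

widths = [32, 64, 128, 256]
axis   = [AXIInterface(data_width=w, address_width=32, id_width=8) for w in widths]
# Three adapters, each one doubling the data width of the previous stage.
adapters = [AXIAdapter(platform, s, m, convert_narrow_burst=True)
            for s, m in zip(axis[:-1], axis[1:])]
# axis[0] is the 32-bit LiteX-side port, axis[-1] the 256-bit DRAM controller port.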
@enjoy-digital I will test. This screenshot shows a simulation with hardcoded data and the signals used to decode and shift the data.
By reading axi_adapter_rd.v: the read data is shifted (>>) by addr_reg[4:2]*32 bits. With a case statement to manually select a word according to the addr_reg slice, the sim shows the correct sequence. Maybe something is wrong with the >> operator.
Edit: This fix works with agilex5 too.
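For reference, a minimal Migen sketch (illustrative names, not the ones from axi_adapter_rd.v) of the case-based selection of one 32-bit word out of a 256-bit read beat, i.e. the manual decoding used instead of the wide >> shift:

from migen import Module, Signal, Case

class RData32From256(Module):
    def __init__(self):
        self.rdata_256 = Signal(256)
        self.word_sel  = Signal(3)   # corresponds to addr_reg[4:2].
        self.rdata_32  = Signal(32)
        # Explicit 8-way mux instead of "rdata_256 >> (word_sel * 32)".
        self.comb += Case(self.word_sel, {
            i: self.rdata_32.eq(self.rdata_256[32*i:32*(i+1)]) for i in range(8)
        })

Feeding it a dummy 256-bit value in simulation, as suggested above, would confirm whether the selection itself is correct.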
With manual decoding or by cascading AXIInterface:
@trabucayre: Great, thanks! Now that we have a first version working, refining it will be easier. It's possible the verilog_axi modules are less tested on Intel FPGAs than on Xilinx FPGAs.
@trabucayre: Please also commit the modified adapter in case it could be useful later.
Thanks! BTW, if this works, latency will probably be reduced with it, so we could maybe switch to it for the Linux bitstreams.
verilog_axi_rd_decode_32b_256b.patch
It seems to have no effect on the memspeed results.
This is now working, we can close.
IDLE (arvalid always low)
lpddr_idle.vcd.zip
Start read (arvalid high during one clock cycle)
lpddr_start_read.vcd.zip
Read end (rvalid goes low, rlast goes high)
lpddr_last_valid_8192.vcd.zip