Closed: mohammadhgh closed this issue 2 years ago
Hello!
Thanks a lot for the detailed report. I will try to investigate this issue soon. Sorry, I haven't tested the 16-bit and 64-bit modes enough. I hope the 32-bit AXI bus width mode works stably for you. Rest assured that AXI 32-bit mode will not limit memory throughput, according to your system specs:
(AXI width x AXI clock) >= (HBMC width x HBMC clock x DDR)
32bit x 100MHz >= 8bit x 100MHz x 2
Rest assured that AXI 32-bit mode will not limit memory throughput, according to your system specs:
(AXI width x AXI clock) >= (HBMC width x HBMC clock x DDR)
32bit x 100MHz >= 8bit x 100MHz x 2
Thank you very much for your reply. You are right, I was missing a very important point. I changed the clock of the AXI port of OpenHBMC to 50 MHz and everything seems to be OK. I will change the HyperRAM part to reach higher frequencies in future versions of my board.
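As a quick sanity check of the inequality at the new clock (assuming the HyperBus side stays at 8 bit, 100 MHz, DDR):
32bit x 50MHz = 1600 Mbit/s >= 8bit x 100MHz x 2 = 1600 Mbit/s
So the 32-bit AXI port at 50 MHz exactly matches the memory-side peak throughput, with no margin to spare.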
@OVGN Hello,
Last time I changed the MicroBlaze and OpenHBMC clock to 50 MHz and the memory test program was OK. However, my own application still has a problem. MicroBlaze stalls on a specific instruction. I checked the OpenHBMC AXI port and saw that there is a write transaction that never completes. Here is the ILA data for the OpenHBMC AXI port:
As you can see, AWVALID is asserted but AWREADY stays zero forever. This causes an overflow in the data cache port.
I was using OpenHBMC v1.1 with my application successfully for several months (I am not sure if I had a successful run with v2.0 or not) so I will try to find out what has changed and caused this problem.
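For reference, a stall like this can be caught with an ILA trigger driven by a small watchdog on the AW channel. A minimal sketch, with illustrative module and signal names (not part of OpenHBMC):

```verilog
// Hypothetical watchdog: flags an AW handshake that stays
// un-accepted for TIMEOUT consecutive cycles. The aw_stuck
// output can be wired to an ILA probe and used as a trigger.
module axi_aw_watchdog #(
    parameter TIMEOUT = 1024
) (
    input  wire clk,
    input  wire rst,
    input  wire awvalid,
    input  wire awready,
    output reg  aw_stuck
);
    reg [15:0] cnt;

    always @(posedge clk) begin
        if (rst || !awvalid || awready) begin
            // No pending un-accepted address: clear the counter.
            cnt      <= 16'd0;
            aw_stuck <= 1'b0;
        end else begin
            // AWVALID is high while AWREADY stays low: stalled.
            cnt <= cnt + 16'd1;
            if (cnt == TIMEOUT - 1)
                aw_stuck <= 1'b1;
        end
    end
endmodule
```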
Hello, @mohammadhgh
This is quite strange. Such kinds of stalls should never happen. I need more details to figure out what happened. Your screenshot is good, but not enough. Could you please catch this issue on the ILA again and export a .vcd file? Then I will be able to see and analyze all the signals on your waveform.
At least I see one strange thing on your waveforms. Please upload the VCD file.
@OVGN
Hi,
These are two ILA files for the picture I uploaded previously. One is for the AXI Interconnect slave side, which is connected to the MicroBlaze I-Cache and D-Cache and also the MicroBlaze trace port; the other ILA file is for the AXI Interconnect master side, which is connected to OpenHBMC.
Hi,
Thanks a lot for the ILA files. I have found the issue. This is a very stupid bug. I will try to fix it today.
Hello,
I have fixed the AXI stuck issue. Please update your IP and let me know if it works for you.
Concerning the 16-bit and 64-bit modes: they are still untested. I am going to test them in a few days, so please continue using the AXI 32-bit data width mode.
@OVGN
Hello,
Thank you very much for the changes. I tested the new OpenHBMC in fresh Vivado and Vitis test projects containing only a MicroBlaze subsystem and OpenHBMC. In this project, Vitis's memory test program passes and there is no problem.
However, I still have memory issues in my main application project. For some specific cache sizes or cache line lengths, the memory test program fails. I checked the AXI port of OpenHBMC to see what the problem is, and this is the result. This is the first word write in Vitis's memory test program, which writes 0x00000001 to address 0x000000 and completes successfully. But the first word read back fails, as the data read back is 0x00008001:
I increased the MicroBlaze cache size and cache line length, and this time the memory test program passes. This is the ILA result:
The only difference that I see between these two situations is that the AXI ID width has changed from 3 to 1. Do you have any idea about this?
Also, because I am using Vivado 2021.2, I upgraded the FIFO IP cores inside OpenHBMC, but it didn't change the result. iladata.zip is also attached.
Hi, @mohammadhgh
Many thanks again for detailed reports.
I have analyzed the diagrams and ran different tests with various cache line lengths and sizes. Finally, I could catch the wrong-read failure. In fact, there is no relation between cache line length, cache size, and incorrect read data. It is just some specific design placement that causes the error. This is quite unexpected... Nevertheless, I have fixed the IP core primitive locations to stably reproduce this bug, with some internal signals connected to an ILA. Investigations are in progress. I hope to find the root of the problem tomorrow.
Hi, @mohammadhgh
I have some results. It looks like this is not a design logic issue; in fact, it is a timing issue.
I was running tests on the mb_dual_ram design project. I noticed that I had forgotten to declare a system clock frequency constraint.
There were a lot of timing warnings like no_clock, unconstrained_internal_endpoints, no_input_delay, and no_output_delay.
Most of the violations can be ignored, as these are CDC synchronizer or input/output delay warnings that are already resolved by design. But I decided to add a constraints file for the IP to resolve all warnings.
Any other timing violations are critical and must be fixed by the IP user. As you are using an ILA, I strongly recommend setting this option for all your ILA cores: Input Pipe Stages = 1. It will not affect the captured ILA data, but it will help a lot to relax timing.
In general, I think that your design fails due to timing violations in the IP core. You probably haven't noticed them, as there are quite a lot of warnings, because I hadn't added a constraints file. I'm going to fix this very soon, probably tomorrow.
Hi @OVGN
Thank you very much for your work and fast replies.
I now have a build in which there is no problem. I fixed the placement of the OpenHBMC module and everything is good for now, so I think you are right: it does look like a timing issue.
I checked the OpenHBMC synthesis reports and found these critical warnings about the FIFO IP cores inside OpenHBMC. I then upgraded the FIFO IP cores, but the warnings were still there, so I just ignored them!
[Designutils 20-1280] Could not find module 'fifo_18b_18b_512w'. The XDC file /..../fifo_18b_18b_512w.xdc will not be read for any cell of this module.
Also, I remember there was an XDC file in OpenHBMC v1.1 which, I noticed, is omitted in the current version. With v1.1, I always had negative slack of around 0.7 ns with a 100 MHz HyperBus clock. The problem is that on our board the RWDS is not connected to a clock-capable pin, and the speed grade of the FPGA is -1. We are planning to change the board, and on the new board we will connect RWDS to an MRCC or SRCC pin.
Hello, @mohammadhgh
I checked the OpenHBMC synthesis reports and found these critical warnings about the FIFO IP cores inside OpenHBMC. I then upgraded the FIFO IP cores, but the warnings were still there, so I just ignored them!
I don't know why, but Vivado cannot understand that fifo_18b_18b_512w is not used due to the selected parameters, so there is no need to apply the .xdc file of an unused module. I'm going to replace the Xilinx FIFO IPs with custom ones to make the design more flexible and remove these annoying errors. Yes, please ignore these warnings for a while.
Also, I remember there was an XDC file in OpenHBMC v1.1 which, I noticed, is omitted in the current version. With v1.1, I always had negative slack of around 0.7 ns with a 100 MHz HyperBus clock.
Right, I removed that XDC file, as it was incorrect. Now I'm working on a new one, adding all the needed constraints for OpenHBMC, and I will bring a correct XDC file back. I just need a bit more time to finish this...
The problem is that on our board the RWDS is not connected to a clock-capable pin, and the speed grade of the FPGA is -1. We are planning to change the board, and on the new board we will connect RWDS to an MRCC or SRCC pin.
In a common HyperBus IP design, RWDS is used to sample the data bus. In that case you are right: RWDS should be connected to a clock-capable pin. As RWDS is guaranteed to be edge-aligned to data, it must be delayed to shift it to the center of the data bit. The flaw of this scheme is that a calibration procedure is needed to set the RWDS shift. Also, at low frequencies, even 100 MHz, it is quite hard to delay RWDS by 5 ns: a single IDELAY primitive can delay a signal by 2.5 ns at most, though IDELAY cascading can probably help there. Additionally, the HyperBus tCKDS and tCKD timing values can theoretically vary with the temperature of the memory part, so periodic RWDS recalibration would probably be needed for reliable operation.
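For illustration, the delay stage of that conventional scheme on a 7-series device might look roughly like this. This is only a sketch (not OpenHBMC code); it assumes an IDELAYCTRL elsewhere in the design with a 200 MHz reference clock, and the port names are those of the real IDELAYE2 primitive:

```verilog
// Sketch of the conventional RWDS-delay scheme (NOT the OpenHBMC approach).
// With REFCLK_FREQUENCY = 200.0 one tap is ~78 ps, so even all 31 taps
// give only ~2.4 ns, i.e. the ~2.5 ns limit mentioned above.
module rwds_delay_sketch (
    input  wire rwds_in,      // RWDS from the input buffer
    output wire rwds_delayed  // delayed RWDS, used as a data capture strobe
);
    IDELAYE2 #(
        .IDELAY_TYPE      ("FIXED"),
        .DELAY_SRC        ("IDATAIN"),
        .IDELAY_VALUE     (31),     // fixed tap count, ~2.4 ns
        .REFCLK_FREQUENCY (200.0),
        .SIGNAL_PATTERN   ("DATA")
    ) rwds_idelay_inst (
        .IDATAIN     (rwds_in),
        .DATAOUT     (rwds_delayed),
        .C           (1'b0),
        .CE          (1'b0),
        .INC         (1'b0),
        .LD          (1'b0),
        .LDPIPEEN    (1'b0),
        .REGRST      (1'b0),
        .CINVCTRL    (1'b0),
        .CNTVALUEIN  (5'b00000),
        .DATAIN      (1'b0),
        .CNTVALUEOUT ()
    );
endmodule
```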
OpenHBMC data reception logic is designed in a completely different way: RWDS is not used to sample the data bus. Instead, RWDS is oversampled by an x6 clock (x3 in DDR) along with the data bus. There are no special IO placement requirements for RWDS; it can be connected to a clock-capable or a common FPGA pin. The oversampled data and RWDS then go to the DRU (data recovery unit), which detects RWDS rising and falling edges and selects the right data samples to recover the data. No calibration of any kind is needed with this scheme. The DRU FSM covers all possible conditions and should always be able to recover the data.
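A greatly simplified sketch of that oversampling idea is below. The real DRU works on wider serialized samples and uses an FSM; all names here are illustrative:

```verilog
// Toy data recovery sketch: RWDS and the data bus are both sampled
// with a fast internal clock; an RWDS edge marks a new data word.
// A real DRU would pick a sample near the middle of the data eye,
// not the one at the detected edge.
module dru_sketch (
    input  wire       clk_x6,     // oversampling clock
    input  wire       rwds_in,    // raw RWDS sample from the pad
    input  wire [7:0] dq_in,      // raw data bus sample
    output reg  [7:0] data_out,
    output reg        data_valid
);
    reg rwds_d;

    always @(posedge clk_x6) begin
        rwds_d     <= rwds_in;
        data_valid <= 1'b0;
        // In DDR mode data toggles on every RWDS edge,
        // so any edge means a fresh word on the bus.
        if (rwds_in != rwds_d) begin
            data_out   <= dq_in;
            data_valid <= 1'b1;
        end
    end
endmodule
```

The key design point is that data capture is referenced to RWDS edges detected in the fast clock domain, so no per-board delay calibration is required.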
In general, if you can connect RWDS to an MRCC/SRCC pin, do it. This will probably be mandatory for some HyperRAM memory controller IPs, but OpenHBMC doesn't need it at all.
Concerning performance: I'm using the commercial lowest speed grade XC7S50-1CSGA324C with a W956D8MBYA5I.
I have a stable working project, mb_single_ram, configured to run the W956D8MBYA5I part at its maximum possible frequency of 200 MHz.
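For scale, by the throughput formula earlier in the thread, that configuration gives a raw HyperBus throughput of:
8bit x 200MHz x 2 = 3200 Mbit/s, i.e. 400 MB/s peak.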
Hi, @mohammadhgh
I have released rev.83. Among other improvements, I have finally added a constraints file, i.e. no more timing-critical warnings.
Hi @OVGN
Sorry, I didn't have access to my system for a few weeks to test your new revision. Now I have tested it and everything is OK. Thank you very much for your support.
@OVGN Hi, first I want to thank you for your great work and for making it open source, which is very valuable.
I saw a problem in my system using OpenHBMC and found a workaround for it, so I decided to report it here for others facing the same issue. Actually, the problem is still not fully clear to me and I don't know exactly what causes it, so I am just reporting my observations without drawing any conclusions.
I have three VDMAs and a MicroBlaze with I-Cache and D-Cache enabled, connected to OpenHBMC through the AXI Interconnect IP (Vivado's default suggestion is AXI SmartConnect, but it uses a lot of resources!). This is part of my system in Vivado:
The MicroBlaze is configured with an 8 KB D-Cache and an 8 KB I-Cache, and each cache is configured with a line length of 16 for better performance. When the VDMAs' Memory Map Data Width parameter is configured automatically, which gives 64 bits, the memory test (template from Vitis with no changes) fails. This failure is not always the same: sometimes it fails only for the 32-bit test, sometimes for all tests, and sometimes the MicroBlaze stalls. However, when I change the VDMAs' Memory Map Data Width parameter manually and set it to 32 bits, everything is OK and the memory test passes. My system spec is this: