Closed qiweiii-git closed 1 year ago
There are two main checks that need to be done when issuing AXI operations. First is to check the size of the operation vs. the max burst size, and to cut the transfer at the max burst size if it is larger than the max burst size. The second is to check for 4K address boundary crossings. Bursts cannot cross 4k address boundaries, so if there is a crossing then the burst has to be split on that crossing, and you can potentially get a 4k address boundary crossing even when transferring 2 data bytes. The thing to note is that any burst 4k or larger must necessarily cross a 4k boundary, which renders the burst size check redundant as the 4k boundary check will also limit the burst size to 4k. You're never going to have a situation where the max burst size check will fail but the 4k check will pass - either the 4k check fails and the transfer is cut on a 4k boundary, or they both fail and the transfer is cut on a 4k boundary, in both cases the result is the same. Hence this simple optimization to explicitly skip the burst size check if the max burst size is 4k or larger, which saves some logic resources and improves timing performance without affecting functionality.
@alexforencich Thank you so much for your reply. If we have a data stream large than 4k, so we need to cut the data stream into multiple 4k transfer before it goes to axi_dma? And the s_axis_write_desc_len should not be large than 4k?
The idea is you specify the actual length of the transfer, and the DMA engine will split it up internally so that the individual AXI operations comply with all of the rules, both the max burst size and the 4k alignment. So no, you should not have to cut up your data stream before handing it off to the DMA engine.
Although, it's possible there is a bug somewhere - are you seeing it hang with transfers larger than 4KB?
@alexforencich
Yes, I am using 128 data width and AXI_MAX_BURST_LEN set to 256. axi_dma connects to axi_ram. s_axis_write_desc_len is 16384 bytes. state_reg is hanging in STATE_WRITE.
If I comment ' || AXI_MAX_BURST_SIZE >= 4096' out , is looks good.
if (op_word_count_reg <= AXI_MAX_BURST_SIZE - (addr_reg & OFFSET_MASK)/* || AXI_MAX_BURST_SIZE >= 4096*/) begin
See, that would have been useful to mention before, I didn't realize you might have run in to some sort of a bug. I'll take a look at it, 128/8*256 = 4096, so there should be no need to check for the max burst size as the 4k check will also restrict the burst size to 4096, but apparently something is screwed up somewhere.
@alexforencich Thank you!
So, I just tested it with what I think are the same settings as you (data width 128, max burst 256, transfer size 16384), and I cannot reproduce the issue - it doesn't get stuck. I wonder if there is some difference in interpretation of the HDL between the simulators in use. What simulator are you using, and can you give me a full waveform dump (not a screenshot) of the problematic run?
@alexforencich As you mentioned that you can't reproduce the issue, I downloaded the the latest commit and checked it works fine. The commit what I was using is committed at Nov. 2019, it do have problem, but after the commit as below(committed at Nov. 2020), the issue is gone. Rewrite 4K address boundary crossing checks Thank you so much for your help.
Ah yeah, some of these checks were definitely wrong in the past, hence why it is vital to always use the latest version of the code. Anyway, nice to know it works!
axi_dma_wr lines 465
if (op_word_count_reg <= AXI_MAX_BURST_SIZE - (addr_reg & OFFSET_MASK) || AXI_MAX_BURST_SIZE >= 4096) begin
// packet smaller than max burst size
https://github.com/alexforencich/verilog-axi/blob/38915fb5330cb8270b454afc0140a94489dc56db/rtl/axi_dma_wr.v#LL465C26-L465C26If AXI_MAX_BURST_LEN set to 256 and AXI_DATA_WIDTH set to 128, the AXI_MAX_BURST_SIZE will be 4096, it will always go to the 'if' and assume the packet smaller than max burst size, if we send a data stream large than the max burst size, the axi_dma will hanging. Did I miss some limitation of the axi_dma?