Xilinx / dma_ip_drivers

Xilinx QDMA IP Drivers
https://xilinx.github.io/dma_ip_drivers/
556 stars 413 forks source link

XDMA: End of packet has issues and needs more testing. #91

Open mpb27 opened 3 years ago

mpb27 commented 3 years ago

I ran a simple test on the new EOP code added by 2c01de2211bcf452e90cb65b2ee49fcaafda375c and the code fails with larger transfer sizes. Sometimes the reported number of transfered bytes is negative, other times the transfer is accidentally split in two.

Rather than posting the exact failures, I think it's probably best to just write a test bash script similar to the already present one but using the -e flag with longer transactions.

I've included a sample test script below. With sizes of 100kB+ the transfers fail quite often with too few bytes received, too many bytes received, or negative bytes received (very large value indicitive of overflow).

#!/bin/bash

tool_path=../tools
transferSize=${1:-4096}
transferMax=${2:-8192}
transferCount=${3:-1}

# Read max transfer size.
$tool_path/dma_from_device -v -e -d /dev/xdma0_c2h_0 -f data/output_temp.bin -s $transferMax -c $transferCount &

sleep 0.1

# Write smaller transfers.
$tool_path/dma_to_device -v -d /dev/xdma0_h2c_0 -f data/datafile_32M.bin -s $transferSize -c $transferCount &

wait

# Compare the last transfer.
cmp data/output_temp.bin data/datafile_32M.bin -n $transferSize

if [ ! $? == 0 ]; then
    echo "Error: The data written does not match the data that was read."
else
    echo "Info: Data check passed."
fi
mabl commented 2 weeks ago

I believe the problem is caused by transfers that are split up. In that case, all split up transfers get the EOP flag set - not only the very last one. I found the following patch to work well:

diff --git a/XDMA/linux-kernel/xdma/libxdma.c b/XDMA/linux-kernel/xdma/libxdma.c
index 1b8ec8aa391ba5d7d7d90b9c53a3ed45e4729c66..ad7a7663dae1a6edce67f11b7371bd373e3ece57 100644
--- a/XDMA/linux-kernel/xdma/libxdma.c
+++ b/XDMA/linux-kernel/xdma/libxdma.c
@@ -2961,6 +2961,7 @@ static int transfer_init(struct xdma_engine *engine,
    unsigned int desc_max = min_t(unsigned int,
                req->sw_desc_cnt - req->sw_desc_idx,
                engine->desc_max);
+   bool is_last_transfer = 0;
    int i = 0;
    int last = 0;
    u32 control;
@@ -2988,10 +2989,11 @@ static int transfer_init(struct xdma_engine *engine,
    xfer->desc_index = engine->desc_idx;

    /* Need to handle desc_used >= engine->desc_max */
-
    if ((engine->desc_idx + desc_max) >= engine->desc_max)
        desc_max = engine->desc_max - engine->desc_idx;

+    is_last_transfer = req->sw_desc_cnt - req->sw_desc_idx <= desc_max;
+
    transfer_desc_init(xfer, desc_max);

    dbg_sg("xfer= %p transfer->desc_bus = 0x%llx.\n",
@@ -3004,7 +3006,12 @@ static int transfer_init(struct xdma_engine *engine,
    last = desc_max - 1;
    /* stop engine, EOP for AXI ST, req IRQ on last descriptor */
    control = XDMA_DESC_STOPPED;
-   control |= XDMA_DESC_EOP;
+   if (is_last_transfer) {
+       dbg_sg("This is the last transfer.\n");
+       control |= XDMA_DESC_EOP;
+   } else {
+       dbg_sg("This is NOT the last transfer.\n");
+   }
    control |= XDMA_DESC_COMPLETED;
    xdma_desc_control_set(xfer->desc_virt + last, control);

@@ -3568,8 +3575,8 @@ ssize_t xdma_xfer_submit(void *dev_hndl, int channel, bool write, u64 ep_addr,
            xfer->sgt = sgt;
        }

-       dbg_tfr("xfer, %u, ep 0x%llx, done %lu, sg %u/%u.\n", xfer->len,
-           req->ep_addr, done, req->sw_desc_idx, req->sw_desc_cnt);
+       dbg_tfr("xfer, %u, ep 0x%llx, done %lu, sg %u/%u, nents %u.\n", xfer->len,
+           req->ep_addr, done, req->sw_desc_idx, req->sw_desc_cnt, nents);

 #ifdef __LIBXDMA_DEBUG__
        transfer_dump(xfer);