Xilinx / dma_ip_drivers

Xilinx QDMA IP Drivers
https://xilinx.github.io/dma_ip_drivers/

xdma_xfer_submit_nowait doesn't support large transfers #201

Open niclashedam opened 1 year ago

niclashedam commented 1 year ago

When a large asynchronous transfer is submitted with xdma_xfer_submit_nowait, XDMA ends up in a bad state. This happens because XDMA splits the transfer into multiple partitions when it needs more descriptors than the maximum defined by XDMA_ENGINE_XFER_MAX_DESC.

The partitioning happens on line 3891: https://github.com/Xilinx/dma_ip_drivers/blob/master/XDMA/linux-kernel/xdma/libxdma.c#L3891

Strangely, the number of partitions is hard-coded to 2: https://github.com/Xilinx/dma_ip_drivers/blob/master/XDMA/linux-kernel/xdma/libxdma.h#L480

When submitting a transfer larger than ~16 MB, bogus descriptors are forwarded to the engine because the descriptors are out of bounds, causing XDMA to hang. Changing the hard-coded number of partitions to more than two resolves the bogus descriptors but does not fix the underlying issue. The engine cannot handle that many descriptors, so the queue overflows. I cannot find the specific part of the code that causes this, but the engine hangs after processing the first two transfers.
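To make the ~16 MB threshold concrete, here is a rough sketch of the arithmetic. It is not driver code. It assumes XDMA_ENGINE_XFER_MAX_DESC is 2048 and that each descriptor covers one 4 KiB page; both values are assumptions chosen because they reproduce the ~16 MB limit with two partitions, and all names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; check libxdma.h for the real ones. */
#define ASSUMED_MAX_DESC_PER_XFER 2048u /* stand-in for XDMA_ENGINE_XFER_MAX_DESC */
#define ASSUMED_BYTES_PER_DESC    4096u /* assumes one 4 KiB page per descriptor */
#define FIXED_XFER_SLOTS          2u    /* hard-coded partition count from the issue */

/* Integer division rounded up. */
uint64_t div_round_up(uint64_t n, uint64_t d)
{
    assert(d != 0);
    return (n + d - 1) / d;
}

/* Descriptors needed for a transfer of len bytes. */
uint64_t descs_needed(uint64_t len)
{
    return div_round_up(len, ASSUMED_BYTES_PER_DESC);
}

/* Partitions needed when each one holds at most the assumed maximum. */
uint64_t partitions_needed(uint64_t len)
{
    return div_round_up(descs_needed(len), ASSUMED_MAX_DESC_PER_XFER);
}

/* Nonzero if the transfer fits in the fixed number of partition slots. */
int fits_in_fixed_slots(uint64_t len)
{
    return partitions_needed(len) <= FIXED_XFER_SLOTS;
}
```

Under these assumptions, a 16 MiB transfer needs exactly two partitions and still fits, while anything larger needs a third partition that has no slot, which would be consistent with the out-of-bounds descriptors I am seeing.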