bperez77 / xilinx_axidma

A zero-copy Linux driver and a userspace interface library for Xilinx's AXI DMA and VDMA IP blocks. These serve as bridges for communication between the processing system and FPGA programmable logic fabric, through one of the DMA ports on the Zynq processing system. Distributed under the MIT License.

DMA receive transaction timed out #20

Closed yaobaishen closed 7 years ago

yaobaishen commented 7 years ago

I have a block design that includes the AXI DMA IP, and I have verified it on a bare-metal system. I then followed the README to build the driver, device tree, and apps. The "axidma.ko" module can be inserted successfully, but the example apps all fail with "DMA receive transaction timed out". Has anybody run into this problem before?

Here is my device tree:

/ {
amba_pl: amba_pl {
    #address-cells = <1>;
    #size-cells = <1>;
    compatible = "simple-bus";
    ranges ;
    axidma_chrdev: axidma_chrdev@0 {
        compatible = "xlnx,axidma-chrdev";
        dmas = <&axi_dma_0 0 &axi_dma_0 1>;
        dma-names = "tx_channel", "rx_channel";
    };
    axidmatest_0: axidmatest@0{
        compatible = "xlnx,axi-dma-test-1.00.a";
        dmas = <&axi_dma_0 0 &axi_dma_0 1>;
        dma-names = "axidma0", "axidma1";
    };
    axi_dma_0: dma@40400000 {
        #dma-cells = <1>;
        compatible = "xlnx,axi-dma-1.00.a";
        interrupt-parent = <&intc>;
        interrupts = <0 29 4 0 30 4>;
        reg = <0x40400000 0x10000>;
        dma-mm2s-channel@40400000 {
            compatible = "xlnx,axi-dma-mm2s-channel";
            dma-channels = <1>;
            interrupts = <0 29 4>;
            xlnx,datawidth = <0x20>;
            xlnx,device-id = <0x0>;
        };
        dma-s2mm-channel@40400030 {
            compatible = "xlnx,axi-dma-s2mm-channel";
            dma-channels = <1>;
            interrupts = <0 30 4>;
            xlnx,datawidth = <0x20>;
            xlnx,device-id = <0x1>;
        };
    };
    axi_gpio_0: gpio@41200000 {
        #gpio-cells = <2>;
        compatible = "xlnx,xps-gpio-1.00.a";
        gpio-controller ;
        reg = <0x41200000 0x10000>;
        xlnx,all-inputs = <0x0>;
        xlnx,all-inputs-2 = <0x0>;
        xlnx,all-outputs = <0x0>;
        xlnx,all-outputs-2 = <0x0>;
        xlnx,dout-default = <0x00000000>;
        xlnx,dout-default-2 = <0x00000000>;
        xlnx,gpio-width = <0x1>;
        xlnx,gpio2-width = <0x20>;
        xlnx,interrupt-present = <0x0>;
        xlnx,is-dual = <0x0>;
        xlnx,tri-default = <0xFFFFFFFF>;
        xlnx,tri-default-2 = <0xFFFFFFFF>;
    };
};

};

And here is what I saw on the Linux command line:

root@zedboard-zynq7:~# cd xilinx_axidma/
root@zedboard-zynq7:~/xilinx_axidma# ls
axidma.ko axidma_display_image axidma_transfer axidma_benchmark axidma_plps
root@zedboard-zynq7:~/xilinx_axidma# insmod axidma.ko
axidma: axidma_dma.c: axidma_dma_init: 705: DMA: Found 1 transmit channels and 1 receive channels.
axidma: axidma_dma.c: axidma_dma_init: 707: VDMA: Found 0 transmit channels and 0 receive channels.
root@zedboard-zynq7:~/xilinx_axidma# ./axidma_benchmark
AXI DMA Benchmark Parameters:
Transmit Buffer Size: 7.91 Mb
Receive Buffer Size: 7.91 Mb
Number of DMA Transfers: 1000 transfers

Using transmit channel 0 and receive channel 1.
axidma: axidma_dma.c: axidma_start_transfer: 298: DMA receive transaction timed out.
xilinx-dma 40400000.dma: Cannot stop channel df5f81d0: 0
Failed to perform the AXI DMA read-write transfer: Timer expired

yaobaishen commented 7 years ago

BTW, axidma_transfer also fails with "DMA receive transaction timed out."

AXI DMA File Transfer Info:
Transmit Channel: 0
Receive Channel: 1
Input File Size: 0.03 Mb
Output File Size: 0.03 Mb

axidma: axidma_dma.c: axidma_start_transfer: 298: DMA receive transaction timed out.
Failed to perform the AXI DMA read-write transfer: Timer expired
DMA read write transaction failed.

bperez77 commented 7 years ago

Can I see your block design? A timeout error typically indicates that the interrupts from the AXI DMA IP are not correctly hooked up.

yaobaishen commented 7 years ago

system.pdf

yaobaishen commented 7 years ago

Sure, I have attached my block design. I would also be glad to provide the entire Vivado project; it's about 100 MB, and I don't know where to upload it.

yaobaishen commented 7 years ago

I have uploaded the full Vivado project to Dropbox. I hope it helps in finding the root cause, thanks.

https://www.dropbox.com/s/0mx77z54e4tci8v/CH05_AXI_DMA_MT9V034_HDMI_CEP2.zip?dl=0

bperez77 commented 7 years ago

Is the xlconstant_0 block you have connected to the tkeep signal of your AXI DMA a 0 value? If that's the case, that is likely the issue. Not sure how familiar you are with the AXI4-Stream protocol, but the tkeep signal is used to indicate which bytes of the tdata signal are valid (or which ones to "keep"). Unless you have an unaligned transfer, this value should always be all 1's. Additionally, the only time when tkeep can be anything other than all 1's is when tlast is asserted. Anything else will cause, at least in my experience, the AXI4-Stream protocol FSM to hang.
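As a rough illustration of the tkeep rule described above, here is a minimal C sketch of a per-beat check. The function name `tkeep_is_acceptable` and the requirement that a partial final beat keep a contiguous run of low-order bytes are assumptions made for this example; they are not taken from the driver or from the design in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if a beat's TKEEP value is acceptable.
 * Rule sketched here: TKEEP must be all ones on every beat, except that the
 * final beat (TLAST asserted) may keep only a non-empty, contiguous run of
 * low-order bytes (an assumption of this sketch).
 * bytes_per_beat is the TDATA width divided by 8 (1 to 8 supported here).
 */
bool tkeep_is_acceptable(uint8_t tkeep, bool tlast, unsigned bytes_per_beat)
{
    if (bytes_per_beat == 0 || bytes_per_beat > 8)
        return false;

    uint8_t all_ones = (uint8_t)((1u << bytes_per_beat) - 1u);

    /* Bits set beyond the bus width are never valid. */
    if ((tkeep & ~all_ones) != 0)
        return false;

    /* A fully valid beat is always fine. */
    if (tkeep == all_ones)
        return true;

    /* Partial beats are only tolerated on the last beat of the packet. */
    if (!tlast)
        return false;

    /* Accept 0b0001, 0b0011, 0b0111, ...: x & (x + 1) == 0 for such values. */
    return tkeep != 0 && ((tkeep & (tkeep + 1u)) == 0);
}
```

For a 32-bit stream (4 bytes per beat), a constant tkeep of 0 fails this check on every beat, while 0xF passes on every beat.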

bperez77 commented 7 years ago

Scratch that, I just checked in your design, and it's the correct value.

Nothing seems incorrect in your block design as far as I can tell. The timeout error typically indicates an error somewhere in the design; it simply means that the tlast signal is not being asserted in the receive (S2MM) side of the DMA. Have you been able to verify the design independently from my driver? I suspect that the issue likely lies in there.

bperez77 commented 7 years ago

One other odd thing I noticed is that the tlast port of the v_axi4s_vid_out_0 module is connected to the tlast signal of v_axi4s_vid_in_0. The transmit (MM2S) side of the AXI DMA has its tlast port dangling, but I think that this should actually be connected to the v_axi4s_vid_out_0 module's.

yaobaishen commented 7 years ago

Thanks for looking into the block design. Yes, I have verified the design on bare metal, with no Linux running on the PS, and the DMA works well. As you said, the tlast of v_axi4s_vid_out_0 should connect to the MM2S side of the AXI DMA, but since the v_axi4s_vid_out_0 IP is used to connect to the external HDMI display port, I think it's irrelevant to the driver issue? The block design I uploaded is somewhat complicated. To make things easier, I also tried the driver with a DMA_LOOP design, which follows this tutorial: http://www.fpgadeveloper.com/2014/08/using-the-axi-dma-in-vivado.html, but the "DMA receive transaction timed out" issue still exists.

yaobaishen commented 7 years ago

Here is the block design of the AXI DMA LOOP project

system.pdf

bperez77 commented 7 years ago

Yeah you're right, that shouldn't affect the S2MM DMA transaction anyway. This is new, usually if it works in the bare metal mode, it works with the driver. When you wrote the bare-metal code, did you do polling or interrupts for the DMA?

That loopback design looks fine as well, not sure why you'd get a timeout. I also noticed that you had an ILA core connected to the interrupt lines? Did you see those lines being asserted?

Just for completeness, which version of Vivado and Linux are you using? For Linux, I'm looking for the repository and exact commit that you're on.

yaobaishen commented 7 years ago

I am using Vivado 2015.4. I checked out Xilinx Linux at tag xilinx-v2015.4, and U-Boot at tag xilinx-v2015.4 as well. Both block designs I uploaded work well in bare-metal mode. As you can see, the mm2s_introut and s2mm_introut outputs of the AXI DMA IP are used; I looked into the bare-metal code and verified that it uses interrupt mode. Could you share which Vivado version and Linux kernel commit you are using?

xuchendev commented 7 years ago

I have the same problem. My Vivado version is 2016.4, and the Linux version is the 2016.4 branch, which is 4.6.0-xilinx.

yaobaishen commented 7 years ago

Though I still don't know the root cause, after switching my kernel, U-Boot, and device tree to v2016.3, the examples now run successfully. I am still using Vivado 2015.4, so it looks like this issue is related to the kernel version rather than to Vivado. Many thanks; I hope this test helps others who run different Xilinx kernel versions.

root@zedboard-zynq7:~/xilinx_axidma# ./axidma_benchmark
AXI DMA Benchmark Parameters:
Transmit Buffer Size: 7.91 Mb
Receive Buffer Size: 7.91 Mb
Number of DMA Transfers: 1000 transfers

Using transmit channel 0 and receive channel 1.
Warning: 99.95% of the receive buffer matches the initialization pattern.
This may mean that the receive buffer was not properly updated.
Single transfer test successfully completed!
Beginning performance analysis of the DMA engine.

DMA Timing Statistics:
Elapsed Time: 0.08 s
Transmit Throughput: 104316.99 Mb/s
Receive Throughput: 104316.99 Mb/s
Total Throughput: 208633.97 Mb/s

bperez77 commented 7 years ago

The driver should work on any 4.x version of Linux, and others have used the v2016.4 branch successfully. It looks like, while you're no longer getting a timeout error, the results you're getting are definitely not valid. Most of the receive buffer was not updated, which means that your S2MM transaction stopped early for some reason. That's also why the benchmark reports such a ridiculous throughput.
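To see why the reported throughput cannot be real, it can be compared against the ceiling of the stream interface. The following is a minimal C sketch under stated assumptions: the 32-bit width comes from xlnx,datawidth = <0x20> in the device tree above, but the 100 MHz stream clock is purely an assumed value (the thread does not state it), and the benchmark's "Mb" is treated as mebibytes. All function names are hypothetical.

```c
#include <stdbool.h>

/*
 * Upper bound for one stream direction in MiB/s, assuming the interface
 * moves at most one beat of width_bits per clock cycle.
 */
double stream_ceiling_mib_s(unsigned width_bits, double clk_hz)
{
    return (width_bits / 8.0) * clk_hz / (1024.0 * 1024.0);
}

/* Throughput implied by moving `transfers` buffers of buffer_mib MiB each. */
double measured_mib_s(double buffer_mib, unsigned transfers, double elapsed_s)
{
    return buffer_mib * transfers / elapsed_s;
}

/* A measurement above the physical ceiling means data was not really moved. */
bool throughput_is_plausible(double measured, double ceiling)
{
    return measured <= ceiling;
}
```

With a 32-bit stream at the assumed 100 MHz, the ceiling is about 381 MiB/s, whereas 1000 transfers of 7.91 MiB in 0.08 s would imply roughly 98,875 MiB/s. A figure that far above the ceiling is consistent with transfers returning without moving the data.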

If you check your dmesg output, I bet there's a message about an error with the DMA channel.

yaobaishen commented 7 years ago

Sorry that I didn't look at the logs carefully yesterday. Yes, the S2MM side still has a problem. Here is the dmesg log; I didn't find a DMA error in it.

dmesg.txt

I think the root cause may be my block design or the hardware, so that the DMA interrupt doesn't work as expected. But I still don't know why bare-metal mode works well.

yaobaishen commented 7 years ago

Hi Brandon,

I read issue #16 carefully and found that this warning has come up before:

Warning: 99.95% of the receive buffer matches the initialization pattern. This may mean that the receive buffer was not properly updated.

I am using a ZedBoard too, so maybe this issue is hardware related? I have also posted an update in #16.
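For readers unfamiliar with this kind of warning, here is a minimal C sketch of how a check like it could be computed: fill the receive buffer with a known byte before the transfer, then count how much of it is unchanged afterwards. This is an illustration only; the function name `fraction_unchanged` and the single-byte fill pattern are assumptions, and the benchmark's actual check may differ.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the fraction (0.0 to 1.0) of bytes in buf
 * that still equal the fill byte written before the DMA transfer started.
 * A value close to 1.0 suggests the receive side wrote little or nothing.
 */
double fraction_unchanged(const uint8_t *buf, size_t len, uint8_t fill)
{
    if (len == 0)
        return 0.0;

    size_t same = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == fill)
            same++;
    }
    return (double)same / (double)len;
}
```

A result of 0.9995 would correspond to the 99.95% figure in the warning. Note that legitimate data can match the fill pattern by coincidence, which is presumably why the message is a warning rather than an error.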

raffaelesury commented 7 years ago

I get that warning and the ridiculous speeds too. However, I tried the axidma_transfer example app with an IP that generates data I know: I edited the template code produced when you generate an AXI Stream IP from the create and edit new IP package tool so that it outputs a 4-byte ASCII word a number of times that I set with the slide switches. It transferred the data I expected, the number of times I expected. The only issue was the whole thing hanging when the IP stayed silent because it was not reset, but this shouldn't be relevant to running the benchmark app with a looped-back DMA core. So I ignored the benchmark app.

yaobaishen commented 7 years ago

After several days of investigation, I think it was my mistake to use this driver with my custom PL design without any modification. The driver works with a DMA_LOOP design, but for my design the DMA src_addr, dst_addr, length, mode, and maybe other settings need to be modified to work with my PL. That's why I always see "transaction timed out" or "the receive buffer was not properly updated", which means the AXI_DMA IP is simply not working. I will try to modify this driver with reference to the bare-metal code, and I will update here if I make any progress.

yaobaishen commented 7 years ago

Hi Brandon,

I still have a problem with the xilinx_axidma driver, but I have found some clues:

1) The xilinx_axidma driver works fine with a DMA_LOOP PL design. I only checked with axidma_transfer, and a file was successfully transferred and received. Other test examples still have problems, but I ignored them.
2) The xilinx_axidma driver doesn't work with my PL design, which I uploaded at the beginning of this thread. I have checked the DMA registers, including the control register, status register, destination address, and transfer length registers, and I don't find any difference from 1).
3) I have confirmed that the root cause of 2) is that the DMA transfer never completes, so there is no IOC interrupt, the interrupt callback is never called, and the status register never indicates that the transfer completed.

Maybe the registers should be configured differently for my PL design? Or should something be done on the driver side when the PL changes from the DMA_LOOP design to my design? I am quite sure that a Zynq Linux driver has to be modified to match the PL settings, but I didn't expect this to be so hard.

Besides, I've attached my latest device tree.

/ {
axidma_chrdev: axidma_chrdev@0 {
    compatible = "xlnx,axidma-chrdev";
    dmas = <&axi_dma_0 0 &axi_dma_0 1>;
    dma-names = "tx_channel", "rx_channel";
};
axidmatest_0: axidmatest@0{
    compatible = "xlnx,axi-dma-test-1.00.a";
    dmas = <&axi_dma_0 0 &axi_dma_0 1>;
    dma-names = "axidma0", "axidma1";
};
amba_pl: amba_pl {
    #address-cells = <1>;
    #size-cells = <1>;
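When inspecting the status register by hand as described above, a small decoder can make the raw value easier to read. The following C sketch classifies a DMASR value; the bit positions are written from memory of the Xilinx AXI DMA product guide (PG021) and should be verified against the guide for your IP version. The enum and function names are hypothetical, and this is not code from the driver.

```c
#include <stdint.h>

/* AXI DMA status register (DMASR) bits, as recalled from PG021 (verify!). */
#define DMASR_HALTED      (1u << 0)   /* channel is halted */
#define DMASR_IDLE        (1u << 1)   /* channel is idle */
#define DMASR_INT_ERR     (1u << 4)   /* DMA internal error */
#define DMASR_SLV_ERR     (1u << 5)   /* DMA slave error */
#define DMASR_DEC_ERR     (1u << 6)   /* DMA decode error */
#define DMASR_SG_INT_ERR  (1u << 8)   /* scatter-gather internal error */
#define DMASR_SG_SLV_ERR  (1u << 9)   /* scatter-gather slave error */
#define DMASR_SG_DEC_ERR  (1u << 10)  /* scatter-gather decode error */
#define DMASR_IOC_IRQ     (1u << 12)  /* interrupt on complete */
#define DMASR_ERR_IRQ     (1u << 14)  /* interrupt on error */

#define DMASR_ANY_ERR (DMASR_INT_ERR | DMASR_SLV_ERR | DMASR_DEC_ERR | \
                       DMASR_SG_INT_ERR | DMASR_SG_SLV_ERR | \
                       DMASR_SG_DEC_ERR | DMASR_ERR_IRQ)

typedef enum {
    XFER_PENDING,   /* running, no completion seen yet */
    XFER_COMPLETE,  /* IOC interrupt bit is set */
    XFER_HALTED,    /* channel halted without error or completion */
    XFER_ERROR      /* at least one error bit is set */
} xfer_state_t;

/* Hypothetical helper: classify a raw DMASR value read from the IP. */
xfer_state_t classify_dmasr(uint32_t dmasr)
{
    if (dmasr & DMASR_ANY_ERR)
        return XFER_ERROR;
    if (dmasr & DMASR_IOC_IRQ)
        return XFER_COMPLETE;
    if (dmasr & DMASR_HALTED)
        return XFER_HALTED;
    return XFER_PENDING;
}
```

A transfer that never completes, as in clue 3, would keep returning XFER_PENDING in this sketch: no error bits and no IOC bit, which matches a stream that never delivers its final tlast beat.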
    compatible = "simple-bus";
    ranges ;
    SSD1306_OLED_ML_0: SSD1306_OLED_ML@43c00000 {
        compatible = "xlnx,SSD1306-OLED-ML-1.0";
        reg = <0x43c00000 0x10000>;
        xlnx,s00-axi-addr-width = <0x7>;
        xlnx,s00-axi-data-width = <0x20>;
    };
    axi_dma_0: dma@40400000 {
        #dma-cells = <1>;
        clock-names = "s_axi_lite_aclk", "m_axi_sg_aclk", "m_axi_mm2s_aclk", "m_axi_s2mm_aclk";
        clocks = <&clkc 15>, <&clkc 15>, <&clkc 15>, <&clkc 15>;
        compatible = "xlnx,axi-dma-1.00.a";
        interrupt-parent = <&intc>;
        interrupts = <0 29 4 0 30 4>;
        reg = <0x40400000 0x10000>;
        xlnx,addrwidth = <0x20>;
        dma-mm2s-channel@40400000 {
            compatible = "xlnx,axi-dma-mm2s-channel";
            dma-channels = <0x1>;
            interrupts = <0 29 4>;
            xlnx,datawidth = <0x20>;
            xlnx,device-id = <0x0>;
        };
        dma-s2mm-channel@40400030 {
            compatible = "xlnx,axi-dma-s2mm-channel";
            dma-channels = <0x1>;
            interrupts = <0 30 4>;
            xlnx,datawidth = <0x20>;
            xlnx,device-id = <0x1>;
        };
    };
};

};

yaobaishen commented 7 years ago

Closing the issue now. The xilinx_axidma driver works well; it is the developer's job, mine in this case, to customize the driver for the PL design. One suggestion is to extend the driver to support multiple DMA channels, e.g. 2 RX and 2 TX channels. Thanks again, bperez77!

bperez77 commented 7 years ago

Sorry for the late reply, haven't been active. Yeah unfortunately, MDMA is not supported by the driver at the moment, though it's on the to-do list. Did that end up being the issue?

So I guess a little clarification about my driver is in order. The driver doesn't actually perform any of the MMIO writes to the IP registers. Throughout the documentation I often refer to a "backend" driver. This is the driver that is actively maintained by Xilinx (you can find it here). That driver actually performs all the low-level operations with the hardware (MMIO, interrupt handling, etc.).

Unfortunately, the way they wrote their driver only allows other drivers, or kernel-space code to use its functionality. There is no support for a userspace application to directly use the interface. This is where my driver comes in. It provides a userspace interface (with a library sitting on top of that) to allow applications to use the backend AXI DMA driver.

When I first wrote the driver, MDMA wasn't supported by the Xilinx backend driver, which is why I didn't bother including it.

yaobaishen commented 7 years ago

The root cause of my issue was a problem in my PL design: no data was being sent to the AXI_DMA IP, so there was no completion interrupt. I asked about MDMA because there are actually two AXI_DMA IPs in my PL design, and I have to use two drivers to operate them separately.

bperez77 commented 7 years ago

I see. When I refer to MDMA, I mean multi-channel DMA, which is when 2 or more IPs share a single AXI DMA IP. The case you're talking about is supported by the driver, but you have to make sure to set up the device tree entry for the second AXI DMA correctly.

It needs to have a unique channel ID, and the correct reg and interrupts properties. I should be able to help if you show me the device tree entries you created for the original design.

yaobaishen commented 7 years ago

I haven't been active on the Linux driver these days. Here is my device tree for the two AXI_DMA IPs:

    axi_dma_0: dma@40400000 {
        #dma-cells = <1>;
        clock-names = "s_axi_lite_aclk", "m_axi_sg_aclk", "m_axi_mm2s_aclk", "m_axi_s2mm_aclk";
        clocks = <&clkc 15>, <&clkc 15>, <&clkc 15>, <&clkc 15>;
        compatible = "xlnx,axi-dma-1.00.a";
        interrupt-parent = <&intc>;
        interrupts = <0 29 4 0 30 4>;
        reg = <0x40400000 0x10000>;
        xlnx,addrwidth = <0x20>;
        dma-mm2s-channel@40400000 {
            compatible = "xlnx,axi-dma-mm2s-channel";
            dma-channels = <0x1>;
            interrupts = <0 29 4>;
            xlnx,datawidth = <0x20>;
            xlnx,device-id = <0x0>;
        };
        dma-s2mm-channel@40400030 {
            compatible = "xlnx,axi-dma-s2mm-channel";
            dma-channels = <0x1>;
            interrupts = <0 30 4>;
            xlnx,datawidth = <0x20>;
            xlnx,device-id = <0x1>;
        };
    };
    axi_dma_1: dma@40410000 {
        #dma-cells = <1>;
        clock-names = "s_axi_lite_aclk", "m_axi_sg_aclk", "m_axi_mm2s_aclk", "m_axi_s2mm_aclk";
        clocks = <&clkc 15>, <&clkc 15>, <&clkc 15>, <&clkc 15>;
        compatible = "xlnx,axi-dma-1.00.a";
        interrupt-parent = <&intc>;
        interrupts = <0 31 4 0 32 4>;
        reg = <0x40410000 0x10000>;
        xlnx,addrwidth = <0x20>;
        dma-mm2s-channel@40410000 {
            compatible = "xlnx,axi-dma-mm2s-channel";
            dma-channels = <0x1>;
            interrupts = <0 31 4>;
            xlnx,datawidth = <0x20>;
            xlnx,device-id = <0x0>;
        };
        dma-s2mm-channel@40410030 {
            compatible = "xlnx,axi-dma-s2mm-channel";
            dma-channels = <0x1>;
            interrupts = <0 32 4>;
            xlnx,datawidth = <0x20>;
            xlnx,device-id = <0x1>;
        };
    };

Because I misunderstood MDMA, I use two AXI_DMA drivers separately. It's not elegant, but it works.

bperez77 commented 6 years ago

Yeah, I can see the error here. Each channel child node in the device tree needs a unique xlnx,device-id property. In theory, the driver should print an error to the kernel log buffer in this case, but it looks like it missed it. You can change the IDs for the channels of the axi_dma_1 node to 2 and 3, and that should do the trick.
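The uniqueness requirement can be illustrated with a small C sketch that scans a list of channel IDs for a repeat. This is only an illustration of the rule; the function name `first_duplicate_id` is hypothetical and this is not the check the driver actually implements.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns the index of the first entry whose ID repeats
 * an earlier entry, or -1 if every ID in the list is unique. The quadratic
 * scan is fine for the handful of channels found in a device tree.
 */
int first_duplicate_id(const unsigned *ids, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        for (size_t j = 0; j < i; j++) {
            if (ids[i] == ids[j])
                return (int)i;
        }
    }
    return -1;
}
```

Listing the four channels in the posted device tree in order gives the IDs 0, 1, 0, 1, so the check flags index 2 (the MM2S channel of axi_dma_1). With the suggested IDs 0, 1, 2, 3 it returns -1.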

hawkroot-ua commented 6 years ago

I have the same error, and I am using a loopback design.

root@arm:/home/xilinx_axidma/examples# ./axidma_transfer 1.txt 2.txt
AXI DMA File Transfer Info:
Transmit Channel: 0
Receive Channelxilinx-vdma 40400000.dma: Channel 9da96710 has errors 400, cdr 2e060080 tdr 2e060080
: 1
Input File Size: 0.00 Mb
Output File Size: 0.00 Mb

axidma: axidma_dma.c: axidma_start_transfer: 298: DMA receive transaction timed out.
Failed to perform the AXI DMA read-write transfer: Timer expired
DMA read write transaction failed.

zhangzilin commented 5 years ago

The root cause of my issue is my PL design has problem, there isn't data sent to AXI_DMA IP so there is no complete interrupt. I am asking MDMA because there are actually two AXI_DMA IP in my PL design, and I have to use two drivers to operate them separately.

Can you share some of the mistakes in your PL design? Because I also encountered the same problem as you, and we used a similar PL design.

Thank you very much.

maikonadams commented 5 years ago

Hi, I also have an IP block between S_AXIS_S2MM and S_AXIS_MM2S. When I try to run axidma_benchmark I get:

maikon AXI DMA Benchmark Parameters:
Transmit Buffer Size: 7.91 MiB
Receive Buffer Size: 7.91 MiB
Number of DMA Transfers: 1000 transfers

Using transmit channel 0 and receive channel 1.
Failed to perform the AXI DMA read-write transfer: Device or resource busy

The dmesg output is:

cma: cma_alloc(cma c184d420, count 2025, align 8)
cma: cma_alloc(): returned effc1000
cma: cma_alloc(cma c184d420, count 2025, align 8)
cma: cma_alloc(): returned effd1000
axidma: axidma_dma.c: axidma_prep_transfer: 236: Unable to prepare the dma engine for the DMA transmit buffer.
cma: cma_release(page effd1000)
cma: cma_release(page effc1000)

I am probing the AXIS bus with System ILA and I do not see any transfer.