ikwzm / udmabuf

User space mappable dma buffer device driver for Linux.
BSD 2-Clause "Simplified" License

Strange memory errors #77

Closed Anthonywolfe closed 3 years ago

Anthonywolfe commented 3 years ago

Hi,

I've set up u-dma-buf to allocate two buffers on a Zybo-Z7-20 using PetaLinux and the device tree. I've expanded the CMA to 128 MB (cma: Reserved 128 MiB at 0x38000000). My device tree configuration is:

/ {
    model = "Zynq Zybo Z7 Development Board";
    compatible = "digilent,zynq-zyboz7", "xlnx,zynq-7000";

    chosen {
        bootargs = "console=ttyPS0,115200 earlyprintk CMA=128MB uio_pdrv_genirq.of_id=generic-uio root=/dev/mmcblk0p2 rw rootwait";
    };

        uio_axi_register@43C00000 {
              compatible = "generic-uio";
              reg = < 0x43C00000 0x1000 >;
        };

        uio_axi_dma@43C10000 {
              compatible = "generic-uio";
              reg = < 0x43C10000 0x1000 >;
        };

};

&amba {
    udmabuf@0x00 {
        compatible = "ikwzm,u-dma-buf";
        device-name = "udmabuf_rx";
        size = <0x3D09000>;
        sync-size = <0x3D09000>;
        sync-direction = <1>;
        sync-mode = <2>;
        dma-mask = <64>;
    };

    udmabuf@0x01 {
        compatible = "ikwzm,u-dma-buf";
        device-name = "udmabuf_tx";
        size = <0xF42400>;
        sync-size = <0xF42400>;
        sync-direction = <2>;
        sync-mode = <2>;
        dma-mask = <64>;
    };
};

This appears to work fine, as my boot log reports:

uio_pdrv_genirq 43c00000.uio_axi_register: IRQ index 0 not found
uio_pdrv_genirq 43c10000.uio_axi_dma: IRQ index 0 not found
u_dma_buf: loading out-of-tree module taints kernel.
u_dma_buf: loading out-of-tree module taints kernel.
u-dma-buf udmabuf_rx: driver version = 3.2.4
u-dma-buf udmabuf_rx: major number   = 244
u-dma-buf udmabuf_rx: minor number   = 0
u-dma-buf udmabuf_rx: phys address   = 0x38100000
u-dma-buf udmabuf_rx: buffer size    = 64000000
u-dma-buf amba:udmabuf@0x00: driver installed.
u-dma-buf udmabuf_tx: driver version = 3.2.4
u-dma-buf udmabuf_tx: major number   = 244
u-dma-buf udmabuf_tx: minor number   = 1
u-dma-buf udmabuf_tx: phys address   = 0x3bf00000
u-dma-buf udmabuf_tx: buffer size    = 16003072
u-dma-buf amba:udmabuf@0x01: driver installed.
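
As a sanity check on the numbers above, here is a minimal arithmetic sketch. It assumes 4 KiB pages and assumes the driver rounds each buffer up to a whole number of pages; the second assumption is only inferred from the logged tx size, not confirmed in this thread. All values are copied from the device tree and boot log.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def page_align(n, page=PAGE_SIZE):
    """Round n up to the next multiple of page."""
    return (n + page - 1) // page * page


# Values copied from the device tree and boot log above.
rx_size, rx_phys = 0x3D09000, 0x38100000
tx_size, tx_phys = 0xF42400, 0x3BF00000
cma_base, cma_bytes = 0x38000000, 128 * 1024 * 1024

rx_alloc = page_align(rx_size)
tx_alloc = page_align(tx_size)

print(rx_size, rx_alloc)
print(tx_size, tx_alloc)
print("rx ends before tx starts:", rx_phys + rx_alloc <= tx_phys)
print("tx ends inside CMA:", tx_phys + tx_alloc <= cma_base + cma_bytes)
```

Under these assumptions the requested sizes are 64,000,000 and 16,000,000 bytes, the page-aligned values match the logged buffer sizes, and the two buffers neither overlap nor run past the reserved CMA region. The allocation itself looks consistent.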

I'm using mmap to map "/dev/udmabuf_rx" and "/dev/udmabuf_tx". However, I've run into a strange issue when reading and writing the buffers. Sometimes when reading from or writing to the mapping I get a MemoryAccessViolation, and we've captured some writes to /dev/udmabuf_tx ending up on our axi_dma_uio, which is located at 0x43C10000. This makes me think that the virtual reads/writes may not be going to the correct physical locations.

[Image: AXI_Capture] Pictured is data being written to the DMA registers, captured from an ILA.

If you need any more info let me know and I'll try and provide it.

ikwzm commented 3 years ago

Thank you for the issue.

Is it possible to disclose the source code of the application where MemoryAccessViolation occurred?

Anthonywolfe commented 3 years ago

I've created an example application that experiences the same issues here: https://gist.github.com/Anthonywolfe/73da20856ec3faf83848383f3ee6448a

Some notes: we only see the descriptors being written to the wrong location after the 128th, and the AccessViolationException is a bit random. Sometimes it happens on the first run, other times on the fifth, and sometimes it doesn't happen at all.

ikwzm commented 3 years ago

Please tell me:

What exactly is the value of the _MapSize variable in Init()? And what is the value of Environment.SystemPageSize? Are these values at least as large as the capacity required for DMA, or are they too small?
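
One way to rule out a too-small mapping is to take the map length from the driver rather than from application code. The sketch below is not the code from the gist. It assumes the driver exposes the buffer size in a sysfs attribute such as /sys/class/u-dma-buf/udmabuf_tx/size, and the helper names `read_buffer_size` and `map_whole_buffer` are hypothetical.

```python
import mmap
import os


def read_buffer_size(size_path):
    """Read the buffer size in bytes from a text file (decimal or 0x-prefixed hex)."""
    with open(size_path) as f:
        return int(f.read().strip(), 0)


def map_whole_buffer(dev_path, size_path):
    """Map the entire buffer, using the size the driver reports as the map length."""
    size = read_buffer_size(size_path)
    fd = os.open(dev_path, os.O_RDWR)
    try:
        return mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        # Python's mmap keeps its own duplicate of the descriptor.
        os.close(fd)


# Hypothetical usage on the target board (paths are assumptions):
# buf = map_whole_buffer("/dev/udmabuf_tx",
#                        "/sys/class/u-dma-buf/udmabuf_tx/size")
# print(len(buf))
```

Comparing `len(buf)` with the number of bytes the application intends to touch answers the question directly: any access at or beyond that length falls outside the mapping.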

Anthonywolfe commented 3 years ago

Oh wow. For reference, Environment.SystemPageSize is an API around getpagesize, so it returned 4096.

Originally, when I wrote the MemoryMap abstraction, I built it around structs whose size I could get with sizeof. When I went to use the DMA buffers, I just used a byte as the type, which has a sizeof of 1, and the size was then aligned up to the page size. That masked my issue: I could write some descriptors before I exceeded the memory I had requested, and the same goes for reading data back.
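
The effect described here can be reproduced with plain arithmetic. This is a sketch of the sizing logic only, not the MemoryMap code from the gist; the 32-byte descriptor size and the helper names are assumptions chosen for illustration.

```python
PAGE_SIZE = 4096  # value returned by Environment.SystemPageSize in this thread
DESC_SIZE = 32    # hypothetical descriptor size, not taken from the gist


def page_align(n, page=PAGE_SIZE):
    """Round n up to the next multiple of page."""
    return (n + page - 1) // page * page


def mapped_length(element_size, count=1):
    """Map length when the request is element_size * count, rounded up to a page."""
    return page_align(element_size * count)


def descriptors_that_fit(map_len, desc_size=DESC_SIZE):
    """Number of whole descriptors that fit inside the mapping."""
    return map_len // desc_size


buggy = mapped_length(1)                    # sizeof(byte) == 1, a single element
fixed = mapped_length(1, count=16_000_000)  # the size requested for udmabuf_tx

print(buggy, descriptors_that_fit(buggy))
print(fixed, descriptors_that_fit(fixed))
```

With a one-byte element the mapping is a single page, so only the first few descriptors land inside it and everything after that touches whatever happens to sit next in the process's address space. With the assumed 32-byte descriptor, exactly 128 fit in one page, which would be consistent with the trouble starting after the 128th descriptor, although the real descriptor layout is not shown in this thread.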

Thank you so much for your time and pointing out the issue.

ikwzm commented 3 years ago

I'm glad that I could help you out.