ikwzm / udmabuf

User space mappable dma buffer device driver for Linux.
BSD 2-Clause "Simplified" License

zerocopy network TCP send with u-dma-buf #121

Open joshuafc opened 2 weeks ago

joshuafc commented 2 weeks ago

The MSG_ZEROCOPY flag offers a compelling solution for bypassing unnecessary memory copies during socket send calls. This feature is currently supported for TCP, UDP, and VSOCK (with virtio transport) sockets (see: https://www.kernel.org/doc/html/v6.10/networking/msg_zerocopy.html).

However, attempting to utilize MSG_ZEROCOPY with a buffer mapped using u-dma-buf results in a "Bad address" error. This limitation arises because u-dma-buf memory (marked with the VM_PFNMAP flag) lacks the essential metadata required by the kernel for zero-copy operations. This metadata typically takes the form of a struct page* associated with each page in the buffer (as discussed here: https://stackoverflow.com/questions/58627200/zero-copy-user-space-tcp-send-of-dma-mmap-coherent-mapped-memory).

Therefore, the question remains: Can we achieve efficient network transmission of DMA-written data without resorting to an extraneous memory copy?
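
For readers who want to reproduce this, the failing pattern can be sketched roughly as follows. This is only a minimal illustration, not code from the driver or from the original report: the helper names `map_device_buffer` and `zerocopy_send` are hypothetical, and `/dev/udmabuf0` is assumed as the device path. With a default (VM_PFNMAP) u-dma-buf mapping, the `send()` call is where "Bad address" (EFAULT) would be expected to appear.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of a device node such as
 * /dev/udmabuf0 (path is an assumption). Returns NULL on failure. */
void *map_device_buffer(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: enable SO_ZEROCOPY on a connected socket and send
 * `buf` with MSG_ZEROCOPY. Returns the number of bytes queued, or -errno.
 * A caller must still reap completion notifications from the socket error
 * queue (recvmsg with MSG_ERRQUEUE) before reusing the buffer. */
ssize_t zerocopy_send(int sock, const void *buf, size_t len)
{
    int one = 1;
    if (setsockopt(sock, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one)) != 0)
        return -errno;
    ssize_t n = send(sock, buf, len, MSG_ZEROCOPY);
    return n < 0 ? -errno : n;
}
```

A caller would map the buffer once, let the DMA engine fill it, and then pass the mapped pointer to `zerocopy_send` on a connected TCP socket. A return value of `-EFAULT` corresponds to the "Bad address" error described above.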

ikwzm commented 2 weeks ago

Thanks for the issue.

This issue is probably related to #117 as well. As I answered in #117, it is beyond my knowledge; I don't know much about the Linux kernel's VM subsystem.

I would like to leave this issue open for now.

joshuafc commented 2 weeks ago

We tested the technical approach outlined in #117 (https://stackoverflow.com/questions/43503747/write-from-mmapped-buffer-to-o-direct-output-file/73032605#73032605) using dma_alloc_contiguous and vm_insert_pages.

The TCP socket now accepts mmapped buffers for zero-copy sends.

However, the code is not robust and differs in places from the current implementation logic, so I will not submit a pull request for now.

ikwzm commented 2 weeks ago

I created a quirk-mmap-page mode as a trial. It is still under development and lives in a separate branch (quirk-mmap-page-develop). I would like to release it officially once I have finished verifying that it works, but that is not yet decided.

shell$ git clone --depth 1 --branch quirk-mmap-page-develop https://github.com/ikwzm/udmabuf.git u-dma-buf-quirk-mmap-page
shell$ cd u-dma-buf-quirk-mmap-page

To use it, either specify the module parameter quirk-mmap-mode=4 when loading u-dma-buf.ko with insmod, or add the quirk-mmap-page property to the device tree node.

shell$ sudo insmod u-dma-buf.ko udmabuf0=0x10000 info_enable=7 quirk-mmap-mode=4
[11149.924485] u-dma-buf: DEVICE_MAX_NUM=256,UDMABUF_DEBUG=1,USE_QUIRK_MMAP=1,USE_QUIRK_MMAP_PAGE=1,IS_DMA_COHERENT=1,USE_DEV_GROUPS=1,USE_OF_RESERVED_MEM=1,USE_OF_DMA_CONFIG=1,USE_DEV_PROPERTY=1,IN_KERNEL_FUNCTIONS=1
[11149.929935] u-dma-buf udmabuf0: driver version = 4.6.0-RC2
[11149.929955] u-dma-buf udmabuf0: major number   = 507
[11149.929960] u-dma-buf udmabuf0: minor number   = 0
[11149.929966] u-dma-buf udmabuf0: phys address   = 0x0000000041740000
[11149.929972] u-dma-buf udmabuf0: buffer size    = 65536
[11149.929979] u-dma-buf udmabuf0: dma device     = u-dma-buf.0
[11149.929984] u-dma-buf udmabuf0: dma bus        = platform
[11149.929989] u-dma-buf udmabuf0: dma coherent   = 0
[11149.929995] u-dma-buf udmabuf0: dma mask       = 0x00000000ffffffff
[11149.930001] u-dma-buf udmabuf0: iommu domain   = NONE
[11149.930006] u-dma-buf udmabuf0: quirk mmap     = 1
[11149.930011] u-dma-buf udmabuf0: pages          = 0xffffff88032c3980
[11149.930017] u-dma-buf udmabuf0: pages[0]       = 0xfffffffe0105d000
[11149.930023] u-dma-buf u-dma-buf.0: driver installed.
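
Before mapping the buffer from user space, it can help to confirm that the size the driver reports matches the log above. The sketch below reads a single numeric sysfs attribute. It assumes the attribute lives at `/sys/class/u-dma-buf/udmabuf0/size`, as I understand it from the project README, and the helper name `read_sysfs_ulong` is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one unsigned number from a sysfs-style text
 * file. Base 0 lets strtoul accept both decimal ("65536") and
 * hexadecimal ("0x41740000") values. Returns 0 on success, -errno on failure. */
int read_sysfs_ulong(const char *path, unsigned long *out)
{
    char text[64];
    char *end;
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -errno;
    if (fgets(text, sizeof(text), f) == NULL) {
        fclose(f);
        return -EIO;
    }
    fclose(f);
    errno = 0;
    *out = strtoul(text, &end, 0);
    if (errno != 0 || end == text)
        return -EINVAL; /* no digits found or value out of range */
    return 0;
}
```

With the module loaded as shown, calling `read_sysfs_ulong("/sys/class/u-dma-buf/udmabuf0/size", &size)` should, under that path assumption, yield the same value as the "buffer size" line in the log, which can then be passed as the length to mmap.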

Device tree example:

    udmabuf0 {
        compatible  = "ikwzm,u-dma-buf";
        device-name = "udmabuf0";
        size = <0x00100000>;
        dma-coherent;
        quirk-mmap-page;
    };