fengggli / dpdk

Data Plane Development Kit

external memory registration #1

Open fengggli opened 5 years ago

fengggli commented 5 years ago

Tracks how to register external memory with DPDK's rte_* APIs.

fengggli commented 5 years ago

rte_dev_dma_map

https://github.com/fengggli/dpdk/blob/07efd6ddc0499688eb11ae4866d3532295d6db2b/lib/librte_eal/common/eal_common_dev.c#L760-L792

rte_extmem_register

https://github.com/fengggli/dpdk/blob/07efd6ddc0499688eb11ae4866d3532295d6db2b/lib/librte_eal/common/eal_common_memory.c#L796-L843

unit test

https://github.com/fengggli/dpdk/blob/07efd6ddc0499688eb11ae4866d3532295d6db2b/app/test/test_external_mem.c#L394-L404
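
The two entry points above fit together roughly as follows. This is a minimal sketch against the DPDK 19.05 external-memory API (error handling and the device lookup are elided; the hugepage-backed buffer, the page geometry, and the `dev` pointer are assumptions, not code from the links):

```c
#include <sys/mman.h>
#include <rte_dev.h>
#include <rte_memory.h>

/* Assumed geometry: 16 x 2 MiB hugepages. */
size_t pgsz = 2 * 1024 * 1024;
size_t len  = 16 * pgsz;

/* Allocate the external area outside DPDK's own heaps. */
void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

/* 1. Register the area with EAL. Passing NULL for iova_addrs leaves the
 *    IOVAs as RTE_BAD_IOVA until the area is DMA-mapped. */
rte_extmem_register(addr, len, NULL, 0, pgsz);

/* 2. Map the area for DMA on each device that will touch it ("dev" is an
 *    assumed struct rte_device *; in VA mode iova can be the VA itself). */
rte_dev_dma_map(dev, addr, (uintptr_t)addr, len);
```

Teardown is the mirror sequence: `rte_dev_dma_unmap()` per device, then `rte_extmem_unregister()`.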

fengggli commented 5 years ago

Test output (following the instructions)

(base) fengggli@ribbit5(:):~/WorkSpace/dpdk905/dpdk-19.05$sudo ./build/app/testpmd -- --mp-alloc xmem
EAL: Detected 48 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:06:00.1 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:06:00.2 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:06:00.3 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
testpmd: No probed ethernet devices
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=523456, size=2176, socket=0
testpmd: Allocated 2172MB of external memory
testpmd: preferred mempool ops selected: ring_mp_mc
testpmd: create a new mbuf pool <mbuf_pool_socket_1>: n=523456, size=2176, socket=1
testpmd: Allocated 2172MB of external memory
testpmd: preferred mempool ops selected: ring_mp_mc
Done
No commandline core given, start packet forwarding
io packet forwarding - ports=0 - cores=0 - streams=0 - NUMA support enabled, MP allocation mode: xmem

  io packet forwarding packets/burst=32
  nb forwarding cores=1 - nb forwarding ports=0
Press enter to exit

Telling cores to stop...
Waiting for lcores to finish...

  +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
  RX-packets: 0              RX-dropped: 0             RX-total: 0
  TX-packets: 0              TX-dropped: 0             TX-total: 0
  ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Done.

Bye...
fengggli commented 5 years ago

I don't necessarily need DPDK at this stage; I could just run mcas/testcl.cc in the virtual machine instead (https://spdk.io/doc/env_8h.html#a0874731c44ac31e4b14d91c6844a87d1)

fengggli commented 5 years ago

I could use get_user_pages?(https://lwn.net/Articles/753027/, http://nuncaalaprimera.com/2014/using-hugepage-backed-buffers-in-linux-kernel-driver)

How pinned memory is managed in InfiniBand:

  1. ib_umem_odp_map_dma_pages

How the kernel handles MAP_HUGETLB:

  1. user application: https://elixir.bootlin.com/linux/v4.15.18/source/mm/gup.c#L1088
  2. in the kernel it is handled here (https://elixir.bootlin.com/linux/v4.15.18/source/mm/mmap.c#L1526); the second mmap (attach) in mcas is not an ANONYMOUS map, so it returns an error directly
  3. dax (https://elixir.bootlin.com/linux/v4.15.18/source/drivers/dax/device.c#L536)

  1. hugetlb_get_unmapped_area (https://elixir.bootlin.com/linux/v4.15.18/source/arch/x86/mm/hugetlbpage.c#L144)
  2. from the LDD3 book (previous edition; refer to the copy in my Zotero)
  3. Understanding the Linux Kernel also covers why page tables no longer need to be manipulated directly since 2.6 (p. 418)
fengggli commented 5 years ago

What I could do is add an ioctl in mcas (I couldn't use mmap here, since directly mmap'ing a non-hugepage file (a file that uses the exact same file_operations) results in an unsuccessful mmap: https://elixir.bootlin.com/linux/v4.15.18/source/mm/mmap.c#L1507)

fengggli commented 5 years ago


Figure out how the SPDK block abstraction interacts with libpmemblk (where is the iomem?). Steps:

  1. https://github.com/fengggli/spdk/blob/f85f7cb38ea02c854584ec8cc18239a48c5ca44f/lib/bdev/pmem/bdev_pmem.c#L200
  2. instead of using spdk_nvme_ns_cmd_write (as in blocknvme), bdev_pmem uses spdk_bdev_io; I saw it will call pmemblk_write(pbp, buf, blockno) (https://github.com/fengggli/spdk/blob/f85f7cb38ea02c854584ec8cc18239a48c5ca44f/lib/bdev/pmem/bdev_pmem.c#L78) to interact with PMDK (buf is just a virtual address, I think); then a memory copy is performed! (https://github.com/pmem/pmdk/blob/1e676ffffe4190c858dfc2e1eadfcea0cd1a3952/src/examples/libpmemobj/pmemblk/obj_pmemblk.c#L282)
fengggli commented 5 years ago

...



* The name is passed in src/lib/core/src/dpdk.cpp and then passed on to DPDK