usnistgov / ndn-dpdk

NDN-DPDK: High-Speed Named Data Networking Forwarder
https://www.nist.gov/publications/ndn-dpdk-ndn-forwarding-100-gbps-commodity-hardware

large cache space test #43

Closed YuanhWu closed 3 years ago

YuanhWu commented 3 years ago

I am testing the NDN-DPDK forwarder on our local testbed and want to try some scenarios with a large cache space, but I ran into a problem. My hugepage setup and forwarder activation configuration are as follows:

YuanhWu commented 3 years ago

Hugepages:

Node Pages Size Total
0    1024  2Mb    2Gb
0    100   1Gb    100Gb

Activate the forwarder in the container:

jq -n '{
  eal: {
    memPerNuma: { "0": 102400, "1": 0 }
  },
  lcoreAlloc: {
    RX: { "0": 2, "1": 0 },
    TX: { "0": 2, "1": 0 },
    CRYPTO: { "0": 1, "1": 0 },
    FWD: { "0": 1, "1": 0 },
    HRLOG: { "0": 1, "1": 0 },
    PDUMP: { "0": 1, "1": 0 }
  },
  mempool: {
    DIRECT: { capacity: 2097151, dataroom: 9128 },
    INDIRECT: { capacity: 1048575 }
  },
  pcct: {
    pcctCapacity: 1048575,
    csDirectCapacity: 524287,
    csIndirectCapacity: 524287
  }
}' | docker run -i --rm ndn-dpdk ndndpdk-ctrl --gqlserver $GQLFW activate-forwarder

Create a face:

FACEID=$(jq -n '{
  scheme: "memif",
  socketName: "/run/ndn/fileserver.sock",
  id: 0,
  role: "server",
  dataroom: 9000
}' | docker run -i --rm ndn-dpdk ndndpdk-ctrl --gqlserver $GQLFW create-face | tee /dev/stderr | jq -r .id)

Insert a FIB entry:

docker run -i --rm ndn-dpdk ndndpdk-ctrl --gqlserver $GQLFW insert-fib --name /ndnc/ft --nexthop $FACEID

The face creation failed after I started a consumer. The forwarder logs:

{"level":"info","ts":1635540777.7341838,"logger":"main","msg":"NDN-DPDK service starting","version":"v0.0.0-20211007141541-9160e6d38d82","uid":0,"linux":"5.4.0-87-generic","dpdk":"DPDK 21.08.0","spdk":"SPDK v21.07"}
{"level":"info","ts":1635540777.735129,"logger":"main","msg":"GraphQL HTTP server starting","listen":":3030"}
{"level":"info","ts":1635540778.8331172,"logger":"main","msg":"activate start","role":"forwarder"}
{"level":"warn","ts":1635540778.914111,"logger":"DPDK","msg":"EAL: No free 2048 kB hugepages reported on node 1"}
{"level":"info","ts":1635540778.9142141,"logger":"DPDK","msg":"EAL: 98 hugepages of size 1073741824 reserved, but no mounted hugetlbfs found for that size"}
{"level":"warn","ts":1635540778.9142444,"logger":"DPDK","msg":"EAL: No free 1048576 kB hugepages reported on node 1"}
{"level":"warn","ts":1635540779.24048,"logger":"DPDK.0","msg":"TELEMETRY: No legacy callbacks, legacy socket not created"}
{"level":"info","ts":1635540779.240517,"logger":"ealinit","msg":"EAL ready","args":"-l 0,1,2,3,4,5,6,7,8,9,10,11,24,25,26,27,28,29,30,31,32,33,34,35,12,13,14,15,16,17,18,19,20,21,22,23,36,37,38,39,40,41,42,43,44,45,46,47 --socket-limit 102400,1 --in-memory --single-file-segments -d /usr/local/lib/dpdk/pmds-21.3 --no-pci","main":0,"workers":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47],"sockets":[0,1]}
{"level":"info","ts":1635540779.2528126,"logger":"SPDK.0","msg":"SPDK logger ready"}
{"level":"info","ts":1635540779.5952082,"logger":"ealthread","msg":"lcores configured","role":"HRLOG","lc":[29]}
{"level":"info","ts":1635540779.5952559,"logger":"ealthread","msg":"lcores configured","role":"PDUMP","lc":[1]}
{"level":"info","ts":1635540779.5953686,"logger":"ealthread","msg":"lcores configured","role":"TX","lc":[24,33]}
{"level":"info","ts":1635540779.5953815,"logger":"ealthread","msg":"lcores configured","role":"CRYPTO","lc":[3]}
{"level":"info","ts":1635540779.5953872,"logger":"ealthread","msg":"lcores configured","role":"FWD","lc":[6]}
{"level":"info","ts":1635540779.595392,"logger":"ealthread","msg":"lcores configured","role":"RX","lc":[30,35]}
{"level":"info","ts":1635540779.5954778,"logger":"hrlog","msg":"writer ready","lc":29,"capacity":65535}
{"level":"info","ts":1635540780.4467452,"logger":"Pcct","msg":"Init","pcct":"0x17ca7f440"}
{"level":"info","ts":1635540780.4468422,"logger":"Pit","msg":"Init","pit":"0x17ca7f460","pcct":"0x17ca7f440"}
{"level":"info","ts":1635540780.4468634,"logger":"MinTmr","msg":"New","sched":"0x22002083c0","slots":"4096","interval":"73333333","cb":"0x411477"}
{"level":"info","ts":1635540780.446892,"logger":"Cs","msg":"Init","cs":"0x17ca7f4c0","arc":"0x17ca7f4c0","pcct":"0x17ca7f440","cap-md":"524287","cap-mi":"524287"}
{"level":"info","ts":1635540788.2127538,"logger":"eal","msg":"vdev initialized","name":"crypto_openssl_K000000000000000d","args":"max_nb_queue_pairs=1,socket_id=0","socket":0}
{"level":"info","ts":1635540788.213733,"logger":"main","msg":"activate success","role":"forwarder"}
{"level":"info","ts":1635540788.218962,"logger":"FwCrypto.3","msg":"Run","fwc":"0x22002064c0","input":"0x2200206000","pool":"0x211c37740","cryptodev":"0-0"}
{"level":"info","ts":1635540788.2226644,"logger":"FwFwd.6","msg":"Run","fwd-id":"0","fwd":"0x2200218cc0","fib":"0x21523d000","pit":"0x17ca7f460","cs":"0x17ca7f4c0","crypto":"0x2200206000"}
{"level":"warn","ts":1635540789.6231358,"logger":"DPDK","msg":"rte_pmd_memif_probe(): Failed to register mp action callback: Operation not supported"}
{"level":"info","ts":1635540789.6237981,"logger":"eal","msg":"vdev initialized","name":"net_memifK000000000000000f","args":"bsize=9000,rsize=10,socket=/run/ndn/fileserver.sock,socket-abstract=no,mac=F2:6D:65:6D:69:66,id=0,role=server","socket":"any"}
{"level":"info","ts":1635540789.6238627,"logger":"ethface","msg":"port opened","port":0}
{"level":"info","ts":1635540789.624301,"logger":"ethface","msg":"impl initialized","port":0,"impl":"RxMemif"}
{"level":"info","ts":1635540805.054747,"logger":"ethdev","msg":"ethdev started","id":0,"name":"net_memifK000000000000000f","mtu":"unchanged","rxq":1,"txq":1,"promisc":false}
{"level":"info","ts":1635540805.0548337,"logger":"ethface","msg":"face started","port":0,"impl":"RxMemif","face":16759}
{"level":"info","ts":1635540805.054881,"logger":"iface","msg":"face created","id":16759,"socket":0,"mtu":9014,"locator":{"rxQueueSize":64,"txQueueSize":64,"role":"server","socketName":"/run/ndn/fileserver.sock","id":0,"dataroom":9000,"ringCapacity":1024}}
{"level":"info","ts":1635540825.5462806,"logger":"eal","msg":"vdev initialized","name":"net_memifK000000000000001a","args":"id=0,role=server,bsize=9000,rsize=10,socket=/run/ndn/ndnc-memif-125-1635540825541892366.sock,socket-abstract=no,mac=F2:6D:65:6D:69:66","socket":"any"}
{"level":"info","ts":1635540825.5463712,"logger":"ethface","msg":"port opened","port":1}
{"level":"warn","ts":1635540825.5464056,"logger":"DPDK","msg":"rte_pmd_memif_probe(): Failed to register mp action callback: Operation not supported"}
{"level":"panic","ts":1635540825.5466044,"logger":"eal","msg":"ZmallocAligned failed","type":"FaceImpl","size":2048,"socket":1}
panic: ZmallocAligned failed

goroutine 21 [running, locked to thread]:
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc0003a4300, {0xc0003a4f00, 0x3, 0x3})
    /root/go/pkg/mod/go.uber.org/zap@v1.19.1/zapcore/entry.go:232 +0x446
go.uber.org/zap.(*Logger).Panic(0x7fb3102619f0, {0xaf8b99, 0x100000040}, {0xc0003a4f00, 0x3, 0x3})
    /root/go/pkg/mod/go.uber.org/zap@v1.19.1/logger.go:230 +0x59
github.com/usnistgov/ndn-dpdk/dpdk/eal.ZmallocAligned({0xae2808, 0x8}, {0x9e2040, 0xc000121788}, 0x1, {0xc0001ece78})
    /root/ndn-dpdk/dpdk/eal/malloc.go:37 +0x5a9
github.com/usnistgov/ndn-dpdk/iface.newFace({{0x40, 0x400, 0x2336, 0x2336}, {0x2}, 0x300, 0xc0003d98f0, 0xc0003dabb8, 0xc0000d2eb0, 0xc0000d2ec0, ...})
    /root/ndn-dpdk/iface/face.go:187 +0x5d9
github.com/usnistgov/ndn-dpdk/iface.New.func1()
    /root/ndn-dpdk/iface/face.go:163 +0x58
reflect.Value.call({0x9d57a0, 0xc0003d3460, 0x41b594}, {0xadb8f2, 0x4}, {0x0, 0x0, 0x426e10})
    /usr/local/go/src/reflect/value.go:543 +0x814
reflect.Value.Call({0x9d57a0, 0xc0003d3460, 0xc0003d9920}, {0x0, 0x0, 0x0})
    /usr/local/go/src/reflect/value.go:339 +0xc5
github.com/usnistgov/ndn-dpdk/core/cptr.Call.func1()
    /root/ndn-dpdk/core/cptr/function.go:132 +0xc5
github.com/usnistgov/ndn-dpdk/core/cptr.ZeroFunctionType.Void.func1()
    /root/ndn-dpdk/core/cptr/function.go:108 +0x1b
github.com/usnistgov/ndn-dpdk/core/cptr.go_functionGo0_once(0xc0002b0340)
    /root/ndn-dpdk/core/cptr/function.go:182 +0x8d
github.com/usnistgov/ndn-dpdk/dpdk/spdkenv._Cfunc_c_SpdkThread_Run(0x22002e9500)
    _cgo_gotypes.go:266 +0x48
github.com/usnistgov/ndn-dpdk/dpdk/spdkenv.(*Thread).main.func1(0xc00059a098)
    /root/ndn-dpdk/dpdk/spdkenv/thread.go:59 +0x3b
github.com/usnistgov/ndn-dpdk/dpdk/spdkenv.(*Thread).main(0xc0001330f8)
    /root/ndn-dpdk/dpdk/spdkenv/thread.go:59 +0x19
github.com/usnistgov/ndn-dpdk/dpdk/spdkenv.InitMainThread(0xc0002fe3c0)
    /root/ndn-dpdk/dpdk/spdkenv/init.go:69 +0x51e
github.com/usnistgov/ndn-dpdk/dpdk/ealinit.Init.func1.1()
    /root/ndn-dpdk/dpdk/ealinit/ealinit.go:57 +0xcb
created by github.com/usnistgov/ndn-dpdk/dpdk/ealinit.Init.func1
    /root/ndn-dpdk/dpdk/ealinit/ealinit.go:46 +0xbd

yoursunny commented 3 years ago

Node Pages Size Total
0    1024  2Mb    2Gb
0    100   1Gb    100Gb

Don't configure both 2MB and 1GB hugepages. Stick to only one size.

{"level":"info","ts":1635540778.9142141,"logger":"DPDK","msg":"EAL: 98 hugepages of size 1073741824 reserved, but no mounted hugetlbfs found for that size"}

You are missing a hugetlbfs mount for the 1GB page size (or it is not mounted into the container), so those hugepages are unusable.
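
One way to check for this from a shell, as a minimal sketch: the function below scans a mounts table for hugetlbfs entries and prints each mount point with its page size. The function name `list_hugetlbfs_mounts` and its optional file argument are made up for illustration, and the sketch assumes the kernel reports a `pagesize=` option in /proc/mounts (a 1GB mount typically appears as `pagesize=1024M`).

```shell
# List hugetlbfs mount points and their page sizes from a mounts table.
# Reads /proc/mounts by default; pass another file to test against a fixture.
# (Hypothetical helper, not part of NDN-DPDK or DPDK.)
list_hugetlbfs_mounts() {
  mounts_file="${1:-/proc/mounts}"
  # Fields in each line: device mountpoint fstype options dump pass
  awk '$3 == "hugetlbfs" {
    size = "default"                      # used when no pagesize= option is listed
    n = split($4, opts, ",")
    for (i = 1; i <= n; i++) {
      if (opts[i] ~ /^pagesize=/) size = substr(opts[i], 10)
    }
    print $2, size
  }' "$mounts_file"
}
```

Running `list_hugetlbfs_mounts` on the host and again inside the container shows whether a mount with the 1GB page size is visible in both places; if the container prints no such line, the mount has not been passed through.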


It's recommended to set up hugepages using the dpdk-hugepages.py script. Don't use the kernel command line or other scripts. See Docker.md for instructions on how to extract the script from the container image.

Example for 100GB on socket 0 only:

sudo sh -c "while ! dpdk-hugepages.py --pagesize 1G --node 0 --setup 100G; do sleep 1; done"

YuanhWu commented 3 years ago

I used dpdk-hugepages.py, and the hugepages are now:

Node Pages Size Total
0    100   1Gb    100Gb

I also reran the forwarder container like this:

docker run -d --privileged --name fw \
  --cap-add IPC_LOCK --cap-add NET_ADMIN --cap-add SYS_ADMIN --cap-add SYS_NICE \
  --mount type=bind,source=/dev/hugepages,target=/dev/hugepages \
  --mount type=volume,source=run-ndn,target=/run/ndn ndn-dpdk

However, I still get the face creation problem when I start a consumer application. The forwarder logs are:

{"level":"info","ts":1635544778.7097278,"logger":"main","msg":"NDN-DPDK service starting","version":"v0.0.0-20211007141541-9160e6d38d82","uid":0,"linux":"5.4.0-87-generic","dpdk":"DPDK 21.08.0","spdk":"SPDK v21.07"}
{"level":"info","ts":1635544778.7105346,"logger":"main","msg":"GraphQL HTTP server starting","listen":":3030"}
{"level":"info","ts":1635544779.7699943,"logger":"main","msg":"activate start","role":"forwarder"}
{"level":"warn","ts":1635544779.8510532,"logger":"DPDK","msg":"EAL: No available 2048 kB hugepages reported"}
{"level":"warn","ts":1635544779.85115,"logger":"DPDK","msg":"EAL: No free 2048 kB hugepages reported on node 0"}
{"level":"warn","ts":1635544779.8511784,"logger":"DPDK","msg":"EAL: No free 2048 kB hugepages reported on node 1"}
{"level":"warn","ts":1635544779.851203,"logger":"DPDK","msg":"EAL: No available 2048 kB hugepages reported"}
{"level":"warn","ts":1635544779.8512337,"logger":"DPDK","msg":"EAL: No free 1048576 kB hugepages reported on node 1"}
{"level":"warn","ts":1635544780.4784117,"logger":"DPDK.0","msg":"TELEMETRY: No legacy callbacks, legacy socket not created"}
{"level":"info","ts":1635544780.4783936,"logger":"ealinit","msg":"EAL ready","args":"-l 0,1,2,3,4,5,6,7,8,9,10,11,24,25,26,27,28,29,30,31,32,33,34,35,12,13,14,15,16,17,18,19,20,21,22,23,36,37,38,39,40,41,42,43,44,45,46,47 --socket-limit 102400,1 --in-memory --single-file-segments -d /usr/local/lib/dpdk/pmds-21.3 --no-pci","main":0,"workers":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47],"sockets":[0,1]}
{"level":"info","ts":1635544780.5262496,"logger":"SPDK.0","msg":"SPDK logger ready"}
{"level":"info","ts":1635544780.5816317,"logger":"ealthread","msg":"lcores configured","role":"PDUMP","lc":[25]}
{"level":"info","ts":1635544780.5816653,"logger":"ealthread","msg":"lcores configured","role":"HRLOG","lc":[7]}
{"level":"info","ts":1635544780.5818112,"logger":"ealthread","msg":"lcores configured","role":"FWD","lc":[4]}
{"level":"info","ts":1635544780.58182,"logger":"ealthread","msg":"lcores configured","role":"RX","lc":[5,31]}
{"level":"info","ts":1635544780.5818257,"logger":"ealthread","msg":"lcores configured","role":"TX","lc":[34,10]}
{"level":"info","ts":1635544780.5818317,"logger":"ealthread","msg":"lcores configured","role":"CRYPTO","lc":[3]}
{"level":"info","ts":1635544780.5819283,"logger":"hrlog","msg":"writer ready","lc":7,"capacity":65535}
{"level":"info","ts":1635544781.3920805,"logger":"Pcct","msg":"Init","pcct":"0x17c897d80"}
{"level":"info","ts":1635544781.3921213,"logger":"Pit","msg":"Init","pit":"0x17c897da0","pcct":"0x17c897d80"}
{"level":"info","ts":1635544781.3921437,"logger":"MinTmr","msg":"New","sched":"0x1a0737700","slots":"4096","interval":"73333333","cb":"0x411477"}
{"level":"info","ts":1635544781.3921647,"logger":"Cs","msg":"Init","cs":"0x17c897e00","arc":"0x17c897e00","pcct":"0x17c897d80","cap-md":"524287","cap-mi":"524287"}
{"level":"info","ts":1635544789.1595993,"logger":"eal","msg":"vdev initialized","name":"crypto_openssl_K000000000000000d","args":"max_nb_queue_pairs=1,socket_id=0","socket":0}
{"level":"info","ts":1635544789.1603754,"logger":"main","msg":"activate success","role":"forwarder"}
{"level":"info","ts":1635544789.16404,"logger":"FwCrypto.3","msg":"Run","fwc":"0x211db7600","input":"0x211db7140","pool":"0x211c34f00","cryptodev":"0-0"}
{"level":"info","ts":1635544789.1661315,"logger":"FwFwd.4","msg":"Run","fwd-id":"0","fwd":"0x17cb98840","fib":"0x21523bbc0","pit":"0x17c897da0","cs":"0x17c897e00","crypto":"0x211db7140"}
{"level":"warn","ts":1635544790.7286015,"logger":"DPDK","msg":"rte_pmd_memif_probe(): Failed to register mp action callback: Operation not supported"}
{"level":"info","ts":1635544790.7287397,"logger":"eal","msg":"vdev initialized","name":"net_memifK000000000000000f","args":"id=0,role=server,bsize=9000,rsize=10,socket=/run/ndn/fileserver.sock,socket-abstract=no,mac=F2:6D:65:6D:69:66","socket":"any"}
{"level":"info","ts":1635544790.7288125,"logger":"ethface","msg":"port opened","port":0}
{"level":"info","ts":1635544790.730524,"logger":"ethface","msg":"impl initialized","port":0,"impl":"RxMemif"}
{"level":"info","ts":1635544805.9143846,"logger":"ethdev","msg":"ethdev started","id":0,"name":"net_memifK000000000000000f","mtu":"unchanged","rxq":1,"txq":1,"promisc":false}
{"level":"info","ts":1635544805.9144914,"logger":"ethface","msg":"face started","port":0,"impl":"RxMemif","face":52768}
{"level":"info","ts":1635544805.9145367,"logger":"iface","msg":"face created","id":52768,"socket":0,"mtu":9014,"locator":{"rxQueueSize":64,"txQueueSize":64,"role":"server","socketName":"/run/ndn/fileserver.sock","id":0,"dataroom":9000,"ringCapacity":1024}}
{"level":"info","ts":1635545081.287487,"logger":"eal","msg":"vdev initialized","name":"net_memifK000000000000001a","args":"socket=/run/ndn/memif-1-1635545081206147451.sock,socket-abstract=no,mac=F2:6D:65:6D:69:66,id=1,role=server,bsize=9000,rsize=10","socket":"any"}
{"level":"warn","ts":1635545081.2874968,"logger":"DPDK","msg":"rte_pmd_memif_probe(): Failed to register mp action callback: Operation not supported"}
{"level":"info","ts":1635545081.2875795,"logger":"ethface","msg":"port opened","port":1}
{"level":"info","ts":1635545081.2897213,"logger":"ethvdev","msg":"memif SocketOwner changed","socketName":"/run/ndn/memif-1-1635545081206147451.sock","uid":0,"gid":0}
{"level":"panic","ts":1635545081.298239,"logger":"eal","msg":"ZmallocAligned failed","type":"FaceImpl","size":2048,"socket":1}
panic: ZmallocAligned failed

goroutine 35 [running, locked to thread]:
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc0001f4480, {0xc0001f4600, 0x3, 0x3})
    /root/go/pkg/mod/go.uber.org/zap@v1.19.1/zapcore/entry.go:232 +0x446
go.uber.org/zap.(*Logger).Panic(0x7feaf825a1d0, {0xaf8b99, 0x100000040}, {0xc0001f4600, 0x3, 0x3})
    /root/go/pkg/mod/go.uber.org/zap@v1.19.1/logger.go:230 +0x59
github.com/usnistgov/ndn-dpdk/dpdk/eal.ZmallocAligned({0xae2808, 0x8}, {0x9e2040, 0xc0001225f8}, 0x1, {0xc000346e78})
    /root/ndn-dpdk/dpdk/eal/malloc.go:37 +0x5a9
github.com/usnistgov/ndn-dpdk/iface.newFace({{0x40, 0x400, 0x2336, 0x2336}, {0x2}, 0x300, 0xc000259cb0, 0xc00050cdf8, 0xc0002df1d0, 0xc0002df1e0, ...})
    /root/ndn-dpdk/iface/face.go:187 +0x5d9
github.com/usnistgov/ndn-dpdk/iface.New.func1()
    /root/ndn-dpdk/iface/face.go:163 +0x58
reflect.Value.call({0x9d57a0, 0xc000311020, 0x41b594}, {0xadb8f2, 0x4}, {0x0, 0x0, 0x426e10})
    /usr/local/go/src/reflect/value.go:543 +0x814
reflect.Value.Call({0x9d57a0, 0xc000311020, 0xc000259d10}, {0x0, 0x0, 0x0})
    /usr/local/go/src/reflect/value.go:339 +0xc5
github.com/usnistgov/ndn-dpdk/core/cptr.Call.func1()
    /root/ndn-dpdk/core/cptr/function.go:132 +0xc5
github.com/usnistgov/ndn-dpdk/core/cptr.ZeroFunctionType.Void.func1()
    /root/ndn-dpdk/core/cptr/function.go:108 +0x1b
github.com/usnistgov/ndn-dpdk/core/cptr.go_functionGo0_once(0xc0004a61a0)
    /root/ndn-dpdk/core/cptr/function.go:182 +0x8d
github.com/usnistgov/ndn-dpdk/dpdk/spdkenv._Cfunc_c_SpdkThread_Run(0x17cce9200)
    _cgo_gotypes.go:266 +0x48
github.com/usnistgov/ndn-dpdk/dpdk/spdkenv.(*Thread).main.func1(0xc000310018)
    /root/ndn-dpdk/dpdk/spdkenv/thread.go:59 +0x3b
github.com/usnistgov/ndn-dpdk/dpdk/spdkenv.(*Thread).main(0xc0004b2018)
    /root/ndn-dpdk/dpdk/spdkenv/thread.go:59 +0x19
github.com/usnistgov/ndn-dpdk/dpdk/spdkenv.InitMainThread(0xc0001f4480)
    /root/ndn-dpdk/dpdk/spdkenv/init.go:69 +0x51e
github.com/usnistgov/ndn-dpdk/dpdk/ealinit.Init.func1.1()
    /root/ndn-dpdk/dpdk/ealinit/ealinit.go:57 +0xcb
created by github.com/usnistgov/ndn-dpdk/dpdk/ealinit.Init.func1
    /root/ndn-dpdk/dpdk/ealinit/ealinit.go:46 +0xbd

yoursunny commented 3 years ago

{"level":"panic","ts":1635545081.298239,"logger":"eal","msg":"ZmallocAligned failed","type":"FaceImpl","size":2048,"socket":1}

NDN-DPDK is trying to allocate memory on socket 1, but there are no hugepages on that socket, so the allocation fails. DPDK does not offer a way to pin a virtual device to a particular NUMA socket.
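
To see how hugepages and CPU cores are spread across NUMA nodes on a given machine, here is a minimal sketch. The two functions read the standard sysfs NUMA directories and print, per node, the number of hugepages of each size and the CPU cores that belong to the node. The function names and the optional root-directory argument are made up for illustration; the argument exists only so the functions can be pointed at a test directory.

```shell
# Print "node<N> <pagesize_kB> <nr_hugepages>" for every NUMA node and page size.
# (Hypothetical helper; defaults to the real sysfs location.)
show_hugepages_per_node() {
  sysfs_root="${1:-/sys/devices/system/node}"
  for f in "$sysfs_root"/node*/hugepages/hugepages-*kB/nr_hugepages; do
    [ -r "$f" ] || continue                        # skip if the glob matched nothing
    size_dir=$(dirname "$f")                       # .../node0/hugepages/hugepages-1048576kB
    node_dir=$(dirname "$(dirname "$size_dir")")   # .../node0
    size=$(basename "$size_dir")
    size=${size#hugepages-}                        # strip the prefix
    size=${size%kB}                                # strip the unit suffix
    echo "$(basename "$node_dir") $size $(cat "$f")"
  done
}

# Print "node<N> <cpulist>" for every NUMA node.
show_cpus_per_node() {
  sysfs_root="${1:-/sys/devices/system/node}"
  for f in "$sysfs_root"/node*/cpulist; do
    [ -r "$f" ] || continue
    echo "$(basename "$(dirname "$f")") $(cat "$f")"
  done
}
```

With the hugepage setup shown earlier in this thread, `show_hugepages_per_node` would be expected to report 100 pages of 1048576 kB on node0 and none on node1, while `show_cpus_per_node` shows which lcores sit on the node that has no memory.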

If you want this NDN-DPDK instance to run only on socket 0, you also need to limit its CPU cores to socket 0. You may do so through one of these options:

YuanhWu commented 3 years ago

Thanks, it works!