scionproto / scion

SCION Internet Architecture
https://scion.org
Apache License 2.0

router: poor performance over real NICs, despite decent performance over veth #4593

Open jiceatscion opened 1 month ago

jiceatscion commented 1 month ago

The router code itself can demonstrably forward 800K small packets per second and 10Gb/s of traffic in larger (2K) packets when benchmarked over veth. However, the observed performance is less than half of that when using real NICs, including 10GigE NICs.
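To put the two veth figures on the same scale, here is a rough back-of-the-envelope sketch in Go. It assumes "2K" means 2048 bytes per packet and ignores Ethernet framing overhead; the helper names `packetsPerSecond` and `perPacketBudget` are made up for illustration and are not part of the router code.

```go
package main

import (
	"fmt"
	"time"
)

// packetsPerSecond returns how many packets of packetBytes bytes fit into
// bitsPerSecond of bandwidth. Framing overhead is ignored (assumption).
func packetsPerSecond(bitsPerSecond float64, packetBytes int) float64 {
	return bitsPerSecond / float64(packetBytes*8)
}

// perPacketBudget returns the average time available to handle one packet
// when sustaining the given packet rate.
func perPacketBudget(pps float64) time.Duration {
	return time.Duration(float64(time.Second) / pps)
}

func main() {
	// Assumption: "2K packets" means 2048 bytes each.
	large := packetsPerSecond(10e9, 2048)
	fmt.Printf("10 Gb/s at 2048 B/packet: %.0f packets/s\n", large)
	fmt.Printf("budget per packet at 800K packets/s: %v\n", perPacketBudget(800e3))
	fmt.Printf("budget per packet at %.0f packets/s: %v\n", large, perPacketBudget(large))
}
```

Under these assumptions the 10Gb/s case works out to roughly 610K packets per second, so both benchmarks leave a budget on the order of a microsecond or two per packet. Any extra per-packet cost on the real-NIC path is therefore significant.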

Since the processing code isn't the bottleneck, the cause has to be either the effect of the real NICs' activity on the overall system (e.g. the interrupt processing overhead), or the impact of the API used by the router (a regular UDP socket) on real versus virtual I/O.

Creating this work item to track investigation and resolution.

jiceatscion commented 19 hours ago

The ultimate performance would come from using some XDP-based approach. The DPDK framework seems less work than doing it from scratch (https://doc.dpdk.org/guides/index.html); however, using it from Go code isn't trivial either, though it is apparently feasible (https://pkg.go.dev/github.com/yerden/go-dpdk, https://pkg.go.dev/github.com/millken/dpdk-go).

This article (https://medium.com/@pavel.odintsov/capturing-packets-in-linux-at-a-speed-of-millions-of-packets-per-second-without-using-third-party-ef782fe8959d) makes a series of less involved suggestions that do not require XDP/DPDK; maybe that's a worthy first step. There is a significant drawback: the ring API is meant for traffic sniffing, so packets whose destination matches the local host are duplicated: one copy goes to the sniffer's ring and the other to the kernel's network stack. To prevent that we'd need to play games; possibly games that nullify the benefits, but maybe not. One way is to filter the traffic with an ingress qdisc: https://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.adv-qdisc.ingress.html (here's a tutorial: https://www.dasblinkenlichten.com/working-with-tc-on-linux-systems/).

If my understanding is correct, the following sequence would suppress all traffic from eth0 to the kernel stack, leaving only the ring's copy:

$ tc qdisc add dev eth0 ingress
$ tc filter add dev eth0  parent ffff: matchall action drop

... most likely the packet still gets copied before being dropped. Too bad.
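For reference, here is a minimal Go sketch of the sizing rules a ring would have to satisfy, assuming "the ring API" above refers to the Linux AF_PACKET `PACKET_RX_RING` option and its `struct tpacket_req`. The type `ringGeometry` and the function `newRingGeometry` are hypothetical names. The sketch only validates sizes; it does not open a socket, and the minimum-frame-size (header length) check is omitted.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// tpacketAlignment mirrors TPACKET_ALIGNMENT from <linux/if_packet.h>.
const tpacketAlignment = 16

// ringGeometry holds the four values carried by struct tpacket_req.
// The type itself is hypothetical, for illustration only.
type ringGeometry struct {
	BlockSize  uint32 // bytes per block, a multiple of the page size
	BlockCount uint32 // number of blocks
	FrameSize  uint32 // bytes per frame, a multiple of tpacketAlignment
	FrameCount uint32 // frames per block times BlockCount
}

// newRingGeometry checks the sizing constraints and derives the frame count.
func newRingGeometry(pageSize, blockSize, blockCount, frameSize uint32) (ringGeometry, error) {
	switch {
	case pageSize == 0 || blockSize == 0 || blockSize%pageSize != 0:
		return ringGeometry{}, errors.New("block size must be a non-zero multiple of the page size")
	case frameSize == 0 || frameSize%tpacketAlignment != 0:
		return ringGeometry{}, errors.New("frame size must be a non-zero multiple of 16")
	case frameSize > blockSize:
		return ringGeometry{}, errors.New("frame size must not exceed block size")
	case blockCount == 0:
		return ringGeometry{}, errors.New("block count must be non-zero")
	}
	framesPerBlock := blockSize / frameSize
	return ringGeometry{
		BlockSize:  blockSize,
		BlockCount: blockCount,
		FrameSize:  frameSize,
		FrameCount: framesPerBlock * blockCount,
	}, nil
}

func main() {
	page := uint32(os.Getpagesize())
	// Example values only: 64 blocks of 4 pages each, 2048-byte frames.
	g, err := newRingGeometry(page, 4*page, 64, 2048)
	if err != nil {
		fmt.Println("invalid ring geometry:", err)
		os.Exit(1)
	}
	mmapBytes := uint64(g.BlockSize) * uint64(g.BlockCount)
	fmt.Printf("%+v, %d bytes to mmap\n", g, mmapBytes)
}
```

A real receiver would pass these values to `setsockopt` on an AF_PACKET socket and then `mmap` the ring, which requires elevated privileges and is left out here.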