Closed muraliran closed 7 years ago
The vport kernel module is not optimized for high bits-per-second workloads. It doesn't support the TCP/IP offloading features that the Linux kernel can leverage (#390) to improve performance.
What's happening here is that jumbo-sized 64KB packets are being used with docker0 thanks to TSO/LRO, whereas standard 1.5KB packets are being used with BESS vports. Smaller packets mean a larger number of packets, increasing the load on the Linux network stack.
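To verify this, you can check whether segmentation offloads are enabled on the bridge and, for an apples-to-apples comparison, disable them during the test. A minimal sketch, assuming docker0 is the bridge under test (run on the host; feature availability varies by kernel):

```sh
# Show current offload settings on the bridge (TSO/GSO/GRO/LRO).
ethtool -k docker0 | grep -E 'tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload|large-receive-offload'

# Temporarily disable offloads so docker0 also moves ~1.5KB packets,
# matching what BESS vports see.
sudo ethtool -K docker0 tso off gso off gro off

# Re-run the qperf tests, then restore the defaults.
sudo ethtool -K docker0 tso on gso on gro on
```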
Since the performance bottleneck is not in BESS, it's likely that having more connections/ports/containers would increase the overall throughput. Also check `bessctl run perftest/loopback_vport`.
The default Docker bridge is just a plain Linux bridge, so I am not sure whether TSO/LRO is used. I think with BESS there is packet loss above 256-byte sizes; here is a bit more detailed view. I also see retransmits with TCP, which is why TCP bandwidth is severely hit. I haven't found why there are packet drops. This is a UDP bandwidth test; at sizes of 256 bytes or less, BESS does better, as there is no packet loss.
Both containers are running on the same host machine, so the NIC is never touched; it is just kernel-mode vs. user-mode switching.
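A few stock Linux commands that could help localize the retransmits and drops (a sketch; exact counter names vary slightly across kernel versions):

```sh
# TCP retransmission counters; compare before and after a qperf run.
netstat -s | grep -i retrans

# Per-connection retransmit and congestion-window details while the test runs.
ss -ti

# UDP drop counters ("receive buffer errors" indicates a full socket buffer).
netstat -su
```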
Almost all software interfaces in Linux support GSO/GRO by default (`ethtool -k docker0`). It would be clearer to look at the packet counters with `ifconfig`. Besides segmentation, there are also other things to consider:
`send()` will block until the receiver queue has some empty space. This is why we never see packet drops in local UDP streams. This flow-control mechanism cannot be applied to remote UDP streams, since BESS acts as a network here.

Thanks for the insight, Sangjin. You are correct: the packets are not dropped. I looked at the packet counts from `bessctl show pipeline` after each test, and all packets are reaching the destination. Looking at send_bw and recv_bw with qperf, I assumed packets were probably not arriving, but from what you say, it looks like this Linux rate limiting may explain why the bandwidth gets lower. I will study this more. This should work better than the Linux bridge; as we see from the qperf data above, for the Linux bridge the send and receive bandwidths are identical up to a 32K packet size. At 64K, the communication does not sync with either docker0 or BESS.
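For reference, these are the kinds of counters I compared before and after each run (a sketch, assuming the BESS daemon is running and docker0 is the kernel-side interface):

```sh
# Kernel-side counters (RX/TX packets, errors, drops) for the bridge.
ifconfig docker0

# BESS-side per-module packet counters for the vport pipeline.
bessctl show pipeline
```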
Is it possible, or do you plan, to use GSO/GRO with BESS?
The EM module looks for ARP requests and sends them to the ARP responder, which is working fine; you see only one lookup per ARP update. It is not much different from a standard switch that classifies the protocol first.
I was testing some performance numbers and do not understand why the bandwidth is so low. Two containers are connected by docker0 (172.17...) or through BESS (10.21...).
Why are tcp_bw and udp_bw so much lower using BESS? Am I missing anything in the config? I use one standard worker.
using docker0:

```
root@7cc9e062e6c6:/# ./qperf 172.17.0.3 udp_bw udp_lat
udp_bw:
    send_bw  =  3.66 GB/sec
    recv_bw  =  3.66 GB/sec
udp_lat:
    latency  =  8.42 us
root@7cc9e062e6c6:/# ./qperf 172.17.0.3 tcp_bw tcp_lat
tcp_bw:
    bw  =  3.73 GB/sec
tcp_lat:
    latency  =  9.8 us
```

using the BESS pipeline above:

```
root@7cc9e062e6c6:/# ./qperf 10.20.21.14 tcp_bw tcp_lat
tcp_bw:
    bw  =  559 MB/sec
tcp_lat:
    latency  =  10 us
root@7cc9e062e6c6:/# ./qperf 10.20.21.14 udp_bw udp_lat
udp_bw:
    send_bw  =  1.31 GB/sec
    recv_bw  =  937 MB/sec
udp_lat:
    latency  =  8.87 us
```