ANLAB-KAIST / NBA

Network Balancing Act: A High-performance packet processing framework for heterogeneous processors
MIT License

Packet and PacketBatch refactoring #1

Closed achimnol closed 9 years ago

achimnol commented 9 years ago

Revive the Packet wrapper (lib/packet.hh) and move annotations into rte_mbuf to reduce batching overheads.
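To illustrate the idea, here is a minimal sketch of a thin Packet wrapper whose annotations live inside the packet buffer's own metadata, so a batch only carries pointers and needs no parallel annotation array. Everything here is hypothetical: `FakeMbuf` is a stand-in for `rte_mbuf`, and the slot count, `select_by_anno`, and the `PacketBatch` layout are assumptions for illustration, not the actual NBA or DPDK definitions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a DPDK rte_mbuf (NOT the real layout).
struct FakeMbuf {
    std::array<std::uint8_t, 2048> data{};  // packet bytes
    std::uint16_t data_len = 0;             // valid bytes in data
    std::array<std::int64_t, 4> anno{};     // annotation slots stored inline
};

// Thin wrapper: it owns nothing and reads/writes annotations in the buffer.
class Packet {
public:
    explicit Packet(FakeMbuf *m) : mbuf_(m) { assert(m != nullptr); }
    void set_anno(std::size_t slot, std::int64_t v) { mbuf_->anno.at(slot) = v; }
    std::int64_t anno(std::size_t slot) const { return mbuf_->anno.at(slot); }
    std::uint16_t length() const { return mbuf_->data_len; }
private:
    FakeMbuf *mbuf_;
};

// A batch holds only buffer pointers; annotations travel with each buffer.
struct PacketBatch {
    std::vector<FakeMbuf *> pkts;
};

// Reorganizing a batch copies pointers only, never annotation storage.
PacketBatch select_by_anno(const PacketBatch &in, std::size_t slot,
                           std::int64_t value) {
    PacketBatch out;
    for (FakeMbuf *m : in.pkts) {
        if (Packet(m).anno(slot) == value) out.pkts.push_back(m);
    }
    return out;
}
```

In this sketch, splitting or reordering a batch moves one pointer per packet, which is the kind of overhead reduction the refactoring aims for.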

achimnol commented 9 years ago

CPU-only performance of commit 765432d:

shader-seattle: ~/nba/scripts [git:packet-batch-refactor]
> sudo ./run_throughput.py -p 64,128,256,512,1024,1500 rss.py ipv4-router-cpuonly.click
rss.py ipv4-router-cpuonly.click
io-batch-size  comp-batch-size  coproc-ppdepth  pkt-size  Mpps   Gbps   cpu-io-usr  cpu-io-sys  cpu-coproc-usr  cpu-coproc-sys
64             64               32              64        87.05  61.28  98.12       1.85        0.18            0.21
64             64               32              128       65.16  79.23  94.11       3.43        0.16            1.06
64             64               32              256       35.72  80.00  91.62       5.90        0.19            1.06
64             64               32              512       18.66  80.00  88.59       8.93        0.22            1.00
64             64               32              1024      9.54   80.00  84.35       13.16       0.19            1.00
64             64               32              1500      0.00   0.00   72.43       25.21       0.16            0.25
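The Mpps and Gbps columns of the router runs appear to be related by a fixed per-packet wire overhead. The sketch below assumes 24 extra bytes per packet, a value inferred from the reported rows (it would be consistent with 4 bytes of CRC plus 20 bytes of preamble and inter-frame gap), and it is not taken from the NBA scripts. The function names are hypothetical. The relation does not reproduce the IPsec rows, where packets grow during processing.

```cpp
#include <cassert>

// Assumption: 24 bytes of per-packet wire overhead, inferred from the table.
constexpr double kWireOverheadBytes = 24.0;

// Convert a packet rate (Mpps) to wire throughput (Gbps).
double mpps_to_gbps(double mpps, int pkt_size_bytes) {
    assert(pkt_size_bytes > 0);
    return mpps * (pkt_size_bytes + kWireOverheadBytes) * 8.0 / 1000.0;
}

// Packet rate (Mpps) needed to fill a link of the given capacity (Gbps).
double line_rate_mpps(double gbps, int pkt_size_bytes) {
    assert(pkt_size_bytes > 0);
    return gbps * 1000.0 / ((pkt_size_bytes + kWireOverheadBytes) * 8.0);
}
```

Under this assumption, `mpps_to_gbps(87.05, 64)` comes out at about 61.28, matching the 64 B row, and `line_rate_mpps(80.0, 256)` is about 35.71, close to the 35.72 Mpps reported once throughput reaches 80 Gbps.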

IPv4 performance is more or less the same.

shader-seattle: ~/nba/scripts [git:packet-batch-refactor]
> sudo ./run_throughput.py -p 64,128,256,512,1024,1500 rss.py ipv6-router-cpuonly.click
rss.py ipv6-router-cpuonly.click
io-batch-size  comp-batch-size  coproc-ppdepth  pkt-size  Mpps   Gbps   cpu-io-usr  cpu-io-sys  cpu-coproc-usr  cpu-coproc-sys
64             64               32              64        39.99  28.15  96.31       1.34        0.19            0.25
64             64               32              128       39.39  47.90  96.28       1.21        0.19            1.50
64             64               32              256       35.72  80.00  95.54       2.11        0.22            0.28
64             64               32              512       18.66  80.00  91.61       5.86        0.16            1.53
64             64               32              1024      9.54   80.00  86.59       10.88       0.22            1.47
64             64               32              1500      0.00   0.00   71.29       26.22       0.12            1.41

IPv6 performance has improved! (14% for 64 B packets, 19% for 128 B packets)

shader-seattle: ~/nba/scripts [git:packet-batch-refactor]
> sudo ./run_throughput.py -p 64,128,256,512,1024,1500 rss.py ipsec-encryption-cpuonly.click
rss.py ipsec-encryption-cpuonly.click
io-batch-size  comp-batch-size  coproc-ppdepth  pkt-size  Mpps   Gbps   cpu-io-usr  cpu-io-sys  cpu-coproc-usr  cpu-coproc-sys
64             64               32              64        11.32  15.04  98.89       1.10        0.18            0.25
64             64               32              128       9.13   16.80  95.81       1.71        0.16            1.10
64             64               32              256       7.43   21.29  95.51       2.00        0.16            1.06
64             64               32              512       5.43   26.67  95.61       1.90        0.22            1.03
64             64               32              1024      3.51   31.57  95.49       2.03        0.16            1.00
64             64               32              1500      0.00   0.00   73.45       24.18       0.19            0.25

IPsec performance is more or less the same.

(The packet generator has problems with 1500 B packets; this is not a problem in NBA.)

achimnol commented 9 years ago

I've confirmed that it also works well with branches. Improving branch prediction (or otherwise reducing batch reorganization overheads) will be handled in a separate issue.