snabbco / snabb

Snabb: Simple and fast packet networking
Apache License 2.0

Idea: Double the performance of Snabb NFV #710

Open lukego opened 8 years ago

lukego commented 8 years ago

Here is a fun idea to play with in the background: How about doubling the performance of Snabb NFV?

Why

Our original performance target was to handle 10 Gbps per core for realistic ISP workloads based on a reference processor that Intel use in their NFV case studies. Such performance was unheard of with Virtio-net at that time and we had to write the QEMU code to make it possible. There were few if any virtual machines available that were optimized to keep up with millions of packets per second on their Virtio-net interfaces.

Time is moving forwards though. High-speed Virtio-net is an established idea now. Optimized applications like Igalia's snabb-lwaftr can use the available capacity and more. The clock speeds on Intel's latest high-end CPUs are lower than those of the previous generation.

Doubling the performance of Snabb NFV would be valuable. On the one hand it would enable efficient deployments with only one reserved core per 20G of traffic. On the other hand it would provide a performance buffer for maintaining 10G per core performance in the presence of complications like slow processors, NUMA mismatches, and performance-affecting bugs.
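To put the 10G and 20G per core targets in per-packet terms, here is a rough back-of-the-envelope sketch. The frame sizes and the 2.4 GHz clock are assumptions chosen for illustration, not figures from the reference processor mentioned above; the 20 bytes of per-frame overhead are the standard Ethernet preamble and inter-frame gap.

```python
# Rough per-packet cycle budget for a single core at a given line rate.
# Assumed values (not from the issue): frame sizes and the 2.4 GHz clock.

ETHERNET_OVERHEAD_BYTES = 20  # preamble + SFD (8 bytes) + inter-frame gap (12 bytes)


def packets_per_second(link_gbps, frame_bytes):
    """Maximum frames per second on the wire for a given frame size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire


def cycles_per_packet(link_gbps, frame_bytes, core_ghz):
    """CPU cycles available per packet if one core must keep up with the link."""
    return core_ghz * 1e9 / packets_per_second(link_gbps, frame_bytes)


if __name__ == "__main__":
    core_ghz = 2.4  # hypothetical clock speed
    for gbps in (10, 20):
        for frame in (64, 512, 1518):
            pps = packets_per_second(gbps, frame)
            budget = cycles_per_packet(gbps, frame, core_ghz)
            print(f"{gbps}G, {frame}-byte frames: "
                  f"{pps / 1e6:.2f} Mpps, {budget:.0f} cycles/packet")
```

Doubling the traffic per core halves the cycle budget per packet, which is why each contributor to the processing cost would need to become roughly twice as cheap.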

How

Snabb NFV processing cost is split fairly evenly between:

and also for client/server applications like iperf/apache/postgresql/etc:

So one optimization strategy would be to double the performance of each item on that list.

Here is an initial sketch of how that might play out:

Here is a placeholder list to see how we are doing:

And interesting performance milestones:

kbara commented 8 years ago

That would be amazing.

mwiget commented 8 years ago

I'm currently using VMDq to create two logical interfaces connecting to either side of Igalia's snabb-lwaftr. Initially this was just for testing over a single loopback, but it's actually very useful to place the IPv4 and/or IPv6 side into a VLAN or run the app "on a stick" using virtual MAC addresses. Taking this idea a step further, why not launch multiple lwaftr apps via VMDq over the same interface and optionally in the same VLAN (or untagged)? If you have read this far, you are probably shouting that this is exactly what snabbnfv does. And yes, it does, but wouldn't it be great to have separate snabb processes sharing a physical port via VMDq? That would give us/me an immediate boost by running multiple instances of lwaftr and spreading flows across them. @wingo already solved the issue of sharing binding tables among many processes via a shared memory-mapped file. Suddenly just doubling the performance of Snabb NFV feels so yesterday ;-).
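A conceptual sketch of that idea, for illustration only: VMDq classifies incoming frames in NIC hardware by destination MAC (and VLAN) and steers each to its own queue, so each process would see only its own traffic. The software model below mimics that classification; the instance names, MAC addresses, and function names are hypothetical and are not Snabb's API.

```python
# Software model of VMDq-style classification: (VLAN, destination MAC) -> owner.
# Real VMDq does this in NIC hardware; all names here are hypothetical.
from collections import defaultdict


def build_pool_table(instances):
    """Map each (vlan, mac) pair to the instance that owns it.

    `instances` maps a name to (vlan, mac); vlan is None for untagged traffic.
    """
    table = {}
    for name, (vlan, mac) in instances.items():
        key = (vlan, mac.lower())
        if key in table:
            raise ValueError(f"duplicate VLAN/MAC pair: {key}")
        table[key] = name
    return table


def dispatch(frames, table):
    """Group frames by owning instance; unmatched frames are grouped under None."""
    queues = defaultdict(list)
    for frame in frames:
        key = (frame.get("vlan"), frame["dst_mac"].lower())
        queues[table.get(key)].append(frame)
    return dict(queues)


if __name__ == "__main__":
    # Two hypothetical lwaftr instances sharing one untagged physical port.
    table = build_pool_table({
        "lwaftr-1": (None, "02:00:00:00:00:01"),
        "lwaftr-2": (None, "02:00:00:00:00:02"),
    })
    frames = [
        {"dst_mac": "02:00:00:00:00:01", "payload": b"a"},
        {"dst_mac": "02:00:00:00:00:02", "payload": b"b"},
        {"dst_mac": "02:00:00:00:00:01", "payload": b"c"},
    ]
    for owner, queue in dispatch(frames, table).items():
        print(owner, len(queue))
```

In this model the upstream router decides which instance handles a flow by choosing the destination MAC, so spreading flows across instances is a matter of how next hops are assigned.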