perfsonar/project

The perfSONAR project's primary wiki and issue tracker.

Creating separate task scheduling thread per interface #1311

Open · guhl1956 opened this issue 3 years ago

guhl1956 commented 3 years ago

Feature Suggestion: A multi-homed server with multiple interfaces dedicated to throughput testing could benefit from a separate, independent scheduling thread per interface. The current single schedule can become a bottleneck, causing a number of tests to fail to start due to scheduling conflicts. A schedule per interface would alleviate those collisions and allow multiple throughput tests (the exclusive category) to run simultaneously and independently of one another.
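To make the request concrete, here is roughly the workload this would enable, sketched with pScheduler's CLI. The hostnames and addresses are made up, and this assumes the throughput test's `--source` option to pin each test to one interface's address:

```bash
# Two exclusive-class throughput tests, each bound to a different
# dedicated interface via its source address. Today a single schedule
# serializes them; per-interface schedules would let them overlap.
pscheduler task throughput --source 10.0.1.10 --dest remote-a.example.net &
pscheduler task throughput --source 10.0.2.10 --dest remote-b.example.net &
wait
```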

mfeit-internet2 commented 3 years ago

This is a problem we've been thinking about since 4.0 under the banner of "resource management." A general solution that doesn't cause parallel measurements to distort each other's results on any machine is a tough nut to crack. Essentially, we'd have to understand and model the behavior of the applications we run and the innards of the wide variety of systems running them.

Historically, we've told people to split that work across machines, but hardware capable of doing all of it on a single machine has become cheap and practical. Fortunately, this can be treated as a system administration problem rather than solved in the software.

As part of the NGI project, Internet2 is deploying single machines to its PoPs to do measurement and carry out a few unrelated utility functions. The systems are equipped with enough cores and memory to get the work done, plus multiple 10 GbE interfaces (built into the machine) and two 100 GbE interfaces (on a card). Each machine hosts two perfSONAR nodes, one for internal use by Internet2 and one for use by outsiders. Each perfSONAR node runs in a Docker container with dedicated CPU cores, dedicated memory, and a 100 GbE interface directly attached using Docker's macvlan driver.
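For illustration, a minimal sketch of that kind of setup. The interface name, addresses, core range, memory size, and the perfsonar/testpoint image are all placeholders, not Internet2's actual configuration:

```bash
# Create a macvlan network bound to the 100 GbE dedicated to this node;
# the container gets its own MAC and address directly on the physical NIC.
docker network create -d macvlan \
  --subnet=198.51.100.0/24 --gateway=198.51.100.1 \
  -o parent=enp65s0f0 ps-external

# Run the perfSONAR node pinned to its own CPU cores and memory, with
# the macvlan-attached interface as its network.
docker run -d --name ps-external \
  --cpuset-cpus=8-15 --memory=16g \
  --network ps-external --ip=198.51.100.10 \
  perfsonar/testpoint
```

Because macvlan bypasses the Docker bridge, the container's traffic goes straight out the physical interface, which is what makes near-line-rate testing from inside a container plausible.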

Lab experimentation gave us near-full throughput both on bare metal and inside the container. We're seeing a bit less throughput in the field (still north of 90 Gb/s), which we suspect is attributable to a slight difference between the interface cards and drivers we had in the lab and those that wound up in the field. Once everything has settled into its final configuration and we have some production experience under our belts, there will probably be a talk on the subject.