perfsonar/project

The perfSONAR project's primary wiki and issue tracker.

Creating separate task scheduling thread per interface #1311

Open · guhl1956 opened this issue 3 years ago

guhl1956 commented 3 years ago

Feature Suggestion: A multi-homed server with multiple interfaces dedicated to throughput testing could benefit from a separate, independent scheduling thread per interface. The current single schedule can become a bottleneck, causing a number of tests to fail to start due to scheduling conflicts. A schedule per interface would alleviate those collisions and allow multiple throughput tests (the exclusive category) to run simultaneously and independently of one another.
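To make the request concrete, here is roughly the workload this would enable, sketched with pScheduler's CLI. The hostnames and addresses are made up, and this assumes the throughput test's `--source` option to pin each test to one interface's address:

```bash
# Two exclusive-class throughput tests, each bound to a different
# dedicated interface via its source address. Today a single schedule
# serializes them; per-interface schedules would let them overlap.
pscheduler task throughput --source 10.0.1.10 --dest remote-a.example.net &
pscheduler task throughput --source 10.0.2.10 --dest remote-b.example.net &
wait
```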

mfeit-internet2 commented 3 years ago

This is a problem we've been thinking about since 4.0 under the banner of "resource management." A general solution that doesn't cause parallel measurements to distort each other's results on any machine is a tough nut to crack. Essentially, we'd have to understand and model the behavior of the applications we run and the innards of the wide variety of systems running them.

Historically, we've told people to split that work across machines, but hardware capable of doing all of it on a single machine has become cheap and practical. Fortunately, this can be treated as a system administration problem rather than solved in the software.

As part of the NGI project, Internet2 is deploying single machines to its PoPs to do measurement and carry out a few unrelated utility functions. The systems are equipped with enough cores and memory to get the work done, plus multiple 10 GbE interfaces (built into the machine) and two 100 GbE interfaces (on a card). Each machine hosts two perfSONAR nodes, one for internal use by Internet2 and one for use by outsiders. Each perfSONAR node runs in a Docker container with dedicated CPU cores, dedicated memory, and a 100 GbE interface directly attached using Docker's macvlan driver.
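For illustration, a minimal sketch of that kind of setup. The interface name, addresses, core range, memory size, and the perfsonar/testpoint image are all placeholders, not Internet2's actual configuration:

```bash
# Create a macvlan network bound to the 100 GbE dedicated to this node;
# the container gets its own MAC and address directly on the physical NIC.
docker network create -d macvlan \
  --subnet=198.51.100.0/24 --gateway=198.51.100.1 \
  -o parent=enp65s0f0 ps-external

# Run the perfSONAR node pinned to its own CPU cores and memory, with
# the macvlan-attached interface as its network.
docker run -d --name ps-external \
  --cpuset-cpus=8-15 --memory=16g \
  --network ps-external --ip=198.51.100.10 \
  perfsonar/testpoint
```

Because macvlan bypasses the Docker bridge, the container's traffic goes straight out the physical interface, which is what makes near-line-rate testing from inside a container plausible.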

Lab experimentation gave us near-full throughput both on bare metal and inside the container. We're seeing a bit less throughput in the field (still north of 90 Gb/s), which we suspect is attributable to a slight difference between the interface cards and drivers we had in the lab and those that wound up in the field. Once everything has settled into its final configuration and we have some production experience under our belts, there will probably be a talk on the subject.