Open lukego opened 8 years ago
Short answer: When we have an alternative that's simpler, sure.
Long answer: core.shm
is indeed a kind of DSL. I think its API is nice (map
, unmap
, unlink
), the “name syntax” is slightly foreign but useful. I found core.shm
to be a breeze to work with and using it very enjoyable. top
and gc
are short, simple programs, even though the need for gc
is a wart.
For example to me it is not immediately obvious how much risk there is of multiple Snabb Switch processes on the same machine accidentally over-sharing link objects when using the multiprocessing support from #601 [...]
If I understand correctly, its safe as long as only one process transmits on a given “link name”. Note: This should be well documented.
I wonder if we should consider moving to a new counter format
Does core.shm
restrict our choices in counter formats? I would defer deciding on core.shm
's fate until compared it with its potential replacements.
Why not make "snabb top" be able to populate an RRD file? It seems to me that if we did RRD/RRA writing directly from the data plane, so to speak, that is overhead and complexity that would be nice to avoid. Dunno. Exposing raw named counters seems to me like a nice facility that can be built on fairly flexibly :)
OK I am mentally working through your list of objections and I see what you mean. However I don't see how using a different counter format would fix these. Would you avoid exposing counters to external processes?
Don't shoot me...
Snabb Switch has a module called
core.shm
that makes it easy to allocate C data structures as named shared memory objects that can be accessed from other processes. This became the backend for thecore.counter
module that defines named counters that can be accessed externally e.g. viasnabb top
. The module has also been valuable for rapidly experimenting with different multiprocessing models while we investigated which one suited us best.Question: is the complexity of this module justified by the relatively few ways that we are actually using it?
Overall we want to have as little code and concepts in the core of Snabb Switch as possible. The
core.shm
module is a fairly elaborate concept with some surprisingly subtle issues:mmap()
to allocate objects and this gives them page-aligned addresses which causes their cache lines to collide (in practice we have had to take special care to avoid this in the counter module - see #566).core.shm
objects were the most complex ones -- in the end we were drawn to a "shared nothing" design.)Relatedly: I am also cooling to the representation of counters as files on disk that contain a 64-byte value. Having only one value for a counter seems limited and is usually not what I want. I am more interested to see how the value of a counter has changed over time rather than what its value is right now.
I wonder if we should consider moving to a new counter format (again... sorry...) that is more in the spirit of RRDtool that tracks the values of counters over time at different resolutions (per second, per minute, per hour, per day) and lets you immediately review the historical values of counters (perhaps spanning multiple instances of the process). This seems like it would be more valuable information for
snabb top
to present than only the current value (or the values since the command started running). In practice I find that I don't usesnabb top
but instead eyeball the "engine load reports" that snabbnfv prints every second.cc @alexandergall @javierguerragiraldez @eugeneia