Closed ktsaou closed 1 year ago
As I mentioned in the related feature request, we probably just need CAP_DAC_READ_SEARCH on the external plugin to achieve this.
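For anyone wanting to verify that a process actually holds that capability, here is a minimal sketch. It assumes the usual `CapEff:` line in `/proc/<pid>/status` and capability number 2 for CAP_DAC_READ_SEARCH (from `linux/capability.h`). The function names are hypothetical and not part of Netdata.

```python
# Assumption: capability number of CAP_DAC_READ_SEARCH in linux/capability.h.
CAP_DAC_READ_SEARCH = 2


def effective_caps(status_text):
    """Return the effective capability bitmask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask, one bit per capability.
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def has_cap(status_text, cap=CAP_DAC_READ_SEARCH):
    """True if the given capability bit is set in the effective set."""
    return bool(effective_caps(status_text) & (1 << cap))
```

To use it, pass in the contents of `/proc/self/status` (or the status file of the plugin's PID). Whether this capability alone is sufficient for debugfs on a given system is still something to confirm by testing.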
Hey @shyamvalsan and @sashwathn,
The PR bringing what was required in this issue is ready for review, but I consider this only the first step. Please take a look at other metrics that we could add from /sys/kernel/debug, because the groundwork is already in place.
Best regards!
@ktsaou @Ferroin a question about extfrag. We have a fragmentation index (the value) per Node and Zone:
# cat /sys/kernel/debug/extfrag/extfrag_index
Node 0, zone DMA -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000
Node 0, zone DMA32 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000
Node 0, zone Normal -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000
Node 1, zone Normal -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.00
The question is about appropriate aggregation: is it possible to aggregate these counters within different zones? If so we should:
- mem.fragmentation_index, labels node and zone.
- mem.fragmentation_index_zone_{zoneName}, labels node.

And my understanding is that in any case, only min and max aggregation methods provide more or less meaningful results.
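To make the node/zone labelling concrete, here is a rough sketch of parsing the `extfrag_index` format shown above into per-(node, zone) values, one per allocation order. The names `parse_extfrag_index` and `summarize` are hypothetical. Skipping the `-1` sentinel when taking min and max is an assumption, not something decided in this thread.

```python
import re

# Matches lines such as "Node 0, zone DMA32 -1.000 -1.000 ...".
LINE_RE = re.compile(r"^Node\s+(\d+),\s+zone\s+(\S+)\s+(.*)$")


def parse_extfrag_index(text):
    """Parse extfrag_index text into {(node, zone): [index per order, ...]}."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore prompts, blank lines, anything unexpected
        node, zone, rest = m.groups()
        result[(int(node), zone)] = [float(v) for v in rest.split()]
    return result


def summarize(values):
    """Return (min, max) over non-negative indexes, or None if all are -1.

    Assumption: -1 is treated as "no fragmentation problem" and skipped.
    """
    real = [v for v in values if v >= 0]
    if not real:
        return None
    return (min(real), max(real))
```

The parsed keys map directly to the `node` and `zone` labels, so either chart layout proposed above could be built from the same dictionary.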
From the Linux docs:
"The kernel will not compact memory in a zone if the fragmentation index is <= extfrag_threshold."
Should we collect extfrag_threshold (/proc/sys/vm/extfrag_threshold) too?
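If the threshold were collected, the comparison from the docs could look roughly like this. The sketch assumes `extfrag_threshold` is an integer in the 0..1000 range, and that the debugfs file prints the same index divided by 1000. That scaling should be checked against the kernel source before relying on it. The function names are hypothetical.

```python
def parse_threshold(text):
    """Parse the integer read from /proc/sys/vm/extfrag_threshold."""
    return int(text.strip())


def above_threshold(index, threshold):
    """True if a debugfs fragmentation index exceeds extfrag_threshold.

    Assumption: debugfs shows index / 1000, so it is scaled back here.
    A negative index (-1) is treated as never exceeding the threshold.
    """
    if index < 0:
        return False
    return round(index * 1000) > threshold
```

Per the quoted doc sentence, the kernel does not compact a zone when the index is at or below the threshold, which is why the comparison is strict.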
Aggregation by zone independent of NUMA node makes some sense. Aggregation across zones does not make much sense though because the reasons (and solutions) for fragmentation in a given zone type are highly dependent on the zone type. Averages (as opposed to min or max) across NUMA nodes on the same system may make sense here depending on the exact system setup (for a NUMA system that is set up to auto-balance across NUMA nodes, average fragmentation is actually kind of useful, but for one where an entire node is isolated it doesn’t make much sense).
This is also very much a local metric. Aggregation across Netdata nodes makes little to no sense for it in most cases.
Irrespective of aggregation, extfrag_threshold should probably be collected as a chart variable.
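As an illustration of aggregating by zone independent of NUMA node, here is a small sketch that groups samples by zone name and reports min, max and average. The input shape, a list of `(node, zone, value)` tuples, and the function name are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_zone(samples):
    """Group (node, zone, value) samples by zone and compute min/max/avg.

    Values are never combined across different zone types, only across
    NUMA nodes that have the same zone.
    """
    by_zone = defaultdict(list)
    for _node, zone, value in samples:
        by_zone[zone].append(value)
    return {
        zone: {"min": min(vals), "max": max(vals), "avg": mean(vals)}
        for zone, vals in by_zone.items()
    }
```

Whether the average is worth showing depends on the NUMA setup described above, so a collector might expose all three and let the dashboard pick.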
Problem
There are many useful metrics that are exposed in debugfs.
For example, /sys/kernel/debug/extfrag/extfrag_index provides information about memory fragmentation, or ls -l /sys/kernel/debug/zswap/ provides statistics about zswap.
The problem is that debugfs can only be accessed by root. So, we need an external plugin with enough capabilities / permissions to collect information from it.
Description
As above.
Importance
nice to have
Value proposition
There is useful information in debugfs that we could expose to users.
Proposed implementation
External C plugin, with capabilities and permissions similar to apps.plugin.