rogercoll opened this issue 2 years ago
cc @dmitryax
/cc @dashpole I think you are using this as well
I don't think either of those proposals fixes the problem with the values not adding up. I'd like to add proposal 3: fixing the set of states to exactly cover the memory on the system. If there are states that double-count memory, they should be separate metrics. Here, as inspiration, are the non-overlapping states from Munin and how it calculates them from /proc/meminfo (a sketch follows below):
apps (MemTotal-MemFree-Buffers-Cached-Slab-PageTables-Percpu-SwapCached)
buffers (Buffers)
swap (SwapTotal-SwapFree)
cached (Cached-Shmem)
free (MemFree)
shmem (Shmem)
slab_cache (Slab)
swap_cache (SwapCached)
page_tables (PageTables)
per_cpu (Percpu)
vmalloc_used (VmallocUsed)
(I'm not sure if it makes sense to include swap; I'm guessing that with the current set of states we're not including swap? You could leave that out and I think the remaining states would represent total physical memory.)
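A minimal sketch of how those formulas could be applied, assuming Linux and reading /proc/meminfo directly; the keys are the standard meminfo field names, values are in kB, and the arithmetic simply mirrors Munin's `apps` formula quoted above:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// readMeminfo parses /proc/meminfo into a map of field name -> value in kB.
func readMeminfo() (map[string]uint64, error) {
	f, err := os.Open("/proc/meminfo")
	if err != nil {
		return nil, err
	}
	defer f.Close()

	m := make(map[string]uint64)
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		// Lines look like "MemTotal:       16314564 kB".
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		key := strings.TrimSuffix(fields[0], ":")
		if v, err := strconv.ParseUint(fields[1], 10, 64); err == nil {
			m[key] = v
		}
	}
	return m, sc.Err()
}

func main() {
	m, err := readMeminfo()
	if err != nil {
		panic(err)
	}
	// Munin's "apps": everything not attributable to the other states.
	apps := m["MemTotal"] - m["MemFree"] - m["Buffers"] - m["Cached"] -
		m["Slab"] - m["PageTables"] - m["Percpu"] - m["SwapCached"]
	fmt.Printf("apps=%d kB  cached=%d kB  shmem=%d kB  free=%d kB\n",
		apps, m["Cached"]-m["Shmem"], m["Shmem"], m["MemFree"])
}
```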
I think the main difference between this and proposal 2 is that here we're changing the definition of `used` (what Munin calls `apps`) instead of changing the definition of `free`.
@quentinmit I really like your proposal to solve the sum assumption; it provides much more information to the user and prevents overlapping states.
Tested without the swap states, as they are covered in the paging scraper, and the sum usage is correct. Nonetheless, I have a couple of concerns:
- The missing `percpu` state (I can do the PR).
- In my opinion, the available or non-available state is a very useful metric when defining alerts. The `used` (or `apps`) metric should not be understood as non-available memory; usually, the latter is quite a bit higher.
Proposal 3
Another possible solution would be to add a new memory metric only for this purpose, as making any change to the current states would break the sum assumption.
The memory system scraper would generate the following metrics:
- system.memory.usage: state attributes
- system.memory.utilization: disabled by default, state attributes
- system.memory.available: no state attributes
I would appreciate any opinion on this
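For what it's worth, the mdatagen-generated scrapers already let users toggle whole metrics from their collector config, which is how a disabled-by-default `system.memory.utilization` would be switched on; roughly:

```yaml
receivers:
  hostmetrics:
    collection_interval: 30s
    scrapers:
      memory:
        # Metrics that are disabled by default stay off unless
        # explicitly enabled here.
        metrics:
          system.memory.utilization:
            enabled: true
```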
Whether you use the `used` state or `total - available` can be truly significant in alerting systems. I made a small test to check the average variation between both values while stressing the memory: the average difference is around 7-8%. By providing both values we let the user choose whichever is best.
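For reference, a rough way to reproduce such a comparison with gopsutil v3 (the library the scraper already uses); exact numbers will of course vary by system and load:

```go
package main

import (
	"fmt"

	"github.com/shirou/gopsutil/v3/mem"
)

func main() {
	v, err := mem.VirtualMemory()
	if err != nil {
		panic(err)
	}
	// "Non-available" memory as an alerting signal vs. the used state.
	nonAvailable := v.Total - v.Available
	diff := 100 * float64(int64(nonAvailable)-int64(v.Used)) / float64(v.Total)
	fmt.Printf("used=%d  total-available=%d  difference=%.1f%% of total\n",
		v.Used, nonAvailable, diff)
}
```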
> Another possible solution would be to add a new memory metric only for this purpose
Is this the same as your proposed solution 1?
I like the idea of adding a new metric; it allows those who currently rely on `used` to continue using it, while providing a more accurate measure for those who want it.
@TylerHelmuth No, thanks for pointing that out. It would be a new proposal, let me update the comment.
Proposals 1 and 2 are based on the "state" attribute used in the usage and utilization metrics. Proposal 3 doesn't modify the state attribute or the current metrics; instead, it adds a new one.
Adding another metric `system.memory.available` may be a good solution to one of the problems described in this issue. But it's still unclear how to solve the total mismatch issue, and adding this metric also brings confusion to end users wondering what the difference is between `system.memory.available` and `system.memory.usage`.
Potentially, we can solve both problems by introducing a concept of optional metric attribute values to the metrics builder. The attribute values can be enabled or disabled in metadata.yaml (default) and user configs. If an attribute value is disabled, datapoints with this attribute value won't be emitted.
With this functionality added to the metrics builder, we will have the following sets of `state` attribute values in this receiver:
- Enabled by default: `buffered`, `cached`, `free`, and `used`. Sums up to a correct total value.
- Disabled by default: `slab_reclaimable`, `slab_unreclaimable`, and `available`. Optional values that the user can enable.

What do you think?
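To make the proposal concrete, a purely hypothetical `metadata.yaml` sketch; the `value_defaults` section is invented here to illustrate the idea and does not exist in `mdatagen`:

```yaml
attributes:
  state:
    description: Breakdown of memory usage by type.
    enum: [buffered, cached, free, used,
           slab_reclaimable, slab_unreclaimable, available]
    # Hypothetical new section: per-value defaults, overridable in user
    # configs. Values not listed here stay enabled by default.
    value_defaults:
      slab_reclaimable: { enabled: false }
      slab_unreclaimable: { enabled: false }
      available: { enabled: false }
```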
@bogdandrutu @djaglowski please share your opinion on the additional feature to the metrics builder
I think we should aim for a design that adheres to the specification's aggregation expectations (ie. data points sum to a meaningful total). Supporting metrics that violate these expectations seems like the wrong approach to me.
There seem to be two layers to the problem here: what the ideal set of states would be, and what to do about the existing metric.
I don't have a strong opinion on the first question, but in regards to the second, I think we basically have to decide whether to fix it, or to deprecate and replace it. The right approach depends on stability level. In my opinion, a per-metric stability level is necessary in order to have a clear approach (eg. alpha -> beta -> stable -> deprecated). If this metric is less than stable, then we should fix it. If it is "stable" but realized to be fundamentally flawed in its design, we should deprecate it and add one or more new metrics that amount to a more accurate model.
> I think we should aim for a design that adheres to the specification's aggregation expectations (ie. data points sum to a meaningful total). Supporting metrics that violate these expectations seems like the wrong approach to me.
That's true, we should always aim to have meaningful total values. In proposing optional attribute values, I imply that the sum of the attribute values enabled by default should represent the most reasonable total value. At the same time, I can think of use cases where the user wants to disable some attribute values, or add other optional ones, and build their own custom meaningful sums out of them. In this case, moving `slab_reclaimable` and `slab_unreclaimable` to a different metric to get a correct total value would significantly reduce the flexibility of using them, because they would no longer be part of the same memory usage metric.
I think it's okay to keep `SlabReclaimable` and `SlabUnreclaimable` in the regular summable metric; their sum is just `Slab`. So you can include either `slab`, or `slab_reclaimable` and `slab_unreclaimable`, but not both. The other states are the problematic ones.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
I think this issue is still valid and worth addressing (@dmitryax), as we're clearly missing the `system.memory.available` metric. And so I'd like to reopen the discussion, as it seems to me that adding `system.memory.available` is what all of us agree to do (especially if it's just a matter of using a value already exposed by `gopsutil`).
As for fixing the flawed sum, I like this approach:
> With this functionality added to the metrics builder, we will have the following sets of `state` attribute values in this receiver:
> - Enabled by default: `buffered`, `cached`, `free`, and `used`. Sums up to a correct total value.
> - Disabled by default: `slab_reclaimable`, `slab_unreclaimable`, and `available`. Optional values that the user can enable.
@fatsheep9146 maybe I could take this one and try implementing those changes?
@lootek It's yours!
@fatsheep9146 That's great, thanks! I'll followup soon :)
I have a draft PR prepared with the `available` memory state attribute added, yet for now this is the implementation of the simplest solution that just adds this state.
As for making some of the attributes optional: since I'm new to the codebase and don't yet have a full grasp of it, I'd love to hear your feedback and ask for some initial guidance.
I spent quite a long time staring at the `metadata.yaml` file and a few solutions came to mind, yet none of them I'm happy with... particularly because I was focused on a solution that would be fully backward-compatible, but all of those required some changes to `mdatagen`, at least adding another field beside `enum`.
The one I can see that would require changes only to the memoryscraper itself was inspired by the filters available in e.g. the `filesystemscraper` configuration. Eventually, I landed on the file containing the `AttributesMatcher` implementation; do you think it would be of any use here?
If the changes to `mdatagen` turn out to be inevitable, to my understanding the desired order of work would be to first do those in a separate PR and then issue another PR addressing the particular case of available memory using those changes?
// EDIT: I made a PR to the upstream repo out of this work as well:
It looks like some related work has been merged just a few hours ago:
Given that, I wonder if we still need special logic for grouping/choosing memory `state` attributes here. In case we do, I've prepared a PR (built on top of that previous one) that adds a config option for that; please take a look:
The discussion follows in https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/19149
Hi folks. I'm interested in having `available` memory. Below I summarize my understanding of what is done, which decisions are made and which are still pending approval, and what the next steps could be.

Mainly, there are discussions on two issues:

A. The sum of the `system.memory.usage` metric's `state` attribute values does not add up to the total memory of the system.
B. How to start exposing the Linux kernel's available memory to users?
My impression is that these issues can be solved independently. Hence, I'll describe them in detail separately.
Issue A: the sum over `state` is bad

The problematic states are `slab_reclaimable` and `slab_unreclaimable`: [1]. Because of them, the sum over the `state` attribute is higher than the total memory. The "slab" is a term internal to the Linux kernel: [2].
dmitryax@ proposed having the `slab_...` states disabled by default. lootek@ implemented this proposal in https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/19149.

The implementation sparked more discussion, e.g. https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/19149#discussion_r1134025928 by djaglowski@. He pointed out that if we write down all the mathematical relationships between the memory states, we'll be able to converge faster on a correct data model. The implementation was rejected (or abandoned; either way, it's not merged).
As a result, another proposal took precedence: introduce a new metric `system.memory.slab` and deprecate the slab states from `system.memory.usage`. rogercoll@ created an issue to add such a metric to the specification first: https://github.com/open-telemetry/semantic-conventions/issues/531. The issue has had no comments since March.
Issue B: exposing 'available' memory from the Linux kernel

rogercoll@ showed that `available` is really a different state: [3]. He also showed that `available` is unique to the Linux kernel: [4].

dmitryax@ proposed to add a separate metric, `system.linux.memory.available`. He asked other maintainers for agreement before starting the implementation. This is where the story ends so far.
Based on the above, I think the main step is to understand the correct data model (quoting djaglowski@'s https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/7417#issuecomment-1115126721), i.e. to:
- decide which states really belong in the `system.memory.usage` metric;
- stop inheriting `gopsutil`'s nomenclature: code;
- probably keep only the portable states `free` and `used`. Everything else (like `buffered`) is OS-specific, and can go to OS-specific metrics. However, I don't know if it's common practice in OTel semconv to split metrics by OS.

IMO, this will help us build a long-term solution, which will be stable against library changes and OS changes. However, such a solution might be hard to use: e.g. users would query `system.memory.usage` for general memory usage, and then query `system.linux.memory.some_linux_specific_metric` for details. I'm not sure I'm qualified enough to reason about this, because I haven't talked to any OTel customers. Tagging @dmitryax @braydonk from the system semantic conventions WG.
Alternatively, if we want to build a short-term solution, which still looks very pleasant, the next steps are:

For issue A:
- deprecate the `slab`-related states in `system.memory.usage` in favor of a new `system.memory.slab` metric

For issue B:
- add the `system.linux.memory.available` metric

As a result, both issues are solved.
[1] https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/7417#issue-1117282239
[2] https://github.com/open-telemetry/semantic-conventions/issues/531
[3] https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/7417#issuecomment-1109571183
[4] https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/19149#issuecomment-1467634451
I propose to walk the "short-term" path. I created https://github.com/open-telemetry/semantic-conventions/issues/257 to add `system.linux.memory.available`.
https://github.com/open-telemetry/semantic-conventions/issues/257 is done. Next step: I'll implement `system.linux.memory.available` in `hostmetricsreceiver`.
I implemented all steps for "Issue B" from https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/7417#issuecomment-1668082340.
So, answering the original question from this bug ("Can we get the MemAvailable field from the current states?"): now, yes. I propose to close this bug as done.
Feel free to open a separate thread for "Issue A" from https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/7417#issuecomment-1668082340. I see that https://github.com/open-telemetry/semantic-conventions/issues/531 is still open, and IMO it's not critical.
> Feel free to open a separate thread for "Issue A" from #7417 (comment). I see that open-telemetry/semantic-conventions#531 is still open, and IMO it's not critical.
The linked semconv issue doesn't address this bug: it doesn't fix the `free` state to represent something meaningful, and it doesn't propose fixing anything about the existing metric. I think we should keep using this issue for that, given that most of the comments are about it.
I would keep this issue open too; the existing memory metrics won't be aligned with the semantic conventions until the `slab` state is replaced.
SemConv progress update: https://github.com/open-telemetry/semantic-conventions/pull/1078
/unstale

The `system.linux.memory.slab` metric will be available in the next semconv release: https://github.com/open-telemetry/semantic-conventions/pull/1078
The current Linux memory states provided by the `hostmetricsreceiver` might not be the most appropriate ones for an end user. The available memory states for Linux are: `[buffered, cached, free, slab_reclaimable, slab_unreclaimable, used]`
All of those values are retrieved with gopsutil, which gets them from the `/proc/meminfo` file.

A user might think that the `free` value represents the amount of memory that the system can use, but free (MemFree) is only a count of how many pages are free in the buddy allocator; there are many ways pages can be returned to the buddy allocator in time of need. That actual figure is provided by the Linux kernel (>= 3.14) as `MemAvailable` in `/proc/meminfo`.

Can we get the MemAvailable field from the current states?
No, it is an independent field, not computed from the current values. gopsutil provides this value regardless of the kernel version, as it can be estimated from MemFree, Active(file), Inactive(file), and SReclaimable, as well as the "low" watermarks from /proc/zoneinfo.
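For intuition, a hedged sketch of that estimate, loosely following the shape of the kernel's own heuristic (mm/page_alloc.c); `watermarkLow` stands for the summed "low" watermarks from /proc/zoneinfo, all values are in kB, and this is an approximation rather than gopsutil's exact code:

```go
package main

import "fmt"

// estimateAvailable approximates MemAvailable on kernels that don't expose
// it: free memory above the low watermarks, plus the page cache and
// reclaimable slab, each discounted by a safety margin since not all of
// them can be reclaimed cheaply.
func estimateAvailable(memFree, activeFile, inactiveFile, sReclaimable, watermarkLow uint64) uint64 {
	if memFree < watermarkLow {
		return 0 // clamp instead of underflowing
	}
	available := memFree - watermarkLow

	pageCache := activeFile + inactiveFile
	pageCache -= min(pageCache/2, watermarkLow) // keep at least half as a margin
	available += pageCache

	available += sReclaimable - min(sReclaimable/2, watermarkLow)
	return available
}

func min(a, b uint64) uint64 {
	if a < b {
		return a
	}
	return b
}

func main() {
	// Purely illustrative numbers (kB).
	fmt.Println(estimateAvailable(500_000, 300_000, 200_000, 100_000, 50_000))
}
```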
Repercussion
The user will be able to know the actual available memory of the system.
As for the sum assumption: it is already broken, and quite difficult to achieve, as some of the memory state values are inclusive, for example the cached state. The assumption would be defined as:
total = buffered + cached + free + used + slab_reclaimable + slab_unreclaimable
But the `used` value is computed by gopsutil as: total - free - buffered - cached. Then:
total = buffered + cached + free + (total - free - buffered - cached) + slab_reclaimable + slab_unreclaimable
total = total + slab_reclaimable + slab_unreclaimable
The sum(state) will always be higher than the total.
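A quick way to see this on a live system, assuming gopsutil v3's field names (`Sreclaimable`/`Sunreclaim` map to the slab states); per the algebra above, the overcount should match the slab states:

```go
package main

import (
	"fmt"

	"github.com/shirou/gopsutil/v3/mem"
)

func main() {
	v, err := mem.VirtualMemory()
	if err != nil {
		panic(err)
	}
	// Sum the six Linux states exposed by the scraper.
	sum := v.Buffers + v.Cached + v.Free + v.Used + v.Sreclaimable + v.Sunreclaim
	fmt.Printf("total=%d  sum(states)=%d  overcount=%d  slab states=%d\n",
		v.Total, sum, sum-v.Total, v.Sreclaimable+v.Sunreclaim)
}
```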
(The following proposals do not solve the sum assumption; they only add more memory information for the user.)
Proposal 1
Add a new memory state for Linux to provide the available memory. State name: `available`
Proposal 2
Change the `used` state to: `total - available`

This proposal requires adding the memory.limit metric, as we won't be able to compute the total memory from the states. Currently, we can get the total amount of memory with the following PromQL query:

scalar(sum without (state) (system_memory_usage{state =~ "free|used|cached|buffered"}))