zhchai opened 1 year ago
Duplicate of #2184. I am working on the fix, tentatively planned to be resolved this week.
@zhchai ~~I see that you are using the default temporality for the OTLP exporter, which is Cumulative. With this temporality, even when there is no new measurement for a given instrument within a collection period, the previous metric value is still exported. With Delta temporality, if there is no new measurement within a collection period, no new metrics are generated for that instrument. Could you configure the temporality to Delta and try again?~~
EDIT - Please ignore my comment. I realized this is an Observable Gauge, which should only return the values observed within the collection interval. Will get this fixed.
@lalitb May I ask what the cause of this issue is, and when it will be fixed? Thanks.
@gaochuang The fix is in the temporal storage for the observable aggregation; I am targeting a fix this week, in time for the next release.
@lalitb Thanks.
@gaochuang @zhchai - Sorry for all the confusion here. I had to revisit our implementation and also check the behavior of other language implementations (JS, .NET). So, to summarize the findings:
For an Observable Gauge with Last Value Aggregation (irrespective of the configured temporality), the exporter will emit metrics for all unique attribute sets observed since the start, even if no new measurements are generated for a specific attribute set in subsequent reporting periods.
This is how it is implemented in C++, and also in JS and .NET, as I validated.
In the example shared here, if there are no new measurements for a given process, every collection will still include the metric points for it, carrying the last aggregated timestamp. Perhaps the backend can use this timestamp as the criterion for when the process was last active.
Not directly related to this issue, but I noticed that we don't have the required unit tests to validate Observable Gauge; I will add them as part of closing this issue.
@lalitb Thanks for the clarification. I'd like to ask: if I want to avoid this behavior, how can I do it? Also, do you have a guide or examples?
> if there are no new measurements for a given process, every collection will still include the metric points for it, carrying the last aggregated timestamp. Perhaps the backend can use this timestamp as the criterion for when the process was last active.
Actually, we find that the timestamps of these measurements from exited processes are always updated, which does not match "the last aggregated timestamp".
> if I want to avoid this behavior, how can I do it
Ideally, I wouldn't advise using process_name/pid as keys in the measurement attributes. Please read about the cardinality limit: the set of possible values an attribute set can take should be finite, and the SDK recommends a limit of 2000. In this case, for the attribute set {{"process_name", process.name}, {"pid", n}}, the possible values are unbounded. As processes keep being created and terminated over time in a given system, new attribute values keep being generated, and so do the aggregated data points for those attribute values. These aggregated data points are stored in SDK memory forever (in the case of last-value aggregation), which can lead to a cardinality explosion.
The recommendation would instead be to keep the attribute set bounded. Unfortunately, I don't see an ideal way to build an application that sends the memory usage of all currently running processes as measurements. I would recommend creating a GitHub issue in the otel-specs repo asking for a recommendation.
> Actually we find that the timestamps of these measurements of exited processes are always updated
I earlier tested with the console exporter, and it shows the last measurement timestamp; the same is specified in the spec. I will check whether the OTLP exporter behaves differently.
This issue was marked as stale due to lack of activity.
**Describe your environment**: opentelemetry-cpp 1.9.0 SDK
**Steps to reproduce**: Using one instrument to observe the memory of multiple processes; processes can appear and disappear dynamically. Measurements of disappeared processes are still transmitted to the OpenTelemetry Collector. Code snippet:
**What is the expected behavior?** If a process exits, its metrics should no longer be seen in the OpenTelemetry Collector.
**What is the actual behavior?** Measurements of disappeared processes are still transmitted to the OpenTelemetry Collector.
**Additional context**: See the code snippet.