Closed MrAlias closed 9 months ago
I would be interested to hear how Java, .NET, and other stable implementation behave here. Unfortunately, my ability to write examples there are non-existent.
The Python implementation sounds buggy to me.
For additive (e.g. Counter, UpDownCounter, Asynchronous Counter, Asynchronous UpDownCounter) values, removing a spatial dimension should result in sum.
@reyang can you comment on how .NET handles this? Does that project add them together?
https://github.com/open-telemetry/opentelemetry-specification/issues/1874 Old issue discussing this.
I can confirm that .NET does not handle this correctly, and has the same bug as python. (I'll create an issue to track)
1874 Old issue discussing this.
I can confirm that .NET does not handle this correctly, and has the same bug as python. (I'll create an issue to track)
https://github.com/open-telemetry/opentelemetry-dotnet/blob/73f8d3cb0160f633661854f41866dcdb70b81069/test/OpenTelemetry.Tests/Metrics/MetricAPITest.cs#L800-L803 UnitTest showing the same bug in .NET. (I looked at it at that time and it was not trivial to fix, so just added test to come back to this later.)
1874 Old issue discussing this.
I can confirm that .NET does not handle this correctly, and has the same bug as python. (I'll create an issue to track)
Gotcha, thanks for confirming :pray:
@jack-berg can you confirm the behavior of async counters in Java in the presence of an "attribute reducing" view? I'm wondering if there is any implementation that sums these values.
This is a good question. The Lightstep metrics SDK and the v0.31 and earlier OTel-Go metrics SDK did sum these instruments. As a case where this matters, potentially, is the Golang runtime/metrics package, which outputs cumulative Counter and UpDownCounter values in a form that is natural for use with asynchronous instruments.
For example, runtime/metrics outputs three metrics, one is a total of the other two:
/gc/cycles/automatic:gc-cycles
Count of completed GC cycles generated by the Go runtime.
/gc/cycles/forced:gc-cycles
Count of completed GC cycles forced by the application.
/gc/cycles/total:gc-cycles
Count of all completed GC cycles.
In the instrumentation package I wrote (forked from the contrib repo), https://github.com/lightstep/otel-launcher-go/tree/main/lightstep/instrumentation/runtime, the pattern is used to discard the total, since the SDK or a downstream consumer can easily recompute the total using attribute removal and the natural merge function for the data point. So in the example above, a cumulative GC count for class=forced
and class=automatic
will be generated. If you remove class
, you get the total back. There are three groups of metrics in the Go-1.20 runtime/metrics package that follow this pattern. For Go-1.20, we get hierarchical CPU timing measurements, so you could filter in more than one meaningful way.
A good way to think about this problem: whether the dimension is removed from the SDK (e.g. by configuring View), or it is removed by the Collector (by reaggregation), or it is removed at the backend/comsumption (via PromQL), the results should be the same.
For additive (e.g. Counter, UpDownCounter, Asynchronous Counter, Asynchronous UpDownCounter) values, removing a spatial dimension should result in sum.
@reyang should the Asynchronous Counter
, Asynchronous UpDownCounter
be added to the list of instrument at the end of that section that are treated as additive?
Also, should the API specification be updated to normatively require these instrument types be treated as additive?
otel-cpp implementation does the sum for the sync counters matching the "attribute reducing" view, but selects the last value for the async counter. Need to be fixed, will create a issue to track this.
For additive (e.g. Counter, UpDownCounter, Asynchronous Counter, Asynchronous UpDownCounter) values, removing a spatial dimension should result in sum.
@reyang should the
Asynchronous Counter
,Asynchronous UpDownCounter
be added to the list of instrument at the end of that section that are treated as additive?
Which list?
Also, should the API specification be updated to normatively require these instrument types be treated as additive?
I think no. Additive property applies to Sums, and is covered by the SDK specification https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#default-aggregation. It is perfectly fine for the SDK user to configure an UpDownCounter as something different (e.g. gauge) and not apply additive property.
For additive (e.g. Counter, UpDownCounter, Asynchronous Counter, Asynchronous UpDownCounter) values, removing a spatial dimension should result in sum.
@reyang should the
Asynchronous Counter
,Asynchronous UpDownCounter
be added to the list of instrument at the end of that section that are treated as additive?Which list?
The list included in the sentence at the end of the linked section:
In OpenTelemetry, each Instrument implies whether it is additive or not. For example, Counter and UpDownCounter are additive while Asynchronous Gauge is non-additive.
Also, should the API specification be updated to normatively require these instrument types be treated as additive?
I think no. Additive property applies to Sums, and is covered by the SDK specification https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#default-aggregation.
Where is it covered? I don't see anywhere in that linked section defining sums as additive.
It also seems like the term additive is applied to instruments, not aggregations. This would make sense as that is where attributes are also defined. i.e.
each Instrument implies whether it is additive or not
The list included in the sentence at the end of the linked section:
In OpenTelemetry, each Instrument implies whether it is additive or not. For example, Counter and UpDownCounter are additive while Asynchronous Gauge is non-additive.
I think yes it can, not necessarily as it is only providing examples (without promising that it will be a complete list).
Where is it covered? I don't see anywhere in that linked section defining sums as additive.
It also seems like the term additive is applied to instruments, not aggregations. This would make sense as that is where attributes are also defined.
Yeah, I can see the challenge here.
The data model specification covered "additive" in the temporality section, and temporality is referenced by the sums section. I wrote the supplementary guideline after discussing with @jmacd, hoping it will provide some help for folks who struggle to understand the data model and API specification. Maybe we should improve the data model spec to bring more clarity by making it more streamlined.
Just to get a sanity check: if we don't currently have normative requirements for this in the specification and stable releases of implementations don't do this, does OTel really require this behavior? Is it just considered undefined currently?
Can we add it in a backward-compatible way? Will having implementations change their behavior be backwards-compatible?
Where is it covered? I don't see anywhere in that linked section defining sums as additive. It also seems like the term additive is applied to instruments, not aggregations. This would make sense as that is where attributes are also defined.
Yeah, I can see the challenge here.
The data model specification covered "additive" in the temporality section, and temporality is referenced by the sums section. I wrote the supplementary guideline after discussing with @jmacd, hoping it will provide some help for folks who struggle to understand the data model and API specification. Maybe we should improve the data model spec to bring more clarity by making it more streamlined.
Gotcha. Any help clarifying this in the specification would be appreciated :pray:
Just to get a sanity check: if we don't currently have normative requirements for this in the specification and stable releases of implementations don't do this, does OTel really require this behavior? Is it just considered undefined currently?
Can we add it in a backward-compatible way? Will having implementations change their behavior be backwards-compatible?
I think the spec does require it, although it is not expressed in a normative way. Here is my understanding of the complex mental network:
Given all these steps, 90% chance folks will run into a trap unless they asked someone who has been deeply engaged in all the three specs...
Java doesn't do this correctly either. Opened open-telemetry/opentelemetry-java#4901 to track.
I want to try and clarify the workings of the async instruments and their default aggregation.
In the case that was raised here, there is a filter and two observes will record to the same filtered counter, we expect the result is a sum of both observations.
What about other similar cases:
Would it be a good idea to put other a set of edge cases, and the expected output?
the workings of the async instruments and their default aggregation.
In the case that was raised here, there is a filter and two observes will record to the same filtered counter, we expect the result is a sum of both observations.
What about other similar cases:
- If there is no filter, and two observers happen in separate callbacks?
- If there is no filter and two observers happen in the same callback?
- If there is a filter, and the same attribute sets are used? E.g. case 1, but there is a filter.
- What happens if there is a filter and the view says this should be a gauge?
@MadVikingGod could you give some concrete examples? I found that the cases you've described can be interpreted differently. Ideally something like this would help to provide clarity https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/supplementary-guidelines.md#asynchronous-example.
If folks feel this area needs more examples to facilitate the understanding, I can add a section under https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/supplementary-guidelines.md#additive-property to explain the spatial aggregation.
That is the gist of what I was asking about, just more tightly focused on edge cases we have historically found to be non-obvious. Like the behavior in this issue.
If we were to ask "What should be the result in these 4 (+ the original) cases?" I think we would see different interpretations. While I think the supplementary guidelines are useful, they are verbose enough that they can only cover a small set of behaviors. I was thinking more along the lines of a document that easily be translated into tests to help with the validation of these interactions of features. This issue might look like this: https://gist.github.com/MadVikingGod/e3e65d6217d321d5de8ded159f4abc7e
That is the gist of what I was asking about, just more tightly focused on edge cases we have historically found to be non-obvious. Like the behavior in this issue.
If we were to ask "What should be the result in these 4 (+ the original) cases?" I think we would see different interpretations. While I think the supplementary guidelines are useful, they are verbose enough that they can only cover a small set of behaviors. I was thinking more along the lines of a document that easily be translated into tests to help with the validation of these interactions of features. This issue might look like this: https://gist.github.com/MadVikingGod/e3e65d6217d321d5de8ded159f4abc7e
The gist reminded of me https://github.com/w3c/trace-context/tree/main/test, which is a big commitment to introduce and maintain. I feel it is probably a separate topic on its own.
Yes, I've wanted something like the tracecontext interop tests for tracing and metrics so bad. I even started on it at one point... Probably need a SIG for it :)
Currently the specification states:
For asynchronous counter instruments (counter and up-down-counter), how should their sums be reported when this "filter" is defined?
For example, if there is an asynchronous counter that observes
1
for the attribute setversion=1
and then, in the same callback,2
for the attribute setversion=2
and a list of attribute keys equaling{"foo"}
is used in a view, both attribute sets (version=1
andversion=2
) should be ignored according to the specification. Therefore, they both become observations of the empty attribute set. How should their sums be combined? Should the SDK report2
, the last recorded value for the empty attribute set? Or, should it report3
the combination?I had assumed the latter would be expected, but when looking at the python implementation they use the former. This is based on the limit_num_of_attrs.py example:
(caveat, my python understanding is limited so I could have missed something here)
When running this:
The value reported is
2
.cc @jmacd @bogdandrutu @reyang @open-telemetry/python-approvers