Open xrmx opened 1 day ago
Looking at the console export, the only `None` values I have are from the exemplars, e.g.:
```json
"exemplars": [
  {
    "filtered_attributes": {},
    "value": 24,
    "time_unix_nano": 1730893479734229622,
    "span_id": null,
    "trace_id": null
  }
]
```
I guess this would be the fastest workaround; the proper fix would be to export the Exemplar without span_id and trace_id when they are not available:
```diff
diff --git a/opentelemetry-sdk/src/opentelemetry/sdk/metrics/_internal/exemplar/exemplar_reservoir.py b/opentelemetry-sdk/src/opentelemetry/sdk/metrics/_internal/exemplar/exemplar_reservoir.py
index 1dcbfe47..0e3e4016 100644
--- a/opentelemetry-sdk/src/opentelemetry/sdk/metrics/_internal/exemplar/exemplar_reservoir.py
+++ b/opentelemetry-sdk/src/opentelemetry/sdk/metrics/_internal/exemplar/exemplar_reservoir.py
@@ -104,8 +104,10 @@ class ExemplarBucket:
             span_context = span.get_span_context()
             self.__span_id = span_context.span_id
             self.__trace_id = span_context.trace_id
-
-        self.__offered = True
+            self.__offered = True
+        else:
+            # we cannot serialize invalid spans so stop offering them
+            self.__offered = False
 
     def collect(self, point_attributes: Attributes) -> Optional[Exemplar]:
         """May return an Exemplar and resets the bucket for the next sampling period."""
```
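A minimal, self-contained sketch of what that patch does (names simplified; `SpanContext.is_valid` here is a stand-in for the SDK's real validity check, not the actual SDK class): the bucket only marks itself as "offered" when the span context is valid, so exemplars with null IDs are never produced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpanContext:
    span_id: int
    trace_id: int

    @property
    def is_valid(self) -> bool:
        # An OTel span context is invalid when either ID is zero.
        return self.span_id != 0 and self.trace_id != 0


@dataclass
class Exemplar:
    value: float
    time_unix_nano: int
    span_id: Optional[int] = None
    trace_id: Optional[int] = None


class ExemplarBucket:
    def __init__(self) -> None:
        self.__value = 0
        self.__time_unix_nano = 0
        self.__span_id: Optional[int] = None
        self.__trace_id: Optional[int] = None
        self.__offered = False

    def offer(
        self, value, time_unix_nano, span_context: Optional[SpanContext]
    ) -> None:
        self.__value = value
        self.__time_unix_nano = time_unix_nano
        if span_context is not None and span_context.is_valid:
            self.__span_id = span_context.span_id
            self.__trace_id = span_context.trace_id
            self.__offered = True
        else:
            # We cannot serialize invalid span IDs, so stop offering them.
            self.__offered = False

    def collect(self) -> Optional[Exemplar]:
        """Return an Exemplar (or None) and reset for the next period."""
        result = (
            Exemplar(
                self.__value,
                self.__time_unix_nano,
                self.__span_id,
                self.__trace_id,
            )
            if self.__offered
            else None
        )
        self.__init__()  # reset the bucket for the next sampling period
        return result
```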
@fcollonval What do you think?
> I guess this would be the fastest workaround, the proper one would be to export Exemplar without these span_id and trace_id if they are not available instead:
Is this happening because the tracing SDK hasn't been set up? I would expect to drop the exemplar in this case, but I need to dig into the spec.
@xrmx yes the trace_id and span_id should not be exported if unset. This is actually fixed in https://github.com/open-telemetry/opentelemetry-python/pull/4178/files#diff-69723411a82d22ec56b486c8c744df22b6a4f245d68f057e17263d04d17b3dbe. Unfortunately that PR was not reviewed in time for the latest release.
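The "do not export unset IDs" approach boils down to skipping null fields on the encoding side. A hypothetical sketch (not the actual encoder code from that PR):

```python
import json
from typing import Any, Dict


def encode_exemplar(exemplar: Dict[str, Any]) -> str:
    """Serialize an exemplar, dropping span_id/trace_id when unset."""
    return json.dumps(
        # Keep only fields that actually carry a value.
        {k: v for k, v in exemplar.items() if v is not None},
        sort_keys=True,
    )
```

For example, `encode_exemplar({"value": 24, "span_id": None, "trace_id": None})` yields `'{"value": 24}'` instead of emitting null IDs.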
@fcollonval thanks. Next time, if you are aware of bugs, please share them with us; it's not obvious that they are fixed in PRs whose titles start with "Add" rather than "Fix" :sweat_smile:
Sure - I'll do that.
@fcollonval shouldn't the SDK be dropping the exemplar before it even gets to the exporter? https://opentelemetry.io/docs/specs/otel/metrics/sdk/#tracebased
> @fcollonval shouldn't the SDK be dropping the exemplar before it even gets to the exporter? https://opentelemetry.io/docs/specs/otel/metrics/sdk/#tracebased
Indeed @aabmass, you are right: the default exemplar filter should not offer a measurement as an exemplar if the current span is not sampled.
So indeed there seems to be another bug here. I'm looking at it.
Ok, I can reproduce the issue with the example in docs/examples/metrics/instruments/example.py.
When I print each measurement together with its should-sample flag, I get:
```
False Measurement(value=1, time_unix_nano=1730904901413115349, instrument=<opentelemetry.sdk.metrics._internal.instrument._Counter object at 0x7f74a8705f70>, context={}, attributes=None)
False Measurement(value=1, time_unix_nano=1730904901413315139, instrument=<opentelemetry.sdk.metrics._internal.instrument._UpDownCounter object at 0x7f74a8848590>, context={}, attributes=None)
False Measurement(value=-5, time_unix_nano=1730904901413404019, instrument=<opentelemetry.sdk.metrics._internal.instrument._UpDownCounter object at 0x7f74a8848590>, context={}, attributes=None)
False Measurement(value=99.9, time_unix_nano=1730904901413468105, instrument=<opentelemetry.sdk.metrics._internal.instrument._Histogram object at 0x7f74a8883e60>, context={}, attributes=None)
True Measurement(value=1, time_unix_nano=1730904901413655541, instrument=<opentelemetry.sdk.metrics._internal.instrument._ObservableCounter object at 0x7f74a8c77440>, context={}, attributes={})
True Measurement(value=-10, time_unix_nano=1730904901413794618, instrument=<opentelemetry.sdk.metrics._internal.instrument._ObservableUpDownCounter object at 0x7f74a8d086e0>, context={}, attributes={})
True Measurement(value=9, time_unix_nano=1730904901413888022, instrument=<opentelemetry.sdk.metrics._internal.instrument._ObservableGauge object at 0x7f74a88485c0>, context={}, attributes={})
```
So there is definitely another bug for observable instruments.
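The trace-based check from the spec amounts to something like the sketch below. For observable (asynchronous) instruments the callback runs outside any span, so `context` is empty and the filter should return `False`, yet the printouts above show `True`. Here `"current-span"` is an illustrative context key and `Span` a toy stand-in, not the SDK's real names:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Span:
    sampled: bool


def trace_based_should_sample(
    context: Dict[str, Span], span_key: str = "current-span"
) -> bool:
    # Per the TraceBased filter, only sampled spans may carry exemplars;
    # no span in the context means no exemplar at all.
    span: Optional[Span] = context.get(span_key)
    return span is not None and span.sampled
```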
@fcollonval If you haven't already added them you can cherry-pick otlp metrics tests updated with exemplars from here https://github.com/xrmx/opentelemetry-python/commit/7a36010c14712dbf283d4f41e20e6d725a7ff96a
Describe your environment
- OS: (e.g., Ubuntu)
- Python version: Python 3.10
- Package version: 1.28.0
What happened?
It looks like there is a regression around metrics exporting.
Steps to Reproduce
Please note that this happens when sending data to a collector; serializing to the console works fine:
Expected Result
Metrics exported
Actual Result
Additional context
No response
Would you like to implement a fix?
Yes but I need to dig