open-telemetry / opentelemetry-python

OpenTelemetry Python API and SDK
https://opentelemetry.io
Apache License 2.0
1.8k stars 626 forks source link

sdk/metrics: performance impact of exemplars support even when configured as `always_off` #4243

Open emdneto opened 1 week ago

emdneto commented 1 week ago

After https://github.com/open-telemetry/opentelemetry-python/commit/d5fb2c4189a561bd36186d19923373761d4b3e7a got merged we had a huge drop on iter/sec in metrics SDK benchmark for all metric types:

image

It seems expected when the Exemplars feature is enabled (the default behavior), but according to the spec, setting the exemplar filter to always_off shouldn't introduce any overhead. Running the benchmark locally, we still have the same overhead even when "disabling" (set to always_off) the Exemplars feature.

AlwaysOff An ExemplarFilter which makes no measurements eligible for being an Exemplar. Using this ExemplarFilter is as good as disabling Exemplar feature.

Benchmark Results Summary

Metric before exemplars https://github.com/open-telemetry/opentelemetry-python/commit/a8aacb0c6f2f06bf19b501d98e62f7c0e667fa4c trace_based (main) always_off (main)
Min (µs) 8.3550 13.1880 13.1440
Max (µs) 23.5780 36.9060 94.2870
Mean (µs) 8.9926 14.4266 16.6942
StdDev (µs) 1.0477 2.6945 10.0739
Median (µs) 8.7910 13.8030 13.9150
IQR (µs) 0.2520 0.5620 0.5935
OPS (Kops/s) 111.2024 69.3162 59.9010
Rounds 690 273 277
Iterations 1 1 1
fcollonval commented 1 week ago

Info from analysis by @emdneto and discussed between us on Slack:

I ran a profiler to see what's happening in the code and seems the overhead is in the get_current_span and get_current() function calls I ran the profiler against the main and in a commit before exemplars PR. I can share the results later, but I think the overhead is expected But maybe we can make exemplars feature opt-in and warns about performance overhead

I definitely can help making the exemplars feature opt-in without performance overhead. From the spec, https://opentelemetry.io/docs/specs/otel/metrics/sdk/, If Exemplar sampling is off, the SDK MUST NOT have overhead related to exemplar sampling. ExemplarFilter.AlwaysOff An ExemplarFilter which makes no measurements eligible for being an Exemplar. Using this ExemplarFilter is as good as disabling Exemplar feature. I would propose to ensure that if that filter is set up (possible through env var), no overhead should be added. This will avoid having to add a new flag. What do you think?

Looking a bit more at the changes done, there is gonna be an issue with the get_current calls to add the context to the measurement.

So I would propose to revert the API changes that add the context to Measurement to request the context only when it is needed - aka when the exemplar code is looking for trace and span.

emdneto commented 1 week ago

The problem also is if we have several different aggregations (for example, when generating metrics with dynamic attributes), it will create several reservoirs with several ExemplarsBuckets. On the other hand, each aggregation is created with the reference for theReservoirFactory. Since the benchmark test run with dynamic attributes, it explains the huge drop in metrics-sdk benchmark.

Image of a benchmark test run using EXEMPLAR_FILTER=always_off image

TLDR; The performance is not affected if we have static attributes on the measurement. However, if we have dynamic attributes (which implies creating new aggregations for each combination of attributes), we'll have the same degraded behavior as the benchmark test because of the instantiation of several ExemplarsReservoirs.

So, at least, I expect if the Exemplar's feature is disabled (always_off), there's no need to instantiate a FixedSizeReservoir every time a new aggregation is created. So maybe we can have a NoOpReservoir instead when the exemplar filter is always_off.

aabmass commented 6 days ago

@emdneto when you say dynamic attributes, you mean the benchmark is generating changing attributes for each iteration vs using the same attribute set each time?

So, at least, I expect if the Exemplar's feature is disabled (always_off), there's no need to instantiate a FixedSizeReservoir every time a new aggregation is created. So maybe we can have a NoOpReservoir instead when the exemplar filter is always_off.

This seems like a reasonable optimization. Not sure if it's better, but we could make the FixedSizeReservoir behave more lazily. That profile points to the constructor which eagerly generates it's reservoir https://github.com/open-telemetry/opentelemetry-python/blob/43804f372149b2b7bafd90c880ecaee909d5ec3f/opentelemetry-sdk/src/opentelemetry/sdk/metrics/_internal/exemplar/exemplar_reservoir.py#L158-L160

xrmx commented 5 days ago

I would also benchmark this again after #4251 because it fixed always_off to work properly.

emdneto commented 5 days ago

@emdneto when you say dynamic attributes, you mean the benchmark is generating changing attributes for each iteration vs using the same attribute set each time?

So, at least, I expect if the Exemplar's feature is disabled (always_off), there's no need to instantiate a FixedSizeReservoir every time a new aggregation is created. So maybe we can have a NoOpReservoir instead when the exemplar filter is always_off.

This seems like a reasonable optimization. Not sure if it's better, but we could make the FixedSizeReservoir behave more lazily. That profile points to the constructor which eagerly generates it's reservoir

https://github.com/open-telemetry/opentelemetry-python/blob/43804f372149b2b7bafd90c880ecaee909d5ec3f/opentelemetry-sdk/src/opentelemetry/sdk/metrics/_internal/exemplar/exemplar_reservoir.py#L158-L160

  1. Exactly! Benchmark tests use different attributes (which are fine and represent the reality of many systems with high cardinality metrics).
xrmx commented 5 days ago

With this I get a bit less than 2x loops in always_off and trace_based running .tox/benchmark-opentelemetry-sdk/bin/pytest --profile opentelemetry-sdk/benchmarks/metrics/ -k 'test_histogram_record_1000[7]':

diff --git a/opentelemetry-sdk/src/opentelemetry/sdk/metrics/_internal/exemplar/exemplar_reservoir.py b/opentelemetry-sdk/src/opentelemetry/sdk/metrics/_internal/exemplar/exemplar_reservoir.py
index c8fa7f14..12165821 100644
--- a/opentelemetry-sdk/src/opentelemetry/sdk/metrics/_internal/exemplar/exemplar_reservoir.py
+++ b/opentelemetry-sdk/src/opentelemetry/sdk/metrics/_internal/exemplar/exemplar_reservoir.py
@@ -155,9 +155,7 @@ class FixedSizeExemplarReservoirABC(ExemplarReservoir):
     def __init__(self, size: int, **kwargs) -> None:
         super().__init__(**kwargs)
         self._size: int = size
-        self._reservoir_storage: List[ExemplarBucket] = [
-            ExemplarBucket() for _ in range(self._size)
-        ]
+        self._reservoir_storage: Mapping[int, ExemplarBucket] = {}

     def collect(self, point_attributes: Attributes) -> List[Exemplar]:
         """Returns accumulated Exemplars and also resets the reservoir for the next
@@ -173,10 +171,7 @@ class FixedSizeExemplarReservoirABC(ExemplarReservoir):
         """
         exemplars = filter(
             lambda e: e is not None,
-            map(
-                lambda bucket: bucket.collect(point_attributes),
-                self._reservoir_storage,
-            ),
+            (bucket.collect(point_attributes) for _, bucket in sorted(self._reservoir_storage.items()))
         )
         self._reset()
         return [*exemplars]
@@ -201,6 +196,7 @@ class FixedSizeExemplarReservoirABC(ExemplarReservoir):
                 value, time_unix_nano, attributes, context
             )

+            self._reservoir_storage.setdefault(index, ExemplarBucket())
             self._reservoir_storage[index].offer(
                 value, time_unix_nano, attributes, context
             )
@@ -296,6 +292,7 @@ class AlignedHistogramBucketExemplarReservoir(FixedSizeExemplarReservoirABC):
         index = self._find_bucket_index(
             value, time_unix_nano, attributes, context
         )
+        self._reservoir_storage.setdefault(index, ExemplarBucket())
         self._reservoir_storage[index].offer(
             value, time_unix_nano, attributes, context
         )
xrmx commented 5 days ago

@fcollonval please take a look at https://github.com/open-telemetry/opentelemetry-python/pull/4260