Open yurishkuro opened 1 week ago
To me, this looks like correct usage of the OTel Metrics SDK.
To summarize my findings in https://github.com/open-telemetry/opentelemetry-go/pull/5544:
We should take a close look at the exemplar reservoir performance when exemplars are disabled. It currently makes up a substantial (~50%) portion of the overhead for the no-attributes case.
This appears to be because of the time.Now() call for each measurement. We should at least consider moving the time.Now call into the exemplar reservoir so that it is only invoked when we are actually recording an exemplar.
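The idea of deferring the clock read can be illustrated with a minimal sketch. It is not the SDK's code: the `reservoir`, `exemplar`, and `clock` types and the `offerEager`/`offerLazy` methods below are hypothetical names invented for this illustration, and the counting clock only exists to make the number of `time.Now()` calls observable.

```go
package main

import (
	"fmt"
	"time"
)

// clock wraps time.Now and counts how often it is read.
// Hypothetical helper, used only to make the cost visible.
type clock struct{ calls int }

func (c *clock) now() time.Time {
	c.calls++
	return time.Now()
}

// exemplar is a simplified, hypothetical exemplar record.
type exemplar struct {
	value int64
	ts    time.Time
}

// reservoir is a hypothetical stand-in for an exemplar reservoir.
type reservoir struct {
	enabled bool
	clk     *clock
	stored  []exemplar
}

// offerEager models the pattern described above: the caller has already
// read the clock, even if the reservoir then drops the measurement.
func (r *reservoir) offerEager(v int64, ts time.Time) {
	if !r.enabled {
		return
	}
	r.stored = append(r.stored, exemplar{value: v, ts: ts})
}

// offerLazy models the proposed change: the clock is read inside the
// reservoir, and only when an exemplar is actually recorded.
func (r *reservoir) offerLazy(v int64) {
	if !r.enabled {
		return
	}
	r.stored = append(r.stored, exemplar{value: v, ts: r.clk.now()})
}

// countClockReads records n measurements and returns how many times
// the clock was read along the way.
func countClockReads(lazy, enabled bool, n int) int {
	clk := &clock{}
	r := &reservoir{enabled: enabled, clk: clk}
	for i := 0; i < n; i++ {
		if lazy {
			r.offerLazy(int64(i))
		} else {
			r.offerEager(int64(i), clk.now())
		}
	}
	return clk.calls
}

func main() {
	fmt.Println("eager, exemplars disabled:", countClockReads(false, false, 1000))
	fmt.Println("lazy, exemplars disabled: ", countClockReads(true, false, 1000))
}
```

With exemplars disabled, the eager variant reads the clock once per measurement while the lazy variant never reads it. The sketch only shows where the call goes away; the actual savings depend on the real SDK code paths.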
I also found that the benchmark did not change when I swapped out the OTel Prometheus exporter for a manual reader (which is expected), so I'm removing the prometheus exporter label.
https://github.com/open-telemetry/opentelemetry-go/pull/5545 is a ~45% performance improvement for the zero-attributes case, and a ~20% performance improvement for the single-attribute case.
Description
Jaeger is in the process of migrating away from the Prometheus SDK towards the OTEL SDK. We're currently blocked by a massive performance degradation, as illustrated by this benchmark: https://github.com/jaegertracing/jaeger/pull/5676. Are we not using the OTEL SDK correctly? We're seeing a 10-25x slowdown compared to the Prometheus SDK.
Environment
Steps To Reproduce
https://github.com/jaegertracing/jaeger/pull/5676
Expected behavior
I expected counter increments to perform in the same ballpark as Prometheus counters.