Closed pavolloffay closed 10 months ago
This would also provide the needed dogfooding for OTel SDK.
@yurishkuro completely agree, we are waiting for 2 things:
correlation-context
.Views
- allow us to configure which labels from the correlation-context
to use.If we don't have these two things we need to manually propagate/retrieve these values every time we do a report. Which may be possible if we do have couple of helper functions. So probably the main missing feature is to support changing the default aggregations for some instruments
like for distributions we don't want just minmaxsumcount we want a histogram.
Why do you need context labels in the collector? Sounds a bit like nice to have, not a showstopper.
I think I also tried to explain myself that we have the solution to use helper functions, but let me describe the design a bit:
And as I explained is not a showstopper, the current blocker is the ability to configure the aggregation for the distribution to be histogram vs minmaxsumcount, and probably a nice to have is the ability to select the labels for the aggregator (again not a showstopper).
@pavolloffay to answer your question, we do want to use opentelemetry metrics. probably tracing as well :)
thanks @bogdandrutu. I can open a PR to port the metrics.
Is there any issue in opentelemetry-go
for:
So probably the main missing feature is to support changing the default aggregations for some instruments like for distributions we don't want just minmaxsumcount we want a histogram.
I think the missing thing is this otep https://github.com/open-telemetry/oteps/pull/89
@bogdandrutu I think I remember some recent work that you did on this and changed telemetry to use Otel API instead of OpenCensus. Is that correct? If so I can close this.
Collector is still using OpenCensus in its self-observability pipeline: https://github.com/open-telemetry/opentelemetry-collector/blob/c8aac9e3b294281a50ee183084d9566f0e8a9d29/service/telemetry.go#L25
There is https://github.com/open-telemetry/opentelemetry-collector/pull/1833 switching traces to Otel API (not merged yet), but that still doesn't include metrics.
The scope to resolve this issue is vague. Are we planing to only remove the OC metric reference in public API for unblocking Tracing GA? Because it's hard to replace all OC metrics in all those metrics components in core and contrib repos before tracing GA timeframe.
Need help to elaborate the exit criteria/scope on this issue? And we can contribute for the changes :)
@bogdandrutu since you mentioned that this issue is not a showstopper can it be moved to a post-GA phase 3 backlog?
@bogdandrutu: We will evaluate all the core components for any public API dependence on OC metrics references and respond on issue.
Discussed it in SIG with @tigrannajaryan and @bogdandrutu , this issue is currently blocked by open-telemetry/oteps#89.
This item has a dependency on the Go SDK supporting metrics in a stable state.
cc: @Aneurysm9 @MrAlias
This item has a dependency on the Go SDK supporting metrics in a stable state.
cc: @Aneurysm9 @MrAlias
Understood. For the full dependency map it is important to know we are block on supporting metrics in a stable state by OTel itself having a stable metric specification.
@alolita I assume that this doesn't block the 1.0 release of the Collector, as it's metrics-related?
@djaglowski this is the issue for migrating from OC to Otel. A small progress was made recently on this via https://github.com/open-telemetry/opentelemetry-collector/pull/3816 but the bigger change will be removing the dependency on OC metrics in obsreport. That work is not started.
As far as I know we do not yet have a plan for the custom metrics should be reported (e.g. metric names) either. It will probably happen as part of https://github.com/open-telemetry/opentelemetry-collector/issues/2430
This issue is still block by the otel-go library being ready.
@bogdandrutu https://github.com/open-telemetry/opentelemetry-go/issues/2204 is about OpenCensus bridge, why is it a concern? I think the path would be to rewrite instrumentation to natively talk to OTEL-Go API, not to continue going through OC API.
@bogdandrutu open-telemetry/opentelemetry-go#2204 is about OpenCensus bridge, why is it a concern? I think the path would be to rewrite instrumentation to natively talk to OTEL-Go API, not to continue going through OC API.
@yurishkuro
Since @dashpole provided a mature propose that the metrics migration would use OpenCensus bridge for OpenTelemetry, like trace. https://github.com/open-telemetry/opentelemetry-collector/pull/3816#pullrequestreview-733106917
As an alternative, I would propose that we use both opencensus and opentelemetry client libraries at the same time while we migrate. We can use the OpenCensus bridge for OpenTelemetry to make both opencensus and opentelemetry clients serve metrics on the same prometheus endpoint. We could roll this out with the following steps:
- Switch from the OpenCensus prometheus exporter to a "wrapped" OpenTelemetry prometheus exporter. Verify that self obs metrics on the prometheus endpoint remain the same.
- Add the OpenTelemetry API/SDK and use the same prometheus exporter that OC uses.
- Migrate each metric from OC to OTel individually, and verify that the metric doesn't change.
But this propose is blocked by issue #2204, the prometheus exporter does not satisfy the requirement for OpenCensus metric bridge for OpenTelemetry, which is described in https://pkg.go.dev/go.opentelemetry.io/otel/bridge/opencensus@v0.21.0#readme-metrics
We discussed this a couple of SIG calls ago, and @jaronoff97 was set to check what's our current state by working on a couple of components first and reporting back.
@jpkrohling Our team is currently looking in to this. @paivagustavo has been working on what a migration may look like and has already confirmed that we can provide the same prometheus metrics (and histogram buckets) with the latest Go SDK release. I'll let Gustavo provide some more details.
I've been working on testing the latest released otel go sdk and verifying that it would met the requirements here. As part of this I've instrumented the existing receiver metrics from obsreport.Receiver
(#6222) and the loadbalancingexporter
from contrib, more specifically testing that we could use different histogram buckets for specific metrics. Using the views api, we are able to accomplish that.
About the bridge, as it is in review at the moment (https://github.com/open-telemetry/opentelemetry-go/pull/3207), I would suggest us to start to double instrumenting the collector with otel go in the meantime. If the bridge is merged and released before we finish migrating the current metrics, we can enable it and start removing oc metrics that have already being instrumented with otel-go.
Currently, the collector uses OpenCensus API for internal observability https://github.com/open-telemetry/opentelemetry-collector/blob/0e1b2e323d398cf6cda500953c7896aa01c173b7/observability/observability.go#L25.
These metrics should be probably migrated to OpenTelemetry API.
Jaeger storage components use abstract metrics API and so we need to export these metrics in Jaeger OTEL collector e.g.
https://github.com/jaegertracing/jaeger/blob/bf870d9e2bffaf4758045919446be75ac9fc9783/cmd/opentelemetry-collector/app/exporter/cassandra/exporter.go#L31.
Are there plans to migrate to OpenTelemetry metrics? In Jaeger we want to avoid first creating OC adapter and then OTEL once this repo migrates. I can submit a PR to port the metrics to OTEL API.