GoogleCloudPlatform / opentelemetry-operations-go

Apache License 2.0

Migrate collector exporter self-observability from OpenCensus to OpenTelemetry #797

Closed dashpole closed 1 month ago

dashpole commented 8 months ago

Using OpenTelemetry for self-observability metrics is now stable in the collector: https://github.com/open-telemetry/opentelemetry-collector/pull/9102. OpenCensus metrics are still supported on the Prometheus endpoint: https://github.com/open-telemetry/opentelemetry-collector/blob/7ade1016cf138965685e533f60ec83892d26abc6/service/internal/proctelemetry/config.go#L212. However, the OpenCensus bridge is not used when exporting self-observability metrics with OTLP, and OTLP export is something we might want to support in the future.

Our OpenCensus usage:

We define googlecloudmonitoring/point_count and googlecloudmonitoring/exemplar_attachments_dropped in exporter/collector/observability.go using OpenCensus. Migrating this to OpenTelemetry should be straightforward.

We also use ocgrpc in the collector exporter: https://github.com/GoogleCloudPlatform/opentelemetry-operations-go/blob/fda999eac0b4566cd4d32e1792318b7eb41d456d/exporter/collector/config.go#L263. Migrating this to OTel may be tricky, as we would need to migrate either to otelgrpc or to gRPC's upcoming self-observability metrics: https://github.com/grpc/proposal/blob/master/A66-otel-stats.md. Either option is likely to be a breaking change, so we will need to roll it out behind a feature gate.

dashpole commented 8 months ago

cc @braydonk @quentinmit

dashpole commented 1 month ago

The OC bridge has now been removed: https://github.com/open-telemetry/opentelemetry-collector/pull/10406

This means we have now lost all OpenCensus-based metrics; recent collector versions no longer emit them.

We don't need a feature gate, since the old metrics have already been removed (and there is no way for us to bring them back).

dashpole commented 1 month ago

Self-observability metrics have been missing since v0.104.0, but can be re-enabled by setting the service.disableOpenCensusBridge feature gate to false. Starting in v0.106.0, the feature gate was promoted to stable, and OC metrics can no longer be re-enabled.
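
The version timeline above can be sketched in Go as a small lookup. This is only an illustration, not code from the collector or this exporter: the type `ocMetricsState` and the helpers `minorOf` and `ocMetricsStateFor` are hypothetical names, and the sketch assumes plain `v0.x.y` version strings. The `reenableFlag` value uses the collector's `--feature-gates` flag, where a leading `-` turns a gate off.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ocMetricsState (hypothetical) describes whether the OpenCensus-based
// self-observability metrics can be emitted by a given collector release.
type ocMetricsState int

const (
	ocOnByDefault  ocMetricsState = iota // before v0.104.0
	ocGateRequired                       // v0.104.0 up to, but not including, v0.106.0
	ocRemoved                            // v0.106.0 and later
)

// reenableFlag sets the service.disableOpenCensusBridge gate to false.
const reenableFlag = "--feature-gates=-service.disableOpenCensusBridge"

// minorOf extracts N from a "v0.N.P" version string.
// Assumption: only 0.x releases are handled in this sketch.
func minorOf(version string) (int, error) {
	parts := strings.Split(strings.TrimPrefix(version, "v"), ".")
	if len(parts) != 3 || parts[0] != "0" {
		return 0, fmt.Errorf("unsupported collector version %q", version)
	}
	return strconv.Atoi(parts[1])
}

// ocMetricsStateFor maps a collector version onto the timeline from the thread.
func ocMetricsStateFor(version string) (ocMetricsState, error) {
	minor, err := minorOf(version)
	if err != nil {
		return 0, err
	}
	switch {
	case minor < 104:
		return ocOnByDefault, nil
	case minor < 106:
		return ocGateRequired, nil
	default:
		return ocRemoved, nil
	}
}

func main() {
	for _, v := range []string{"v0.103.0", "v0.104.0", "v0.105.0", "v0.106.0"} {
		state, err := ocMetricsStateFor(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		switch state {
		case ocOnByDefault:
			fmt.Println(v, "OC metrics are emitted by default")
		case ocGateRequired:
			fmt.Println(v, "OC metrics need", reenableFlag)
		case ocRemoved:
			fmt.Println(v, "OC metrics cannot be re-enabled")
		}
	}
}
```

A deployment check could call `ocMetricsStateFor` with the running collector version to decide whether dashboards built on the old OpenCensus metric names can still be populated.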