sjonpaulbrown closed 2 years ago
The initial work for this has been started here.
Some notes on how this should be handled:
We are currently hacking spans by overriding trace_ids to match entity IDs (blockID, transactionID, etc.). This is mostly done to avoid passing trace_ids over the libp2p and gossip network: the trace_id is known by nature of the entity.
We also disabled the controllable sampling from trace aggregators and instead sample based on the ID of the entity and a sensitivity flag set at the per-node level. For example, we currently collect traces for every block whose blockID starts with a zero byte. This prevents nodes' performance from being controlled by trace aggregation agents and other external factors, and keeps sampling balanced according to the needs of the protocol.
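The two notes above can be sketched in a few lines of Go. This is only an illustration, not the flow-go implementation: the `EntityID` and `TraceID` types, the function names, and the reading of the sensitivity flag as "number of leading zero bytes required" are all assumptions made for the example. It uses only the standard library; in the OpenTelemetry Go SDK the equivalent hooks would be a custom ID generator and a custom sampler.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// TraceID mirrors the 16-byte trace identifier size used by OpenTelemetry.
type TraceID [16]byte

// EntityID is a hypothetical 32-byte identifier (block ID, transaction ID, ...).
type EntityID [32]byte

// TraceIDFromEntity derives a trace ID from the first 16 bytes of the entity ID,
// so every node computes the same trace ID without sending it over the network.
func TraceIDFromEntity(id EntityID) TraceID {
	var tid TraceID
	copy(tid[:], id[:len(tid)])
	return tid
}

// ShouldSample reports whether an entity is traced. sensitivity is assumed to be
// the number of leading zero bytes required: 0 samples everything, 1 samples
// only IDs whose first byte is zero.
func ShouldSample(id EntityID, sensitivity int) bool {
	if sensitivity > len(id) {
		sensitivity = len(id)
	}
	for i := 0; i < sensitivity; i++ {
		if id[i] != 0 {
			return false
		}
	}
	return true
}

func main() {
	blockID := EntityID{0x00, 0xab, 0xcd} // remaining bytes are zero
	tid := TraceIDFromEntity(blockID)
	fmt.Println(hex.EncodeToString(tid[:]), ShouldSample(blockID, 1))
}
```

Because both the trace ID and the sampling decision are pure functions of the entity ID, every node reaches the same answer independently, which is what makes it unnecessary to propagate trace context over the network.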
Seems like OTEL supports overriding both:
This way we can likely make it backwards compatible with the current tracer. Let me try migrating existing code to it.
OpenTelemetry support has landed.
To control the new library, feel free to switch from the old JAEGER_* env vars to the new OTEL_EXPORTER_OTLP_TRACES_* ones; for example, localnet uses the following: https://github.com/onflow/flow-go/pull/2823/commits/a97f4167339c436f2d5cb50cf17cc032a829eff1.
Full list of env vars: https://github.com/open-telemetry/opentelemetry-specification/blob/v1.12.0/specification/protocol/exporter.md
(NB! We use the gRPC exporter by default, so you need a gRPC-enabled collector on the other side.)
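As a rough illustration of how the new env vars are resolved: in the OTLP exporter spec, a signal-specific variable such as OTEL_EXPORTER_OTLP_TRACES_ENDPOINT takes precedence over the general OTEL_EXPORTER_OTLP_ENDPOINT. The sketch below mimics that lookup with the standard library only. The function name and the `localhost:4317` fallback are assumptions for the example; the real exporter performs this resolution itself, so this code is not needed in practice.

```go
package main

import (
	"fmt"
	"os"
)

// defaultEndpoint is an assumption for this sketch: 4317 is the conventional
// OTLP/gRPC collector port.
const defaultEndpoint = "localhost:4317"

// ResolveTracesEndpoint returns the first non-empty value among the
// traces-specific variable, the general OTLP variable, and the default.
// getenv is injected so the lookup can be exercised without touching the
// process environment.
func ResolveTracesEndpoint(getenv func(string) string) string {
	keys := []string{
		"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT", // signal-specific, wins if set
		"OTEL_EXPORTER_OTLP_ENDPOINT",        // general fallback
	}
	for _, key := range keys {
		if v := getenv(key); v != "" {
			return v
		}
	}
	return defaultEndpoint
}

func main() {
	fmt.Println(ResolveTracesEndpoint(os.Getenv))
}
```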
Re-assigning back to @sjonpaulbrown for verification. (cc: @haroldsphinx)
Problem Definition
We would like to add support for tracing with OpenTelemetry. Currently, we support tracing with OpenTracing, but we would like to support OpenTelemetry so that we can integrate with our standard model for distributed tracing.
Proposed Solution
To add support for OpenTelemetry, we can add a new struct that implements the Tracer interface, and we need to add support for enabling tracing with OpenTelemetry.
Definition of Done