kube-rs / controller-rs

A kubernetes reference controller with actix-web
Apache License 2.0

trace and span IDs are invalid (zero) #44

Closed aharbis closed 1 year ago

aharbis commented 1 year ago

The TraceId returned by telemetry::get_trace_id is invalid (zero). The same is true of the SpanId. I suspect it has something to do with the integration between tracing and opentelemetry (in src/telemetry.rs):

use opentelemetry::trace::TraceId;

/// Fetch an opentelemetry::trace::TraceId as hex through the full tracing stack
pub fn get_trace_id() -> TraceId {
    use opentelemetry::trace::TraceContextExt as _; // opentelemetry::Context -> opentelemetry::trace::Span
    use tracing_opentelemetry::OpenTelemetrySpanExt as _; // tracing::Span to opentelemetry::Context

    tracing::Span::current()
        .context()
        .span()
        .span_context()
        .trace_id()
}

Running locally against my Kubernetes cluster, with the lorem example doc created, the trace_id value is always zero:

2023-03-06T03:00:22.530885Z INFO reconciling object{object.ref=Document.v1.kube.rs/lorem.default object.reason=object updated}:reconcile{trace_id=00000000000000000000000000000000}: controller::controller: Reconciling Document "lorem" in default

Adding a getter and a Field for span_id shows a similar result:

2023-03-06T04:12:32.778508Z INFO reconciling object{object.ref=Document.v1.kube.rs/lorem.default object.reason=object updated}:reconcile{trace_id=00000000000000000000000000000000 span_id=0000000000000000}: controller::controller: Reconciling Document "lorem" in default

clux commented 1 year ago

Ugh, that has happened before. I think it is a symptom of having version mismatches (e.g. multiple versions) in certain parts of the otel ecosystem.

I'll have a look in a bit. Thanks for raising.

clux commented 1 year ago

This issue is actually invalid. You will get a non-zero TraceId from the controller, provided you run it with the telemetry features:

just forward-tempo & # see below
just run-telemetry

This does require a valid endpoint to point the exporter at (via the OPENTELEMETRY_ENDPOINT_URL environment variable), because we cannot initialize the tracer without one (and tonic will try to connect to it). Without a tracer, or when get_trace_id is called from outside an instrumented context, the trace IDs are always zero.
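Since an all-zero trace ID only means "no tracer, or no instrumented span", a caller may want to treat it as absent rather than log it. The following is a minimal, std-only sketch of such a guard. The helper name valid_trace_id is hypothetical and not part of controller-rs; it assumes the ID has already been formatted as the 32-character hex string seen in the logs above.

```rust
/// Hypothetical helper (an assumption, not part of controller-rs):
/// returns the trace id only if it looks usable, i.e. it is exactly
/// 32 hex characters and is not the all-zero (invalid) id.
pub fn valid_trace_id(hex: &str) -> Option<&str> {
    // A trace id is 16 bytes, so its hex form is 32 characters long.
    let is_hex = hex.len() == 32 && hex.bytes().all(|b| b.is_ascii_hexdigit());
    // The all-zero id is what you get without a tracer or outside a span.
    let non_zero = hex.bytes().any(|b| b != b'0');
    (is_hex && non_zero).then_some(hex)
}
```

With a guard like this, a reconciler could skip the trace_id field (or log a warning once) when the helper returns None, instead of emitting a row of zeros that looks like a real ID.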