getsentry / team-sdks

A meta repository for tracking work across all SDK teams.
0 stars 0 forks source link

Tracing: W3C Trace Context Support #41

Open danielkhan opened 7 months ago

danielkhan commented 7 months ago

While we are moving to a world where we may process, propagate, and ingest pure OpenTelemetry (OTLP) data, we have to reconsider our trace context propagation schema and adopt W3C Trace Context fully.

Project Board

About W3C Trace Context

W3C Trace Context is a standard for trace context propagation. It defines three headers.

Sentry has a very similar trace propagation concept. It defines

We are not using a concept like trace-state.

Problem Description

The problem arises in mixed scenarios where one tier might be instrumented purely with OpenTelemetry.

Mixed Trace scenario

In this example, the web frontend sends sentry trace headers to the Node middleware where no Sentry SDK or propagator is present, which leads to context loss ( a broken trace). There is also a Python backend tier with a Sentry SDK that receives a trace-parent and needs to continue this trace.

Out Of Scope

While we are currently not using the baggage header to the spec, this problem is out-of-scope for this project.

Next steps

A quick TL;DR: How to Implement

Add support continuing a trace from the traceparent header. If both sentry-trace and traceparent headers are present, sentry-trace takes precedence. On outgoing HTTP requests, attach a traceparent header next to the sentry-trace header by default. Users can opt out by setting trace_propagation_tragets to []. The W3C traceparent header only has two sampling states; hence, deferred or not-to-sample decisions are both represented with -00. Add a new global getW3CTraceparent() function as well as Span::toW3CTraceparent().

Get inspired by the PHP SDK PR #1 & #2 and Laravel SDK PR.

Time constraints

We will soon be able to ingest OTel traces and this turns out to be rather critical. So we should have a plan by end of Q4 and implement this in Q1.

Stakeholder(s)

Team(s)

Web-Backend, Mobile, Web-Frontend

### SDK Tasks (General)
- [ ] Develop https://github.com/getsentry/develop/pull/1136
### SDK Tasks (Mobile)
- [ ] https://github.com/getsentry/sentry-cocoa/issues/3598
- [ ] https://github.com/getsentry/sentry-java/issues/3160
- [ ] https://github.com/getsentry/sentry-react-native/issues/3564
- [ ] https://github.com/getsentry/sentry-dart/issues/1831
### SDK Tasks (Web Backend)
- [x] https://github.com/getsentry/sentry-php/pull/1680
- [x] https://github.com/getsentry/sentry-laravel/pull/834
- [ ] https://github.com/getsentry/sentry-go/issues/796
- [ ] https://github.com/getsentry/sentry-java/issues/3160
- [ ] https://github.com/getsentry/sentry-dotnet/issues/3069
- [ ] https://github.com/getsentry/sentry-python/issues/2684
- [ ] https://github.com/getsentry/sentry-ruby/issues/2276
- [ ] https://github.com/getsentry/sentry-elixir/issues/702
### SDK Tasks (Web Frontend)
- [ ] https://github.com/getsentry/sentry-javascript/issues/11171
- [ ] https://github.com/getsentry/sentry-electron/issues/850
cleptric commented 7 months ago

We must find a migration path that lets us first emit sentry-trace and trace-parent in parallel. This can't be turned on by default because of CORS, though

This is only a concern in FE JS, all other platforms can in theory send both headers today.

We also need to ensure that we can pick up an incoming trace-parent header

The question will be what will take precedence for the moment, likely sentry-trace.

danielkhan commented 7 months ago

The question will be what will take precedence for the moment, likely sentry-trace

Yes, sentry-trace it is.

krystofwoldrich commented 5 months ago

React-Native will get this from JS.

ericsampson commented 4 months ago

Hi! I'mvery glad to see this being worked on.

We have a large mixed OTel-and-Sentry microservice system. Is there any docs/guidance on how we can get end-to-end distributed tracing working? (doesn't have to be polished, even just links to GH issues or pointers would be helpful)

I'm also wondering what direction we should take re evolving our current setup, in terms of matching the direction that Sentry is going... Should we be adding the Sentry SDK to the systems that are currently instrumented with Otel, or the reverse and moving to a "fully-OTel but with Sentry exporter" system.

The sense I'm getting is that the Sentry OTel story is still evolving, and so it may be too early for us to move to the latter direction without experiencing a ton of "bleeding-edge paper cuts" ? But at the same time I don't really feeling like doing two migrations (go all-Sentry now and then Sentry-on-OTel later) since there are so many micro services that need touching.

The trace header interior/propagation story is kinda messy in a legacy mixed system like ours. And for more fun, we also have legacy Honeycomb usage (with their proprietary headers) in the mix for some of the services, so there are 3 standards in play 🥴

Thanks!

danielkhan commented 4 months ago

Hi @ericsampson, the direct answer is switching to Sentry SDKs is more beneficial. We're planning to support OTLP ingest this year, but OTel's data often doesn't match Sentry's quality, particularly for errors.

I suggest using our SDKs as we're integrating OTel for its valuable instrumentation. For instance, our Node.js experimental package already utilizes OTel for auto-instrumentation while maintaining compatibility with the OTel tracing SDK and providing Sentry's error tracking.

Throughout this year, we'll introduce more SDKs that integrate OTel. If installing our SDK isn't an option, you'll still be able to send data via OTLP, which is part of the improvements we're discussing here.

ericsampson commented 4 months ago

thanks for the info @danielkhan !

mydea commented 2 months ago

Snippet to generate traceparent with the JavaScript SDKs, should work in both v7 & v8:

import * as Sentry from '@sentry/node';

function getTraceparentString() {
  const span = Sentry.getActiveSpan();
  if (!span) {
    return undefined;
  }
  return`00-${span.spanContext().traceId}-${span.spanContext().spanId}-0${span.spanContext().traceFlags}`;

(for context, traceFlags is either 0 or 1)

brustolin commented 2 months ago

Snippet to generate traceparent with the Cocoa SDKs.

func getTraceParent() -> String? {
        guard let span = SentrySDK.span else { return nil }
        return "00-\(span.traceId.sentryIdString)-\(span.spanId.sentrySpanIdString)-0\(span.sampled == .yes ? 1 : 0)"
    }
markushi commented 2 months ago

Java

public static String getTraceParent() {
  final ISpan span = Sentry.getSpan();
  if (span != null) {
    final SpanContext context = span.getSpanContext();
    return "00-" + context.getTraceId() + "-" + context.getSpanId() + "-0" +
      ((context.getSampled() != null && context.getSampled()) ? "1" : "0");
  }
  return null;
}

Kotlin

fun getTraceParent(): String? = Sentry.getSpan()?.spanContext?.let { context ->
    return "00-${context.traceId}-${context.spanId}-0${if (context.sampled == true) "1" else "0"}"
}