getsentry / sentry-dotnet

Sentry SDK for .NET
https://docs.sentry.io/platforms/dotnet
MIT License
566 stars 203 forks source link

Allow OpenTelemetry child spans for Sentry Transactions #3288

Closed jamescrosswell closed 2 weeks ago

jamescrosswell commented 1 month ago

Resolves https://github.com/getsentry/sentry-dotnet/issues/3159

Caveats

It's currently not possible to determine the best parent for OTel spans when mixing OTel and Sentry instrumentation.

As an example, someone might instrument their code as follows:

using var main = activitySource.StartActivity("Main")
{
  var wrapper = transaction.StartChild("Wrapper");

  using (var child = activitySource.StartActivity("Child"))
  {
    ...
  }

  wrapper.Finish();
}

Looking at how the code is structured, perhaps what the SDK user wants is for the child.Parent to be wrapper. However consider the code snippet below:

using var main = activitySource.StartActivity("Main")
{
  var wrapper = transaction.StartChild("Wrapper");

  using (var child = activitySource.StartActivity("Child", ActivityKind.Internal, main!.Id))
  {
    ...
  }

  wrapper.Finish();
}

In this second code snippet child.ParentId has been set explicity to main.Id so it's clear that the SDK user wants main to be the parent.

Both code snippets above result in the same inputs to the SentrySpanProcessor, which makes it impossible for us to know what the parent really should be.

A similar problem arises if child is created explicitly as a new Root span.

The only way to resolve this would be to leverage OTel instrumentation under the hood in the Sentry SDK itself... but that would imply an additional dependency on System.Diagnostics.DiagnosticSource for anyone targeting net5.0 and earlier (including .NET Framework), as we discovered in https://github.com/getsentry/sentry-dotnet/pull/3238.

smeubank commented 1 month ago

Allow OpenTelemetry child spans for Sentry Transactions

exciting stuff! I am just curious from the title.Does this mean that sentry transactions are always the parent span? Is there ever a case where we construct a "Sentry Transaction" based upon entirely OTel spans with OTel parent span?

jamescrosswell commented 4 weeks ago

Does this mean that sentry transactions are always the parent span?

Hey @smeubank, Sentry transactions won't always be the parent span no.

If you've using Sentry's OpenTelemetry integration in ASP.NET Core, for example, an OpenTelemetry span gets created for each http request before the middleware pipeline gets kicked off - so the "transaction" (i.e. span with no parent) in that case is going to be an OTel span. You could use the Sentry.DiagnosticSource integration to then capturing child spans for SqlClient or Entity Framework operations. This is something that's been possible for quite some time.

The opposite (an OTel span inside a Sentry transaction) wasn't possible prior to this PR however.

Is there ever a case where we construct a "Sentry Transaction" based upon entirely OTel spans with OTel parent span?

If you're using the Sentry.OpenTelemetry package then you could perform all of your trace instrumentation using ActivitySource, yes. That has been possible ever since the Sentry.OpenTelemetry integration was released.

Is that something you've been wanting to do? Is that just to avoid vendor lock in or some other reason?

jamescrosswell commented 3 weeks ago

And the wrapping span ends up as a proper child of the Main activity but should also be parent of Task 2.

Hm, that's kind of the normal mechanics of OTel tracing. When you create an activity, the parent of that activity is inferred fromActivity.Current (from the System.Diagnostics.DiagnosticSource namespace).

I'm wondering how we could infer whether the parent should be the current sentry span or the current activity in that scenario, without getting into trouble. We could use the start date on SentrySdk.GetSpan vs Activity.Current to try to infer which one should be the parent... but is that really what should happen? I can imagine a scenario like this for example:

Request:
  -> Start Transaction for request (e.g. OTel)
     -> Request Handler
        -> Call some async functionality instrumented using OTEL
        -> Start Span manually (using Sentry instrumentation)
           -> Outbound integration has some further instrumentation (e.g. Database stuff)
        -> Do some work in the request handler (this is what is supposed to be instrumented by the manual span)
        -> Outbound request completes
        -> Finish manual span

In that scenario, I think the "further instrumentation" (the database stuff) would end up as a child of the manual sentry span (just because of the order of async execution)... when actually it should be a child of the otel instrumentation.

Or would it? Are we maybe starting to get into the whole hubs and scopes discussion that the JS guys ended up in?

I'll play around with some things here locally and see how this behaves with the async local scope manager.

jamescrosswell commented 3 weeks ago

@bitsandfoxes after a bit of ruminating, it doesn't look like there's any way to solve for the scenario you described without potentially breaking the trace hierarchy in other senarios... see Caveats that I just added to the PR description.

I don't think we'll get seemless integration with OTel without a major version bump then (and a new dependency for SDK users targeting net5.0 and earlier).