DataDog / dd-trace-go

Datadog Go Library including APM tracing, profiling, and security monitoring.
https://docs.datadoghq.com/tracing/
Other
674 stars 438 forks source link

proposal: ddtrace/opentracer: allow alternate behavior for `FollowsFrom` relationship #993

Open knusbaum opened 3 years ago

knusbaum commented 3 years ago

The current FollowsFrom behavior is to create a ChildOf link between the follower and followee. This is not always ideal. from @itaysabato:

Please see the original comment thread for more context. I am proposing one possible solution here.

Problem

Datadog does not support FollowsFrom relationships. Full support for FollowsFrom in the backend/UI is the best option, but we don't know how/when/if they will implement that. If they do, it may just be a tag like we've mentioned before, but it might be something else.

For now, though, we have implemented a FollowsFrom that isn't really a FollowsFrom and in certain cases may cause issues such as those described by @itaysabato.

We are sort of stuck, though. We don't have a complete solution, but we don't want to standardize another workaround that we will then have to support and eventually deprecate.

Solution

One option that looks attractive to me is providing a hook for FollowsFrom so that users can define their own behavior. Then we can provide a solution for any FollowsFrom issues without prescribing a particular solution. If we define a StartOption like this:

// WithFollowsFrom specifies the behaviour for "FollowsFrom" relationships used by the Opentracing
// tracer implementation. It is meant to be used in conjunction with (ddtrace/opentracer).New.
//
// When a "FollowsFrom" relationship is encountered the function is called with the parent as an argument
// and the returned options are added to the span. If unspecified, the default value will be:
//
//    func(parent ddtrace.Span) []tracer.StartSpanOption {
//        return []tracer.StartSpanOption{tracer.ChildOf(parent.Context())}
//    })
//
func WithFollowsFrom(f func(parent ddtrace.Span) []tracer.StartSpanOption) StartOption {
    // ...
}

The current ChildOf behavior could be defined like:

    tracer.Start(
        tracer.WithFollowsFrom(func(parent ddtrace.Span) []tracer.StartSpanOption {
            return []tracer.StartSpanOption{tracer.ChildOf(parent.Context())}
        }),
        )

And @itaysabato's desired behavior could be defined as:

    tracer.Start(
        tracer.WithFollowsFrom(func(parent ddtrace.Span) []tracer.StartSpanOption {
            sctx := parent.Context()
            return []tracer.StartSpanOption{
                tracer.Tag("follows_from", sctx.SpanID()),
                tracer.Tag("follows_from_trace", sctx.TraceID()),
            }
        }),
        )
gbbr commented 3 years ago

This looks good to me. I took the liberty to edit your comment and rename the function argument to parent, along with adding some documentation to the definition. I hope that's ok.

@itaysabato is this solution acceptable to you?

itaysabato commented 3 years ago

This is seems like a reasonable workaround, yes. I also found myself adding span_id and trace_id tags to all spans in order to enable "traversal" of all the linked traces.

gbbr commented 3 years ago

I also found myself adding span_id and trace_id tags to all spans in order to enable "traversal" of all the linked traces.

Wait, since you are already able to do this today (unless I misunderstood you), why do we need any changes? 😄

itaysabato commented 3 years ago

:) I am able to do this because I control all the spans that my app creates, but:

  1. This does not hold for spans created by 3rd parties or the tracer.
  2. I am only faking it, because there is no real FollowsFrom reference created - I would like there to be one, so that any future support will be enabled without rewiring everything again.
gbbr commented 3 years ago

Ok. Fine by me to implement custom behaviour for this type of relationship. So let's consider this proposal accepted.

@itaysabato is there any interest on your end to submit a PR or would you like us to schedule it internally? Any is fine.

gbbr commented 3 years ago

Coming back to this after exploring the code a little bit, it seems it's not as easy as we thought. We won't be able to use a StartOption such as the one proposed here (WithFollowsFrom) because it operates on top of an unexported config structure ((ddtrace/tracer).(*config)) which is not available within ddtrace/opentracer.

We need to explore another way, for example, we could have both WithFollowsFrom as a StartOption and FollowsFrom as a StartSpanOption which the ddtrace/opentracer package would use. Then, a span having FollowsFrom as an option / reference will use whatever is set by WithFollowsFrom. The only concern I have here is that there is a possibility to introduce a programming error such as an infinite loop if WithFollowsFrom ever return FollowsFrom within the slice. I wonder if there's a way to protect against that.

I hope my last paragraph makes sense.

knusbaum commented 3 years ago

@gbbr I don't follow the infinite loop concern. Maybe a code example would help.

It would be nice to avoid adding to the Datadog tracer API since we don't actually support FollowsFrom yet. I'm OK with adding it if we have to, but it's not ideal. Hopefully we can keep this all inside ddtrace/opentracer.

In order to keep this inside ddtrace/opentracer, we may need to add some configuration field(s) to the opentracer struct: https://github.com/DataDog/dd-trace-go/blob/a8c99b56c5bf2545da7e38f4ad97b1da5cfe73b7/ddtrace/opentracer/tracer.go#L42

Then we could configure the WithFollowsFrom when the opentracer starts (maybe through some opentracer-specific start option to opentracer.New)

opentracer.StartSpan could then call the configured FollowsFrom function and append the list of StartSpanOptions to the list it already creates: https://github.com/DataDog/dd-trace-go/blob/a8c99b56c5bf2545da7e38f4ad97b1da5cfe73b7/ddtrace/opentracer/tracer.go#L50

gbbr commented 3 years ago

It would be nice to avoid adding to the Datadog tracer API since we don't actually support FollowsFrom yet. I'm OK with adding it if we have to, but it's not ideal. Hopefully we can keep this all inside ddtrace/opentracer.

I won't bother giving the example, because I agree with this statement. Let's try hard to avoid this if possible, and resort to it only as a last option.

Then we could configure the WithFollowsFrom when the opentracer starts (maybe through some opentracer-specific start option to opentracer.New)

I don't know if that's possible without causing a breaking change. If StartOption would've been an interface, we could've had an opentracer flavour of it, but it's not the case.

Maybe we should simply add a feature flag for this, which causes FollowsFrom references to not be children (and be completely separate traces like it was before), also adding the tags suggested here. Perhaps that's the easiest, most less-intrusive solution? We already have support for DD_TRACE_FEATURES and tracer.WithFeatureFlags. We could call it opentracer_break_follows_from or something shorter 🙂

knusbaum commented 3 years ago

@gbbr I'm ok with that. That definitely does seem less invasive.

Question, though. What's the difference between something that should go in feature flags and something that should be a configuration option?

This seems like it straddles the line into tracer behavior configuration. I'm totally happy to have it as a feature flag, but I'm curious in general how we should decide between feature flag vs configuration, and what compatibility/support guarantees we make about things behind feature flags.

gbbr commented 3 years ago

I would consider it something we do not want to promise compatibility nor support for, because it is enhancing a functionality which Datadog does not technically support, and the approach taken here is not an official one either. It should be considered a beta feature that is subject to change or disappear at any time based on how the product develops and evolves (e.g. we decide to fully support FollowsFrom later, or the hacky approach meant to be used here in search for trace/span IDs will no longer be possible). I don't suspect any of that to happen any time soon, but that's the main view.

I hope that makes sense.

As for the implementation, multiple packages will now have to read the set feature flags, so we'll probably have to move that logic (HasFeature, etc) to globalconfig.