Closed mladedav closed 2 months ago
My preference, in general, is not to add things related to external distributed tracing systems to the core tracing
data model. This feels like it complicates things unnecessarily when just logging in-process spans --- subscribers will now always need to handle the case that a parent span ID might not refer to a local span, even if distributed tracing is not being used.
And, I think it's preferable for a span with a remote parent to also be able to have a separate in-process parent. For instance, suppose we have a server accept loop with a span, and inside of that span, we create a child span for each request, potentially setting a remote parent ID on those requests based on headers. We would like that span to be able to be a child of the server's accept loop, as well as recording that it's a child of the remote span in the distributed tracing system.
As an alternative, what do you think of an approach where a remote parent span ID is recorded as a field when the span is created? This avoids the issue of attaching a remote parent to a span later than when it's constructed, but doesn't add a special case for remote span IDs in the core tracing
data model. I'd be open to adding new field types to better support this --- it might be good to support &[u8]
byte arrays as fields, so that distributed trace IDs can be recorded without formatting them as strings. I'd happily merge a PR adding byte slices as a primitive field type, which I think would benefit this use case and many others.
Sure, that also works. I tried to keep an invariant that each span has just one parent either local or remote, but not multiple ones.
By fields you mean something like info_span!("name", ?foo, %bar, remote.id = "...")
where the Subscribe
implementations would just look for the expected field names, right? I'd rather avoid magic fields if I can help it but yeah.
I guess the end goal would be to use valuable
for this. But until then I guess I'll look if I can make the byte slice work (although for the specific usecase of OpenTelemetry I guess &str
would be enough when the context is propagated through HTTP headers and when one already has the decoded IDs, using a pair of u128
and u64
would be better).
By fields you mean something like
info_span!("name", ?foo, %bar, remote.id = "...")
where theSubscribe
implementations would just look for the expected field names, right? I'd rather avoid magic fields if I can help it but yeah.
In general, we've tried to adhere to the philosophy that data specific to a particular external observability system, such as OpenTelemetry trace and span IDs, or data related to metric collection from tracing
spans and events, should be recorded as "magic fields", rather than represented directly in the core tracing
structures. The goal is that information that's only used by a particular subscriber implementation shouldn't be part of tracing-core
and should instead be communicated by a field or set of fields that the subscriber implementation documents.
I get that it feels a bit...stringly-typed, to use magic field names, but it has the advantage that it helps us keep the core libraries as general-purpose and minimal as possible, which I think is valuable.
Sure, I get it. Thanks for the explanation, I'll keep that in mind going forward.
If it ever bother me enough, I can always create a macro wrapping tracing
macros that forces the types to be checked at compile time and makes it impossible for the strings to be mistyped.
Motivation
Currrently, only locally tracked spans can be used as a parent span. However, in distributed systems, the parent is actually often a span originating in another service and this relationship is desirable to be tracked.
For example, a server handling a request may have received a header specifying an ID of a remote span of the client making this call and some information from this should be saved and be available for subscribers to use.
The main example of this would be OpenTelemetry, which propagates globally unique trace and span IDs by which different spans from different systems are grouped into a single trace.
Currently,
tracing-opentelemetry
exposes a method that allows a span to to add the information about the remote parent after the span is already created. This is suboptimal because the span may be entered before this information is provided and it can be even changed later on. It would be instead better to force the user to provide the information during span creation time and not afterwards.Solution
Parent ID was tracked as
Option<span::Id>
which was changed to a custom enum with three variants, one equivalent to the formerOption::None
for root spans, one equivalent toOption::Some(id)
for locally tracked spans which can be looked up in the sameCollect
that tracks the child span (if it implementsLookupSpan
) and a new variant for remote spans.This variant wraps a trait object that can be then downcasted by
Subscribe
implementations. This is becausetracing
should not force users to use any representation for their remote spans.One possibly major change this creates is that
Span::child_of
now takes an argument ofT: Into<ParentId>
, where it used to takeT: Into<Option<Id>>
. All implementations that were provided before were added, but this may break some users which may have implementedInto<Option<Id>>
for their own types.TODO
These things should be done before merging but can wait until I get a greenlight on the current implementation.
no_std
support -- Since the remote ID variant usesBox
, this only works withalloc
. I will add conditional compilation as needed. tests -- all current tests work but of course they don't test for remote contexts at all. docs -- new public API is documented but I will have to look through all current documentation for mentions ofOption<Id>
and possibly add some explanation in the root of the crate and other places.