Closed bogdandrutu closed 6 years ago
Other open questions:
Baggage-Size: <count of items>
Baggage0: <key>=<value>
Baggage<n-1>: <key>=<value>
\n
is not legal in the header value, =
above is a reserved char, ,
is a reserved char in HTTP spec, so some encoding is certainly requiredbtw, I am obviously biased towards "baggage" because of OpenTracing, but I think "tags" is way too generic and ambiguous term, while "baggage" is unusual enough to not be confused with anything else.
I do not understand why separation of concerns is not applied in OpenTracing, and I wonder why this pollution is proposed to infect the idea of a trace context spec. What is really needed for vendor interoperability of tracing is a clear definition of how to understand an active trace and how to continue it, preserving the trace. The actual semantics of what this trace is has been subject to the tracing system used.
In my opinion the whole idea of baggage transport is seriously flawed, and I seriously had hoped that this spec will be the remedy, reducing it to the minimal need of understanding a trace and being able to continue it.
If you want a tldr of why I dislike baggage or tags, then here is the focal point: Baggage or Tags lack semantics. They consume bandwidth without explaining anybody what they mean. Because they are "optional" in most specs, this means they are unusable for interoperability. And due to their optional nature they are unusable for control flow. The way it is used now is a vendor specific quirk of tracers, which decide to carry some meaningless key value pairs along, while sending other vendor specific key value pairs to the trace system.
@CodingFabian
I do not understand why separation of concerns is not applied in OpenTracing, and I wonder why this pollution is proposed to infect the idea of a trace context spec.
Instrumentation for tracing consists of 90% of general purpose distributed context propagation, and 10% of the actual tracing. There are initiatives to separate DCP into its own layer (e.g. https://github.com/tracingplane). We discussed it at the inception of OpenTracing (which actually started as "Distributed Context Propagation" project), but decided that we want the API to be easy for tracing first, and simply expose the DCP (baggage) API which would have to be implemented anyway, since tracing cannot exist without DCP.
I think the same reasoning applies here. It's certainly possible to limit the scope of this spec to only the tracing piece, except that there have been already many questions "what about this" which can be easily solved with general-purpose baggage spec.
It is not clear from your comment if you think the notion of general purpose DCP itself flawed, or any specific implementations of it. If it's the former, then I can say this:
Baggage or Tags lack semantics.
Yes, that is the main feature of baggage. Actually, they do have semantics for components that care about specific items. And components that don't care don't need to know the semantics, they only need to pass the data along to the next layer. If every baggage item had semantics that every one of 5000 microservices understood, we'd need to change all 5000 microservices if we want to add one more baggage item, even if only 1% of them actually care about interpreting it.
Added another open question to my https://github.com/TraceContext/tracecontext-spec/issues/13#issuecomment-329980665 above regarding name spacing of baggage keys.
just like RFCs (we we aren't even at the stage of) change a bunch of times, we can expect whatever we do here to change. I would like to follow-up with @CodingFabian to figure out what we can do around this.
For experimentation sake, the Correlation-Context
seems on track to help as long as whatever we hope to achieve with this ends up in some sort of tests (ex maybe some python or other easy-access language tests here in this repo).
bikeshedding the name a bit, I think Correlation-Context
is associated with the trace-context
header, maybe trace-context-meta
isn't a bad choice as it also prefix-matches (assuming it is sent iff trace-context
is). meta is bad, just throwing ideas
Truly just throwing this out there in the spirit of sharing ideas (translation: I'm not sure I like what I'm about to propose, either)...
Another way to look at this is that there's one set of trace context fields that are for tracer-internal purposes, and potentially another set of trace context fields for tracing-user purposes. The "internal" fields are basically what's been discussed elsewhere in this repo: trace-ids, span-ids, and sampling bits. The "user" fields would be things like census-style accounting entities, security tokens, and so on. The latter maps almost 1:1 with the opentracing "baggage" concept, of course, though this is a different way of framing it.
Separating the "user" fields from the "internal" fields seems safer from an interference and isolation standpoint. I wonder (and this is the most questionable part here) if we could use precisely the same format or meta-format for the "user" and "internal" fields – could be an elegant unification. (Though I suspect it won't be compact enough for the tracer-internal state)
@bhs While I like the idea of avoiding potential naming collisions between trace internals and user space, if this gets popular and becomes more standardized, I also see a potential for a third grouping of collisions from frameworks/libraries. That may be something to consider when trying to decide to split or not.
@tylerbenson that's a great point. I still like the idea of segmenting state that's specific to tracer internals... wouldn't want it to get lost or truncated because of overzealous user behavior.
@bogdandrutu Since we have Correlation Context
, and @SergeyKanzhelev suggest for ttl
property. For me, baggages are somthing ttl=infinite. Do we still this separated head?
Maybe you will say, baggages are somthing of application data, Correlation Context
are something about tracer propagation. That is the only possible difference for me, now.
hi everybody,
I'd like to add our point of view at Dynatrace to this discussion, if I may. :-) Generally saying, our tracing concepts are pretty similar, but we also have some "internal tracing data" that is only useful for our tools.
So I absolutely vote for something like the baggage concept to propagate tool-internal information - which mostly consists of various IDs for us. I also kinda like the name bagagge as suggested by @yurishkuro and keeping the "Trace-Context" prefix as @adriancole suggested, would lead to something like "Trace-Context-Baggage" which would sound ok to me.
As the majority of this thread seems to do, I also suggest to separate the headers between "general tracing data" and "internal tracing data" (= baggage). I also suggest to only have one header for all the baggage -> should make it easier to get it through firewalls/ESBs and such if there's only 2 specific headers. In any case, if there are multiple tools/vendors involved in one trace, they must not interfere with one another.
So from my point of view the basic behavior guidelines should be:
I also don't see the need to specify the exact encoding, but general tracecontext rules could be recommended: e.g. use '-' as separator, base16 encoding of numeric values I would just go for something simple like
<tool1_name>=<tool1_bagagge>;<tool2_name>=<tool2_bagagge>;...
and a specific tool's bagagge could look like this (but it's actually up to the tool)
01-127-session42-87e9a6f1 // ';' and '=' may not be used and have to be encoded by tool itself (e.g. via url-encoding)
Of course there should be no tool name collision, but I guess that should be doable. :-)
For a binary representation there should of course also be size information included, which I don't see necessary for the HTTP header representation. I also would not include required version information, that's - again - up to the tool to implement on its own.
Having said all that an actual bagagge HTTP header could IMHO simply look like this:
Trace-Context-Baggage: uber=05-taxi31337-27;dynatrace=01-127-session42-87e9a6f1
@yurishkuro sorry for the Uber sample, that was the first name that came to my mind ;-)
So from my point of view the basic behavior guidelines should be:
if there is bagagge coming in, it has to be passed along tools may modify their own bagagge, but must not touch bagagge from other tools baggage may be dropped, but only if really really necessary
What you are describing is very much like actual Baggage, except simplified quite a bit with text-header encoding. Main points are we are firewalling contexts between tools as demonstrated here https://github.com/JonathanMace/tracingplane-opentracing/tree/master/src/main/baggage cc @JonathanMace
Another question to add:
Suppose I have some async components, or suppose I propagate my baggage in RPC responses. How do I combine a key that has two different values for a tag?
FWIW here's the paper we wrote about baggage: http://cs.brown.edu/people/jcmace/papers/mace2017layered.pdf
The paper's a bit roundabout in its message, but is certainly a good resource for this discussion (feel free to contact me to discuss it, especially if you disagree with any of our assertions)
Also not-quite-as-relevant, here's the upcoming SOSP paper on Canopy, Facebook's e2e tracing system: http://cs.brown.edu/people/jcmace/papers/kaldor2017canopy.pdf
If code matters more than words, I've begun implementing our baggage stuff for other languages (e.g., https://github.com/tracingplane/tracingplane-go/blob/master/atomlayer/api.go is our minimal implementation of TP propagation in Go, at 118 LOC) and instrumenting other platforms (e.g., shamelessly butchering https://github.com/JonathanMace/spring-cloud-sleuth). I'll use the sock shop demo (https://microservices-demo.github.io/) to illustrate some baggage concepts. Also we hacked a goroutine-local context into go (https://github.com/JonathanMace/tracing-framework-go/blob/master/cmd/modify-runtime/modify.go) which is cool.
I believe this issue is addressed by introduction of Trace-Context-Ext
for logger internal fields and Correlation-Context
for user-defined bag of properties. Closing the issue. Please re-open for more discussion
Background:
Open Questions: