Open james-callahan opened 6 years ago
I consider the uberctx-{key} scheme to be, in retrospect, a mistake; we could've easily achieved the same goals with a uberctx: key=value format, which is also being proposed for w3c trace context. At Uber we're stuck with the uberctx-{key} format now, as changing it requires upgrading client libs in 1000+ applications, which is ... well, you know. So our internal guideline is that keys can only be alphanum-snake-case (which in practice is perfectly acceptable).
It doesn't mean we cannot solve this problem in Jaeger: we could either introduce encoding for keys that are not alphanum-snake-case, or we can implement different codecs for a format that's similar to w3c.
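To make the second option concrete, here is a minimal sketch of a codec for a single-header key=value format, assuming comma-separated pairs and percent-encoding of both keys and values. The function names encode_baggage and decode_baggage are hypothetical, and this is neither what Jaeger implements nor the exact w3c grammar.

```python
from typing import Dict
from urllib.parse import quote, unquote


def encode_baggage(baggage: Dict[str, str]) -> str:
    """Serialise baggage into one header value of the form k1=v1,k2=v2."""
    # safe="" makes quote() escape '=', ',', ':' and '/' as well, so the
    # two separators can never appear inside an encoded key or value.
    return ",".join(
        f"{quote(key, safe='')}={quote(value, safe='')}"
        for key, value in baggage.items()
    )


def decode_baggage(header_value: str) -> Dict[str, str]:
    """Parse a header value produced by encode_baggage()."""
    baggage = {}
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep:
            continue  # assumption: silently skip items without '='
        baggage[unquote(key)] = unquote(value)
    return baggage
```

With this scheme a key such as my:key travels as my%3Akey, and an equals sign or comma inside a key or value becomes %3D or %2C, so arbitrary string keys survive the round trip.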
Looking at the w3c spec:
Name starts with the beginning of the string or separator , and ends with the equal sign =. The contents of the name are any url encoded string that does not contain an equal sign =. Names should intuitively identify a the tracing system even if multiple systems per vendor are present.
So the baggage key space is reduced to any string that can be encoded in url encoding (which is all of them?) Would an equals sign be encoded as %3D and a comma as %2C (or are they banned entirely)?
I hope the null byte isn't valid but I might have to handle that too: https://github.com/isaachier/jaeger-client-c/blob/master/src/jaegertracingc/key_value.h#L34-L37. Other than C, most languages handle that gracefully.
@isaachier one incompatibility of treating them as 8-bit (minus null byte) C strings is that you would allow invalid UTF-8, while e.g. JavaScript would need valid Unicode (but allows unpaired surrogates).
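The mismatch can be illustrated with a small sketch. Python is used here only as a stand-in: its bytes type behaves like an 8-bit C string, and its str type, like a JavaScript string, can hold an unpaired surrogate. The helper names are hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if the byte string decodes as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def is_encodable_as_utf8(text: str) -> bool:
    """True if the text can be written out as UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# A key a C client could accept: no null byte, but 0xFF never occurs in UTF-8.
c_style_key = b"key\xff"

# A key a JavaScript client could hold: contains a lone high surrogate.
js_style_key = "key\ud800"

print(is_valid_utf8(c_style_key))          # False
print(is_encodable_as_utf8(js_style_key))  # False
```

Neither key can be handed to the other kind of client unchanged, which is why the key domain needs to be stated explicitly.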
I have an encoding method in that code too, but this all assumes the null byte is guaranteed to terminate a string (i.e. no need to maintain length).
Background
Baggage keys are currently specified to be "a string". However baggage often needs to be transmitted in places that don't support a clean string namespace. e.g. when transmitted as an http header (with uberctx-mykey) the key cannot contain a colon. This can get more complex if baggage keys go through unicode or case normalisation.
Proposal
Either the domain of keys needs to be reduced (e.g. mandate lower-case keys) or the full string-space needs to be called out so that encodings (such as the uberctx http header prefix) don't miss an encoding/escaping step.
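As an illustration of the second option, the sketch below escapes arbitrary keys into the uberctx-{key} header-name form. The escaping scheme is an assumption made up for this example (lower-case ASCII letters and digits pass through, every other UTF-8 byte becomes an underscore plus two lower-case hex digits), and the function names are hypothetical; it is not an existing Jaeger encoding.

```python
PREFIX = "uberctx-"
# Bytes that are emitted unchanged: lower-case ASCII letters and digits.
_PLAIN = frozenset(b"abcdefghijklmnopqrstuvwxyz0123456789")


def key_to_header_name(key: str) -> str:
    """Escape a baggage key so that it is a safe, lower-case header name."""
    parts = [PREFIX]
    for byte in key.encode("utf-8"):
        if byte in _PLAIN:
            parts.append(chr(byte))
        else:
            parts.append(f"_{byte:02x}")  # e.g. ':' becomes '_3a'
    return "".join(parts)


def header_name_to_key(name: str) -> str:
    """Reverse key_to_header_name(), tolerating case changes in transit."""
    name = name.lower()  # the encoded form is all lower-case, so this is lossless
    if not name.startswith(PREFIX):
        raise ValueError("not a baggage header")
    body = name[len(PREFIX):]
    raw = bytearray()
    i = 0
    while i < len(body):
        if body[i] == "_":
            hex_pair = body[i + 1:i + 3]
            if len(hex_pair) != 2:
                raise ValueError("truncated escape sequence")
            raw.append(int(hex_pair, 16))
            i += 3
        elif body[i].encode("utf-8")[0] in _PLAIN and body[i].isascii():
            raw.append(ord(body[i]))
            i += 1
        else:
            raise ValueError("unescaped character in header name")
    return raw.decode("utf-8")
```

Because upper-case letters are escaped rather than passed through, the key survives an intermediary that changes header case. The cost of this particular choice is that the underscore itself is escaped (my_key becomes uberctx-my_5fkey), so existing snake-case keys would change on the wire; a real scheme would have to weigh that against backwards compatibility.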