Closed kirbysayshi closed 4 years ago
This is the expected behavior. The underlying reason for all of this is that Lightstep only accepts 64-bit trace IDs (which map to 16 hex characters), but context propagation formats typically allow for 128-bit trace IDs (32 hex characters). We truncate those trace IDs (by taking the least-significant 64 bits / 8 bytes / 16 hex chars) before sending them to a satellite. Typically, propagators send the full 128-bit trace ID and the truncation happens later, right before we send data to a satellite. For the Lightstep context propagation format, however, we assume the data is eventually going to a Lightstep satellite, so we do the truncation earlier, as part of the context propagation.
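To make the truncation concrete, here is a minimal sketch of "keep the least-significant 64 bits" on a hex string. The function name `truncateTraceId` is hypothetical and is not part of lightstep-tracer-javascript; the trace ID in the usage note is only an illustrative value.

```typescript
// Hypothetical helper for illustration; not the library's actual code.
// Keeps the least-significant 64 bits (the last 16 hex chars) of a trace ID.
function truncateTraceId(traceIdHex: string): string {
  if (!/^[0-9a-fA-F]{1,32}$/.test(traceIdHex)) {
    throw new Error(`not a 1-32 char hex trace ID: ${traceIdHex}`);
  }
  // Left-pad so shorter IDs still come out as exactly 16 hex chars,
  // then take the rightmost 16 chars (the low 64 bits).
  return traceIdHex.toLowerCase().padStart(32, '0').slice(-16);
}
```

For example, `truncateTraceId('463ac35c9f6413ad48485a3953bb6124')` returns `'48485a3953bb6124'`: the upper half of the ID is dropped.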
The B3 spec dictates that it will propagate what it receives, so if it receives just 16 hex chars, it will propagate those 16 (that's the behavior in the code snippet you linked); otherwise it will propagate the full 128-bit (32 hex chars) trace ID.
I'm not sure whether there is something wrong in the X-Cloud-Trace-Context header you linked. That looks like a GCP-specific header format, which I don't think could have been emitted by the code here, though perhaps it came from a connected service?
Let me know if this all makes sense or if there seem to be remaining bugs here! I'll be out next week but I added @andrewhsu here to help with any further issues.
@kayousterhout Thank you very much for the clear explanation! I'm sorry I didn't know that Lightstep only used 64-bit trace IDs. Not knowing that was very confusing when comparing this implementation to the new OpenTelemetry libraries! I'm glad I know now.
You are also correct that the X-Cloud-Trace-Context header is not in the code in this repo; again, my apologies. It's from an internal extension of the Tracer from this repo that does something like this to construct the header:
    public inject(
      spanContext: any,
      format: string,
      carrier: LightstepCarrier,
    ): void {
      // Let the base tracer populate the carrier with the ot-tracer-* keys first.
      LightstepTracer.prototype.inject.call(this, spanContext, format, carrier);
      const traceGuid = carrier['ot-tracer-traceid'];
      const spanGuid = carrier['ot-tracer-spanid'];
      // Pad the trace ID out to 32 hex chars by appending zeros.
      const traceGuidForHeader: string = traceGuid.padEnd(32, '0');
      // hexToIntString is an internal helper (not shown).
      const spanGuidForHeader: string = hexToIntString(spanGuid);
      const traceValue = `${traceGuidForHeader}/${spanGuidForHeader};o=1`;
      carrier['X-Cloud-Trace-Context'] = traceValue;
    }
Since it's using the values placed into carrier by LightstepTracer.prototype.inject, I assumed showing the header was a good example to illustrate the most significant bits being dropped (since it's basically just using the values from this library), but forgot that the header was GCP/OpenCensus. Sorry for the confusion.
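The `hexToIntString` helper is not shown above. As a rough sketch only, assuming it converts a hex span ID into an unsigned decimal string, something like the following would work. The name `hexToDecimalString` is hypothetical, and `BigInt` is used because a 64-bit ID can exceed `Number.MAX_SAFE_INTEGER`.

```typescript
// Hypothetical stand-in for the internal hexToIntString helper; an assumption,
// not the actual implementation.
function hexToDecimalString(hex: string): string {
  if (!/^[0-9a-fA-F]+$/.test(hex)) {
    throw new Error(`not a hex string: ${hex}`);
  }
  // BigInt parses the 0x-prefixed string exactly, with no 53-bit precision loss.
  return BigInt('0x' + hex).toString(10);
}
```

For example, `hexToDecimalString('ffffffffffffffff')` returns `'18446744073709551615'`, the largest unsigned 64-bit value.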
Not sure if anyone is still using this library, but I believe I found a bug (unsure of the correct behavior).
If a traceId is 32 characters, then the SpanContext will split it into upper/lower: https://github.com/lightstep/lightstep-tracer-javascript/blob/67019d6773f26dc97495e418818ffe8b08e702b2/src/imp/span_context_imp.js#L51-L54
But when the X-Cloud-Trace-Context header is created for propagation, only the lower section is used since only the private _traceGUID is accessed: https://github.com/lightstep/lightstep-tracer-javascript/blob/67019d6773f26dc97495e418818ffe8b08e702b2/src/imp/propagator_ls.js#L25
A fix would be to instead use the traceGUID function: https://github.com/lightstep/lightstep-tracer-javascript/blob/67019d6773f26dc97495e418818ffe8b08e702b2/src/imp/span_context_imp.js#L31-L33
As it is today, this results in headers like:
The zeros are a result of only using the bottom "bits" of the split traceId.
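To illustrate the split described here, this is a small sketch of dividing a 32-character trace ID into upper and lower halves and joining them again. `SplitTraceId`, `splitTraceId`, and `joinTraceId` are hypothetical names chosen for the example; they are not taken from span_context_imp.js, and the trace ID used in the note is only an illustrative value.

```typescript
// Hypothetical types and helpers for illustration only.
interface SplitTraceId {
  upper: string; // most-significant 64 bits, 16 hex chars
  lower: string; // least-significant 64 bits, 16 hex chars
}

function splitTraceId(traceIdHex: string): SplitTraceId {
  // Left-pad so a 16-char ID yields an all-zero upper half.
  const padded = traceIdHex.toLowerCase().padStart(32, '0');
  return { upper: padded.slice(0, 16), lower: padded.slice(16) };
}

function joinTraceId(parts: SplitTraceId): string {
  // Using both halves preserves the full 128-bit ID.
  return parts.upper + parts.lower;
}
```

With `'463ac35c9f6413ad48485a3953bb6124'`, the split gives `upper = '463ac35c9f6413ad'` and `lower = '48485a3953bb6124'`. Propagating only `lower` loses the upper half, while `joinTraceId` returns the original 32 characters.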
Is there a specific reason why the Lightstep propagator does not use both parts of the trace id? I'm unsure if there is more implicit information there, such as a parent trace, that I don't know about. I did notice that the dd propagator encodes more information in there:
https://github.com/lightstep/lightstep-tracer-javascript/blob/67019d6773f26dc97495e418818ffe8b08e702b2/src/imp/propagator_dd.js#L24-L29
While the B3 propagator also skips some:
https://github.com/lightstep/lightstep-tracer-javascript/blob/67019d6773f26dc97495e418818ffe8b08e702b2/src/imp/propagator_b3.js#L21-L24
OpenCensus' Stackdriver implementation appears to use all the bits:
https://github.com/census-instrumentation/opencensus-node/blob/ef5712fd3b279b0e80494322231232047b06f9e6/packages/opencensus-propagation-stackdriver/src/stackdriver-format.ts#L80-L87
Is this a bug, or is it expected / spec'ed behavior?