open-telemetry / opentelemetry-java

OpenTelemetry Java SDK
https://opentelemetry.io
Apache License 2.0

How to deterministically create a context based on a single id? #6511

Open AskMeAgain opened 3 weeks ago

AskMeAgain commented 3 weeks ago

Hi,

I have 10 ETL pipelines that all run at more or less the same time. Each pipeline works on the same input data object, which has a unique id. I would like all my ETL pipelines to contribute to the same trace, so I can see how each pipeline worked on the same object and how it was passed through.

The problem: these pipelines all run in parallel, out of order and sometimes in other applications.

What I am looking for is a deterministic way to create a context from a unique id, since this is the only information I really need to assign spans to a specific trace.

I currently have code like this:

//here I get the unique id and convert it to a trace id
var originalArray = dataPackage.getId().getBytes();
var spanId = UUID.randomUUID().toString().substring(0,10).getBytes(); //some random span id as this is needed

var traceIdArr = new byte[16];
var spanIdArr = new byte[16];

System.arraycopy(originalArray, 0, traceIdArr, 16 - originalArray.length, originalArray.length);
System.arraycopy(spanId, 0, spanIdArr, 16 - spanId.length, spanId.length);

//creating the context
var wrap = Span.wrap(SpanContext.createFromRemoteParent(
    TraceId.fromBytes(traceIdArr),
    SpanId.fromBytes(spanIdArr),
    TraceFlags.getDefault(),
    TraceState.getDefault())
);
var otelContext = Context.root().with(wrap);

var startSpan = tracer.spanBuilder(context.getId())
    .setParent(otelContext)
    .startSpan();

My issue is that this just doesn't appear in Jaeger. If I don't set the parent, it works, but then each ETL pipeline ends up in its own separate trace.

I suspect it's because createFromRemoteParent expects the trace to have already been created somewhere, but that is not possible since each pipeline runs on its own and out of order.

Do you see any issues with my approach?

jkwatson commented 3 weeks ago

An easier approach would be to do: IdGenerator.random().generateSpanId() or IdGenerator.random().generateTraceId() so you are guaranteed to be generating ids that are compatible with the rest of the OTel ecosystem.

And yes, you need to have a root span that is the parent of all the other spans if you want them to be properly linked together in Jaeger. That will be difficult to generate unless you can start/end it using the OTel SDK, rather than relying on an externally generated id. How about adding an attribute to your spans which has the value of dataPackage.getId(), rather than trying to use it as a trace id?
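For readers who still want to derive the trace id from the data id, here is a minimal sketch of one way to do that with only the JDK. It is an illustration, not something proposed in this thread: the class `DeterministicTraceId` and its methods are hypothetical names, and hashing with SHA-256 is an assumption chosen so that ids of any length map to the 16 bytes (32 lowercase hex characters) a trace id requires. Span ids are 8 bytes (16 hex characters), not 16 bytes as in the snippet in the question, and neither id may be all zeros.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper for illustration; not part of the OpenTelemetry API.
final class DeterministicTraceId {

    private DeterministicTraceId() {}

    // Maps an arbitrary id to a 32-character lowercase hex string (16 bytes).
    // The same input always yields the same output, in any process.
    static String traceIdFor(String dataPackageId) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(dataPackageId.getBytes(StandardCharsets.UTF_8));
            // Keep only the first 16 of the 32 hash bytes.
            return toHex(hash, 16);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every conforming Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Generates a random, non-zero 16-character lowercase hex span id (8 bytes).
    static String randomSpanId() {
        long id;
        do {
            id = ThreadLocalRandom.current().nextLong();
        } while (id == 0L); // an all-zero span id is invalid
        return String.format("%016x", id);
    }

    // Hex-encodes the first `count` bytes of the array.
    private static String toHex(byte[] bytes, int count) {
        StringBuilder sb = new StringBuilder(count * 2);
        for (int i = 0; i < count; i++) {
            sb.append(String.format("%02x", bytes[i] & 0xff));
        }
        return sb.toString();
    }
}
```

The resulting strings could be passed to `SpanContext.createFromRemoteParent(traceIdHex, spanIdHex, traceFlags, traceState)`. Note that `TraceFlags.getDefault()` is the unsampled flag set, and the SDK's default parent-based sampler drops spans whose remote parent is not sampled, which is a likely reason the spans in the question never reach Jaeger; `TraceFlags.getSampled()` would be the flag to try. Even then, the parent span id refers to a span that is never exported, so the caveat above about a missing root span still applies.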