Closed npepinpe closed 1 year ago
We decided to trace all commands in the system. We tried two different approaches:
Our conclusion is that the second approach is best for us, as we're mostly interested in tracing commands. While it would be great to trace a complete process instance lifecycle, neither approach allows this easily (not without keeping span contexts in the state). The flattened approach should provide us with all the relevant information we need anyway.
In order to propagate context across all boundaries (both threads and processes), our approach was to use the built-in TextMap propagator for the gRPC part, and write our own custom serialization code in SBE for internal usage.
The SBE code was written in its own schema, though we did not do baggage propagation yet.
The span context would then be serialized from the gateway to the broker as part of the ExecuteCommandRequest (this could also be applied to other such commands by writing the span context as a nested varDataEncoding property). Across thread boundaries, we would then serialize it on the command in its RecordMetadata, again as a nested field.
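To make the internal serialization step more concrete, here is a minimal sketch of a fixed-length binary encoding of a span context. It is not the SBE schema we wrote during the hack week: `SpanContextCodec` and `SimpleSpanContext` are hypothetical names, plain `ByteBuffer` stands in for the generated SBE encoder and decoder, and trace state and baggage are left out. The field sizes (16-byte trace ID, 8-byte span ID, 1 byte of flags) follow the W3C trace context layout.

```java
import java.nio.ByteBuffer;
import java.util.HexFormat;

/** Hypothetical codec; a plain ByteBuffer stands in for a generated SBE encoder/decoder. */
public final class SpanContextCodec {
  static final int TRACE_ID_LENGTH = 16; // bytes
  static final int SPAN_ID_LENGTH = 8; // bytes
  static final int ENCODED_LENGTH = TRACE_ID_LENGTH + SPAN_ID_LENGTH + 1; // + 1 byte of flags

  /** Immutable stand-in for a span context (assumption: lowercase hex IDs, no trace state). */
  public record SimpleSpanContext(String traceIdHex, String spanIdHex, byte traceFlags) {}

  /** Writes trace ID, span ID and flags back to back into a fixed-length byte array. */
  public static byte[] encode(final SimpleSpanContext context) {
    final HexFormat hex = HexFormat.of();
    final byte[] traceId = hex.parseHex(context.traceIdHex());
    final byte[] spanId = hex.parseHex(context.spanIdHex());
    if (traceId.length != TRACE_ID_LENGTH || spanId.length != SPAN_ID_LENGTH) {
      throw new IllegalArgumentException("unexpected trace or span ID length");
    }
    return ByteBuffer.allocate(ENCODED_LENGTH)
        .put(traceId)
        .put(spanId)
        .put(context.traceFlags())
        .array();
  }

  /** Reads the fields back in the same order they were written. */
  public static SimpleSpanContext decode(final byte[] bytes) {
    if (bytes.length != ENCODED_LENGTH) {
      throw new IllegalArgumentException("unexpected encoded length: " + bytes.length);
    }
    final ByteBuffer buffer = ByteBuffer.wrap(bytes);
    final byte[] traceId = new byte[TRACE_ID_LENGTH];
    final byte[] spanId = new byte[SPAN_ID_LENGTH];
    buffer.get(traceId).get(spanId);
    final byte flags = buffer.get();
    final HexFormat hex = HexFormat.of();
    return new SimpleSpanContext(hex.formatHex(traceId), hex.formatHex(spanId), flags);
  }
}
```

In an actual SBE schema the same three fields would presumably be declared as fixed-length fields or as a nested variable-length blob, as described above; the sketch only shows what has to survive the trip.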
Note: one caveat is that in many places we reuse RecordMetadata instances, so be careful when passing the span context directly from one, and make sure it's immutable!
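The caveat can be illustrated with a small sketch. `ReusableMetadata` below is a hypothetical stand-in for a reused, mutable metadata instance (it is not the real RecordMetadata API), and `SpanContextSnapshot` is an assumed immutable value type. The idea is to copy the span context out before the instance is reused, rather than hand out a view onto its mutable fields.

```java
/** Sketch only: shows why a span context should be copied out of a reused instance. */
public final class SpanContextSnapshotExample {

  /** Immutable copy of the span context; safe to pass to other threads. */
  public record SpanContextSnapshot(String traceId, String spanId) {}

  /** Hypothetical stand-in for a metadata object that is reused for many records. */
  public static final class ReusableMetadata {
    private final StringBuilder traceId = new StringBuilder();
    private final StringBuilder spanId = new StringBuilder();

    /** Overwrites the current contents, as happens when the instance is reused. */
    public ReusableMetadata wrap(final String newTraceId, final String newSpanId) {
      traceId.setLength(0);
      traceId.append(newTraceId);
      spanId.setLength(0);
      spanId.append(newSpanId);
      return this;
    }

    /** Copies the current values into an immutable snapshot. */
    public SpanContextSnapshot snapshot() {
      return new SpanContextSnapshot(traceId.toString(), spanId.toString());
    }
  }
}
```

A snapshot taken before the next `wrap(...)` call keeps its original trace and span IDs, whereas anything that held a reference to the metadata itself would silently see the new values.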
Adding the span context to the record metadata allowed us to quickly deal with batching/aggregation of commands at certain points. For example, in the Raft thread, when appending or replicating an entry, you could quickly check if the data writer (e.g. BufferWriter) of the application entry is an instance of SpanContextProvider, then iterate over its span contexts and create a fresh span for each. Something like:
    final List<Span> spans = new ArrayList<>();
    if (data instanceof SpanContextProvider provider) {
      // one span per aggregated command; invalid contexts are skipped
      provider
          .spanContexts()
          .forEach(
              sc -> {
                final var spanBuilder =
                    tracer.spanBuilder("appendEntry").setSpanKind(SpanKind.SERVER);
                if (sc.isValid()) {
                  // link the new span to the remote parent carried by the command
                  final var parentContext = Context.current().with(Span.wrap(sc));
                  spanBuilder.setParent(parentContext);
                  spanBuilder.setAttribute("partitionId", String.valueOf(raft.getPartitionId()));
                  spans.add(spanBuilder.startSpan());
                }
              });
    }

    raft.getThreadContext()
        .execute(
            () -> {
              try {
                safeAppendEntry(
                    new UnserializedApplicationEntry(lowestPosition, highestPosition, data),
                    appendListener,
                    spans);
              } finally {
                // end every span once the append has been attempted
                CloseHelper.quietCloseAll(spans.stream().map(s -> (AutoCloseable) s::end).toList());
              }
            });
This is quite ugly, and I hope we can find a more convenient way of doing it, but I think the concept will be similar (i.e. iterate over a bunch of aggregated span contexts, and create a span for each, then close all at the end).
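One possible shape for the "create a span per context, then close all at the end" concept is a small holder that can be used with try-with-resources. This is only a sketch of the idea, not something we built: `SpanBatch` and `EndableSpan` are hypothetical names, and `EndableSpan` stands in for the `end()` method of an OpenTelemetry `Span` so that the example has no external dependencies.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder that ends all collected spans when it is closed. */
public final class SpanBatch implements AutoCloseable {

  /** Minimal stand-in for a span; with OpenTelemetry one could pass span::end. */
  @FunctionalInterface
  public interface EndableSpan {
    void end();
  }

  private final List<EndableSpan> spans = new ArrayList<>();

  /** Registers a started span so that it is ended on close. */
  public SpanBatch add(final EndableSpan span) {
    spans.add(span);
    return this;
  }

  public int size() {
    return spans.size();
  }

  /** Ends every span even if one of them throws; the first failure is rethrown afterwards. */
  @Override
  public void close() {
    RuntimeException firstFailure = null;
    for (final EndableSpan span : spans) {
      try {
        span.end();
      } catch (final RuntimeException e) {
        if (firstFailure == null) {
          firstFailure = e;
        } else {
          firstFailure.addSuppressed(e);
        }
      }
    }
    spans.clear();
    if (firstFailure != null) {
      throw firstFailure;
    }
  }
}
```

Unlike the quiet close in the snippet above, this variant surfaces the first failure instead of swallowing it; whether that is desirable on the Raft thread is itself an open design choice.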
Open questions would be:
Closing, we can open it again when we're working on this. @megglos feel free to add anything I forgot.
Description
This issue is mostly to document the outcome of our hack week where we set up distributed tracing in order to trace commands in the system.
This is not an issue which will describe what we should do, but rather what we did, open questions, etc. We should refer to it again when we're approaching this as a feature.