Open rghetia opened 4 years ago
Thank you for following up on this bug, @rghetia! For context: over the weekend I noticed that one of my services, handling ~30K QPS, was erroring out on calls about 10 seconds apart while writing spans from various sources to my backend, and alas, clashing SpanIDs were found.
The OpenCensus specs say that SpanID MUST be globally unique (see https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/Span.md/#SpanId), so my schema relied on this. For safety, though, I should perhaps use the (TraceID, SpanID) combination as the unique key.
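To illustrate the composite-key idea, here is a minimal in-memory sketch. It is not the schema of any real backend: `spanKey`, `spanStore`, and `Insert` are hypothetical names, and the only assumption carried over from the spec is the ID sizes (16-byte TraceID, 8-byte SpanID).

```go
package main

import "fmt"

// ID sizes follow the OpenCensus spec: 16-byte TraceID, 8-byte SpanID.
type TraceID [16]byte
type SpanID [8]byte

// spanKey is a hypothetical composite key: a SpanID only has to be
// unique within its trace, not across all traces.
type spanKey struct {
	TraceID TraceID
	SpanID  SpanID
}

// spanStore is a hypothetical stand-in for a backend table.
type spanStore struct {
	spans map[spanKey]string // value is the span name, for illustration
}

func newSpanStore() *spanStore {
	return &spanStore{spans: make(map[spanKey]string)}
}

// Insert rejects a span only when both TraceID and SpanID collide.
func (s *spanStore) Insert(t TraceID, id SpanID, name string) error {
	k := spanKey{TraceID: t, SpanID: id}
	if _, exists := s.spans[k]; exists {
		return fmt.Errorf("duplicate span %x in trace %x", id[:], t[:])
	}
	s.spans[k] = name
	return nil
}

func main() {
	store := newSpanStore()
	traceA, traceB := TraceID{1}, TraceID{2}
	clashing := SpanID{0xAB}

	fmt.Println(store.Insert(traceA, clashing, "a")) // accepted
	fmt.Println(store.Insert(traceB, clashing, "b")) // accepted: different trace
	fmt.Println(store.Insert(traceA, clashing, "c")) // rejected: same trace, same span
}
```

With this key, a SpanID clash across two different traces no longer fails the write. A clash inside a single trace is still rejected.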
We’ll perhaps need to fix this bug: high-throughput applications are most definitely having their spans corrupted, but their backends might not be enforcing these checks immediately, or might be processing uploaded spans asynchronously.
Make SpanID random, similar to OpenTelemetry.
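A rough sketch of what random SpanID generation could look like in Go. This is not the actual OpenCensus or OpenTelemetry code: `randomIDGenerator`, `newRandomIDGenerator`, and `NewSpanID` are hypothetical names. It assumes a `math/rand` source seeded once from `crypto/rand` and guarded by a mutex, and it skips the all-zero ID on the assumption that zero is treated as invalid.

```go
package main

import (
	crand "crypto/rand"
	"encoding/binary"
	"fmt"
	"math/rand"
	"sync"
)

// SpanID is an 8-byte span identifier.
type SpanID [8]byte

// randomIDGenerator (hypothetical) draws every SpanID from a PRNG
// instead of incrementing a counter.
type randomIDGenerator struct {
	mu  sync.Mutex // *rand.Rand is not safe for concurrent use
	rng *rand.Rand
}

// newRandomIDGenerator seeds the PRNG from crypto/rand so that separate
// processes do not start from the same sequence.
func newRandomIDGenerator() (*randomIDGenerator, error) {
	var seed int64
	if err := binary.Read(crand.Reader, binary.LittleEndian, &seed); err != nil {
		return nil, err
	}
	return &randomIDGenerator{rng: rand.New(rand.NewSource(seed))}, nil
}

// NewSpanID returns a random, non-zero SpanID.
func (g *randomIDGenerator) NewSpanID() SpanID {
	g.mu.Lock()
	defer g.mu.Unlock()
	var id SpanID
	for {
		binary.LittleEndian.PutUint64(id[:], g.rng.Uint64())
		if id != (SpanID{}) { // retry in the unlikely all-zero case
			return id
		}
	}
}

func main() {
	gen, err := newRandomIDGenerator()
	if err != nil {
		panic(err)
	}
	id := gen.NewSpanID()
	fmt.Printf("%x\n", id[:])
}
```

Random 64-bit IDs can still collide in principle, so keying storage on (TraceID, SpanID) remains a sensible safety net.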
/cc @odeke-em