open-telemetry / opentelemetry-java

OpenTelemetry Java SDK
https://opentelemetry.io
Apache License 2.0
2.01k stars 839 forks source link

Fix ottracepropagation for short span ids #6734

Closed viveksing closed 1 month ago

viveksing commented 1 month ago

OtTracePropagator drops the parent context if spanId is shorter than 16 bytes while checking spanContext.isValid() . There can be cases when the spanId is 15 bytes and these id's should be padded with 0's to avoid losing parent trace/span. The pr applies this fix.

Example screenshot Screenshot from 2024-09-20 16-48-11

linux-foundation-easycla[bot] commented 1 month ago

CLA Signed

The committers listed above are authorized under a signed CLA.

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 90.06%. Comparing base (de13ff1) to head (30fb76b). Report is 10 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #6734 +/- ## ========================================= Coverage 90.06% 90.06% - Complexity 6457 6464 +7 ========================================= Files 718 718 Lines 19511 19533 +22 Branches 1922 1924 +2 ========================================= + Hits 17572 17593 +21 Misses 1350 1350 - Partials 589 590 +1 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

viveksing commented 1 month ago

hi @jack-berg can you please review this one. Thank you :)

breedx-splk commented 1 month ago

Hey @viveksing -- First, lemme just say that I appreciate this contribution and the fact that this will attempt to improve interop with other tracing systems that might not be behaving entirely correctly.

OtTracePropagator drops the parent context if spanId is shorter than 16 bits

I assume that you just meant bytes here and not bits. :)

I looked at the w3c spec again and now I'm wondering what the scenario is in which you witnessed or somehow created 15 character (15 hex nibbles) span IDs? Do you have any idea of how that was generated?

The w3c spec is pretty clear about saying that trace IDs SHOULD be padded with zeros if generated by something with less precision...but I didn't see anything specifically mentioned about span ids (or parent span ids).

This 15 smells funny to me, and leads me to believe that something out there is purposefully trimming leading zeros from the string representation...and it'd be nice to understand what that something is.

viveksing commented 1 month ago

@breedx-splk thanks for your comments :)

So far I have seen this issue while using skipper proxy before Java applications. It may be the case that skipper is trimming the leading zeros from spanId before propagating them via headers.

I tried testing the issue locally and adding trimmed zero's back in spanId fixes the problem of losing span context.

Thanks for sharing the w3c specs. There was one statement in the section Interoperating with existing systems which use shorter identifiers

Similar transformations are expected when tracing system converts other distributed trace context propagation formats to W3C Trace Context. 

After reading the above line I felt maybe we can try to fix the spanId's too? or sorry if I may have misinterpreted it and the transformations only apply for traceId's?

breedx-splk commented 1 month ago

@viveksing nah I think you're good...I failed to realize that the "OT" here is for OpenTracing, which I guess doesn't use the same W3C propagator, but it's own thing. Sorry if I've added any confusion, this change seems fine to me. Thanks.