Closed mabdinur closed 3 months ago
Profiling tests are failing due to 64bit unsigned span_ids being converted to signed integers.
@marcotc Is this a bug in the profiler?
Profiling tests are failing due to 64bit unsigned span_ids being converted to signed integers.
I don't see any changes in your PR that should affect it, can you figure out what line exactly in your PR is triggering the issue?
Profiling tests are failing due to 64bit unsigned span_ids being converted to signed integers.
I don't see any changes in your PR that should affect it, can you figure out what line exactly in your PR is triggering the issue?
In the failing test the span_id of 13756630617282998475
is encoded as -4690113456426553141
. Both numbers have the same 2s complement representation 1011111011101001010111101010100100100001011010100111000011001011
.
It seems like the profiler is using signed 64bit integers to correlate spans to threads. However I can't find where this occurs. Looking at the thread collector's implementation it seems like the span_id should be an unsigned integer: https://github.com/DataDog/dd-trace-rb/blob/master/ext/datadog_profiling_native_extension/collectors_thread_context.c#L158. The ruby otel tracer generates span_ids up to 2^64 while the datadog ruby tracer generates span ids up to 2^63 so this wasn't an issue in the past.
cc: @ivoanjo
Update
I think I found the issue. LabelValue.num stores integers as i64. This is where the unsigned 64bit span_ids are miscast to signed integers: https://github.com/DataDog/dd-trace-rb/blob/5a95611148ceb8744c9692c6bc3f57e66094a970/ext/datadog_profiling_native_extension/collectors_thread_context.c#L741-#L742.
Taking a look, thanks for the ping!
I've pushed a fix in a99f0e51af09d45e57a851640c7407fdb2aa7f80 . Sorry for breaking your PR! I remember thinking "I should correct for the unsigned thing in our tests..." when working on #3466 but clearly ended up not doing it.
As I stated in the commit message -- both the profiler code and the backend know to treat these values as unsigned, and it was really only the test code that also decodes profiles to match on them that wasn't doing the same adjustment.
LMK if I can provide any more help :)
I've pushed a fix in a99f0e5 . Sorry for breaking your PR! I remember thinking "I should correct for the unsigned thing in our tests..." when working on #3466 but clearly ended up not doing it.
As I stated in the commit message -- both the profiler code and the backend know to treat these values as unsigned, and it was really only the test code that also decodes profiles to match on them that wasn't doing the same adjustment.
LMK if I can provide any more help :)
It's good to know span_ids are represented as unsigned integers in the backend. This fix looks good to me. Thanks for looking into this.
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 98.11%. Comparing base (
e086475
) to head (a99f0e5
). Report is 21 commits behind head on master.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@marcotc This PR should be good to go
What does this PR do?
Motivation:
Additional Notes:
How to test the change?
Unsure? Have a question? Request a review!