jaegertracing / jaeger-client-python

🛑 This library is DEPRECATED!
https://jaegertracing.io/
Apache License 2.0
408 stars 155 forks source link

Python jaeger client generates too long id for spans, that can't be encoded using thrift #341

Closed alvassin closed 3 years ago

alvassin commented 3 years ago

Thrift declares that spanId, traceIdLow and traceIdHigh are 64-bit signed integers. (thrift has no support for unsigned types).

So, i expected that spanId/traceIdLow/traceIdHigh would have bit_length() for such numbers <= 63.

But jaeger python client use unsigned 64-bit integers when creating span:

Finally, It is not possible to submit span using binary thrift protocol, referencing span reported by python client.

I would propose to generate 63 random bits for spanId, traceIdLow and traceIdHigh instead of 64.

Below you can see that if you try to report span generated by python client using binary thrift protocol (or use it or it's trace identifiers) - you'd get struct.error: int too large to convert error for return struct.pack("!q", i64) operation.

"""
pip install jaeger-client requests triftpy2
git clone https://github.com/jaegertracing/jaeger-idl
"""
import time

import requests
import thriftpy2
from jaeger_client import SpanContext
from jaeger_client.codecs import span_context_from_string
from jaeger_client.thrift import _id_to_low, id_to_int, _id_to_high, timestamp_micros
from thriftpy2.utils import serialize

# Span context generated by python client
python_span_id = 'a9f1a68726171ee7:c2de4a7674b6dbc6:0:1'
span_ctx = SpanContext(*span_context_from_string(python_span_id))
print(span_ctx.span_id.bit_length())  # 64
print(span_ctx.trace_id.bit_length())  # 64

jaeger_thrift = thriftpy2.load('jaeger-idl/thrift/jaeger.thrift', 'jaeger_thrift')

batch = jaeger_thrift.Batch(
    process=jaeger_thrift.Process('check-spans-aggregator'),
    spans=[
        jaeger_thrift.Span(
            traceIdLow=id_to_int(_id_to_low(span_ctx.trace_id)),
            traceIdHigh=id_to_int(_id_to_high(span_ctx.trace_id)),
            spanId=span_ctx.span_id,
            parentSpanId=0,
            operationName='example-span',
            flags=1,
            startTime=timestamp_micros(time.time()),
            duration=1,
        )
    ]
)

response = requests.post(
    'http://127.0.0.1:14268/api/traces',
    data=serialize(batch),
    headers={'Content-Type': 'application/x-thrift'}
)
assert response.status_code == 202
yurishkuro commented 3 years ago

Are you facing any specific problem? We handle the signed int conversion in thrift.py id-to-int

alvassin commented 3 years ago

@yurishkuro problem is that span/trace id reported from service to service by jaeger clients is different from span/trace id sent to (and stored in) jaeger collector.

Imagine somebody would like to write a new client for jaeger in some new language. Thrift contract declares that there is signed 64 int. So developer expect to receive serialized 64 signed int from parent span, but receives unsigned one and can't use this span in references for example w/o specific modifications. It is not clear.

yurishkuro commented 3 years ago

Sorry, I don't follow. Jaeger IDs are just 64-bit binaries. We chose to represent them as i64 in Thrift for efficiency reasons. It doesn't matter that i64 is signed because it can still losslessly represent 64bit values, just with a different interpretation when you look at the values as actual ints.

If someone writes a new client, the Jaeger requirement is that it must handle full 64bit values, like all existing official Jaeger clients do.

alvassin commented 3 years ago

Thank you for response