smeubank opened this issue 9 months ago
I tried building this into arroyo (where, IMO, this can be more effective), and our transactions concept does not map particularly well onto any Kafka concepts: there is no affordance for batching in our data model or in OTel's.
Error reporting works fine OOTB.
We could still technically inject headers while publishing and continue traces while subscribing though, right? Or is that bad?
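The propagation part can be sketched without the SDK at all. Below is a stdlib-only illustration of the mechanics: the producer serializes trace context into a Kafka-style header, and the consumer parses it back out to continue the trace. The header format mirrors Sentry's `sentry-trace` header (`<trace_id>-<span_id>-<sampled>`), but `inject` and `continue_trace_headers` here are hypothetical helper names for illustration, not real SDK calls (recent SDK versions expose similar helpers, e.g. `sentry_sdk.continue_trace`).

```python
import re
import secrets

def make_sentry_trace_header(trace_id, span_id, sampled=True):
    # Sentry-style trace header: "<trace_id>-<span_id>-<sampled>"
    return f"{trace_id}-{span_id}-{'1' if sampled else '0'}"

def inject(headers, trace_id, span_id):
    # Kafka message headers are a list of (key, bytes) tuples
    headers.append(("sentry-trace", make_sentry_trace_header(trace_id, span_id).encode()))
    return headers

def continue_trace_headers(headers):
    # Consumer side: parse the header so the child transaction
    # can reuse the same trace id and record its parent span.
    for key, value in headers:
        if key == "sentry-trace":
            m = re.match(r"^([0-9a-f]{32})-([0-9a-f]{16})-([01])$", value.decode())
            if m:
                return {
                    "trace_id": m.group(1),
                    "parent_span_id": m.group(2),
                    "sampled": m.group(3) == "1",
                }
    return None

# Producer side
trace_id = secrets.token_hex(16)  # 32 hex chars
span_id = secrets.token_hex(8)    # 16 hex chars
msg_headers = inject([], trace_id, span_id)

# Consumer side: the continued trace carries the producer's trace id
ctx = continue_trace_headers(msg_headers)
assert ctx["trace_id"] == trace_id
```

For a single-message consumer this works cleanly, which is exactly why the batching case below is the sticking point.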
That part works perfectly fine. The issue is that business logic in arroyo is usually written so that messages are processed in batches: one function call processes 200 messages at once.
Logically that means there should be a transaction measuring that function's execution time (normalized by batch size?), but there are multiple traceparents to consider.
We can make tracing work for consumers that just don't batch (we have a lot of them in Sentry), but those tend not to be performance sensitive.
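The mismatch can be made concrete with a stdlib-only sketch. Every name here is illustrative (not arroyo or SDK API): one function call handles the whole batch, each message carries its own traceparent, so the best a wrapper can do is record one measurement for the batch, normalize it by batch size, and keep the distinct parents as metadata/links rather than continuing a single trace.

```python
import time

def process_batch(messages):
    # Stand-in for arroyo-style business logic:
    # one call processes the entire batch.
    for m in messages:
        pass  # ... per-message work ...

def traced_process_batch(messages):
    # Collect every distinct traceparent in the batch. There is no
    # single parent to continue -- that is the core mismatch.
    parents = {dict(m["headers"]).get("sentry-trace") for m in messages}

    start = time.monotonic()
    process_batch(messages)
    elapsed = time.monotonic() - start

    # One option: emit a single "transaction" per batch, normalize its
    # duration by batch size, and attach the parents as links.
    return {
        "duration_per_message": elapsed / len(messages),
        "linked_parents": parents,
    }

batch = [{"headers": [("sentry-trace", f"trace-{i}")]} for i in range(200)]
result = traced_process_batch(batch)
assert len(result["linked_parents"]) == 200
```

The 200 distinct parents show why a one-transaction-per-trace model doesn't fit: either the batch transaction picks one parent arbitrarily, or it links all of them, which the current transaction model has no affordance for.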
I was hoping to integrate arroyo more closely with DDM at some point, since we already have metrics.
Getting a working Kafka integration would allow us to remove the start_transaction call in src/sentry/ingest/consumer/processors.py, which we decided to keep around in https://github.com/getsentry/sentry/issues/63590, since without a Kafka integration there is no auto-instrumentation to replace that custom start_transaction call.
Problem Statement
There is no integration in the Python SDK that provides auto-instrumentation for Kafka interactions. This means manual instrumentation is required to capture span information and to get OOTB distributed tracing between communicating services.
Solution Brainstorm
There is already a Redis integration; Redis is not exactly the same as Kafka, but it can also behave as a messaging service.
OTel also has Kafka instrumentation. Would it be possible to have something similar, or to reuse their integration, in the Sentry SDK?
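For illustration, one common auto-instrumentation pattern (used by several OTel instrumentations) is to wrap the producer's produce method so trace headers are injected on every message without the application changing its code. This is a hedged stdlib-only sketch: FakeProducer stands in for a real Kafka producer, and nothing here is an actual Sentry or OTel API.

```python
class FakeProducer:
    # Stand-in for a real Kafka producer (e.g. confluent_kafka.Producer)
    def __init__(self):
        self.sent = []

    def produce(self, topic, value, headers=None):
        self.sent.append((topic, value, headers or []))

def instrument_producer(producer, get_trace_header):
    # Monkeypatch produce() so every outgoing message automatically
    # carries a trace header -- the app keeps calling produce() as before.
    original = producer.produce

    def traced_produce(topic, value, headers=None):
        headers = list(headers or [])
        headers.append(("sentry-trace", get_trace_header()))
        return original(topic, value, headers=headers)

    producer.produce = traced_produce
    return producer

p = instrument_producer(FakeProducer(), lambda: b"abc123-def456-1")
p.produce("events", b"payload")
assert ("sentry-trace", b"abc123-def456-1") in p.sent[0][2]
```

A real integration would also wrap the consumer side to continue the trace, which is where the batching discussion above comes back into play.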