getsentry / sentry-python

The official Python SDK for Sentry.io
https://sentry.io/for/python/
MIT License

Kafka Integration #2367

Open smeubank opened 9 months ago

smeubank commented 9 months ago

Problem Statement

The Python SDK has no integration that auto-instruments Kafka interactions. As a result, manual instrumentation is required to capture span information and to get OOTB distributed tracing between services that communicate over Kafka.

Solution Brainstorm

There is already a Redis integration. Redis is not directly comparable to Kafka, but it can also act as a messaging service.

OTel also has a Kafka integration. Would it be possible to have something similar, or to reuse their integration in the Sentry SDK?

untitaker commented 5 months ago

I tried building this into arroyo (where IMO it can be more effective), and our transaction concept does not map particularly well onto any Kafka concepts. there is no affordance for batching in our data model or OTel's

error reporting works fine OOTB

sl0thentr0py commented 5 months ago

we could still technically inject headers while publishing and continue traces while subscribing though right, or is that bad?
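A minimal sketch of the inject-on-publish / continue-on-subscribe idea, written against plain data so it does not depend on any particular Kafka client. It assumes Kafka headers are a list of `(str, bytes)` tuples (the shape common Python clients accept) and that the trace is carried in Sentry's `sentry-trace` header, formatted as `<trace_id>-<span_id>[-<sampled>]`. The function names `inject_trace` and `extract_trace` are hypothetical and not part of the SDK; in a real service the SDK would presumably supply and consume these values itself (it exposes helpers such as `sentry_sdk.get_traceparent()` and `sentry_sdk.continue_trace()`, if I'm reading the docs right).

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed header shape: a list of (key, value-bytes) tuples.
KafkaHeaders = List[Tuple[str, bytes]]

SENTRY_TRACE_HEADER = "sentry-trace"
# <32 hex trace id>-<16 hex span id>, optionally followed by -0 or -1.
_SENTRY_TRACE_RE = re.compile(r"^([0-9a-f]{32})-([0-9a-f]{16})(?:-([01]))?$")


def inject_trace(
    headers: KafkaHeaders,
    trace_id: str,
    span_id: str,
    sampled: Optional[bool] = None,
) -> KafkaHeaders:
    """Producer side: return a copy of `headers` carrying the current trace."""
    value = f"{trace_id}-{span_id}"
    if sampled is not None:
        value += "-1" if sampled else "-0"
    # Drop any stale trace header so the message carries exactly one.
    kept = [(k, v) for k, v in headers if k != SENTRY_TRACE_HEADER]
    return kept + [(SENTRY_TRACE_HEADER, value.encode("utf-8"))]


def extract_trace(headers: Optional[KafkaHeaders]) -> Optional[Dict[str, object]]:
    """Consumer side: parse the incoming trace, or None if absent or malformed."""
    for key, raw in headers or []:
        if key != SENTRY_TRACE_HEADER:
            continue
        match = _SENTRY_TRACE_RE.match(raw.decode("utf-8", errors="replace").strip())
        if match is None:
            return None
        trace_id, parent_span_id, flag = match.groups()
        return {
            "trace_id": trace_id,
            "parent_span_id": parent_span_id,
            "sampled": None if flag is None else flag == "1",
        }
    return None
```

The producer would call `inject_trace(...)` right before sending, and the consumer would call `extract_trace(message_headers)` before starting its own transaction, so both sides end up in the same trace.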

untitaker commented 5 months ago

> we could still technically inject headers while publishing and continue traces while subscribing though right, or is that bad?

that part works perfectly fine. the issue is that business logic in arroyo is usually written in a way where messages are processed in batches. so one function call processes 200 messages at once.

logically, that means there should be a transaction measuring that function's execution time (normalized by batch size?), but there are multiple traceparents to consider

we can make tracing work for consumers who just don't batch (we have a lot of them in sentry), but those tend to not be performance sensitive

i was hoping to integrate arroyo closer with DDM at some point since we already have metrics
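To make the batching problem above concrete, here is a rough sketch of what a batch-level measurement could look like. It is not arroyo's API and not something the Sentry data model supports today. The name `process_batch_timed` and the shape of the returned dict are made up for illustration, and messages are represented only by their headers, assumed to be `(str, bytes)` tuples with the trace in a `sentry-trace` header. One timed call covers the whole batch, the per-message time is the elapsed time divided by the batch size, and the result shows how many distinct traces a single measurement would have to be attributed to.

```python
import time
from collections import Counter
from typing import Callable, Dict, Optional, Sequence, Tuple

# Assumed header shape: a sequence of (key, value-bytes) tuples per message.
KafkaHeaders = Sequence[Tuple[str, bytes]]


def trace_id_of(headers: KafkaHeaders) -> Optional[str]:
    """Return the trace id from a message's sentry-trace header, if present."""
    for key, raw in headers:
        if key == "sentry-trace":
            return raw.decode("utf-8", errors="replace").split("-", 1)[0] or None
    return None


def process_batch_timed(
    batch: Sequence[KafkaHeaders],
    process: Callable[[Sequence[KafkaHeaders]], None],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, object]:
    """Time one batch-level call and report which traces it touched."""
    start = clock()
    process(batch)  # one function call handles every message in the batch
    elapsed = clock() - start

    # Count messages per upstream trace; messages without a header are skipped.
    traces = Counter(
        tid for tid in (trace_id_of(h) for h in batch) if tid is not None
    )
    return {
        "batch_size": len(batch),
        "elapsed_seconds": elapsed,
        # Normalizing by batch size gives an average, not a true per-message time.
        "seconds_per_message": elapsed / len(batch) if batch else 0.0,
        "messages_per_trace": dict(traces),
    }
```

With a batch of 200 messages from many producers, `messages_per_trace` will typically have many keys, which is exactly the case a single transaction with one parent cannot represent.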

szokeasaurusrex commented 5 months ago

A working Kafka integration would let us remove the start_transaction call in src/sentry/ingest/consumer/processors.py. We decided in https://github.com/getsentry/sentry/issues/63590 to keep that call for now, since without a Kafka integration there is no auto-instrumentation to replace it.