getsentry / sentry-python

The official Python SDK for Sentry.io
https://sentry.io/for/python/
MIT License

Trace propagation not working on async python with custom instrumentation #3619

Open rodolfoBee opened 2 weeks ago

rodolfoBee commented 2 weeks ago

How do you use Sentry?

Sentry Saas (sentry.io)

Version

2.15.0

Steps to Reproduce

The issue is happening in a custom Python framework consisting of:

All components run in the same process as separate Python tasks (asyncio.create_task). Tracing is done with custom instrumentation and custom trace propagation. A simple example is available here: https://github.com/antonpirker/testing-sentry/blob/main/test-asyncio/main.py

The goal is to have a trace for each message across all services.

Additional question: the Python SDK creates a span for each task created with asyncio.create_task, but does it also create a new local scope for each task? Scope management might be the root cause, as the sample uses get_current_scope and get_isolation_scope.
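For context on the scope question, the SDK's scope handling in 2.x is built on contextvars, and asyncio.create_task copies the current contextvars context into each new task at creation time. The stdlib-only sketch below (no sentry_sdk involved; the variable name current_scope is just illustrative) demonstrates that copy-on-create behavior, which is the mechanism scope isolation relies on:

```python
import asyncio
import contextvars

# Stand-in for a Sentry scope; each task gets a snapshot of this at creation.
current_scope = contextvars.ContextVar("current_scope", default="outer")

async def task_body(results):
    # The task sees the value that was current when create_task was called,
    # not the value at the time the task actually runs.
    results.append(current_scope.get())

async def main():
    results = []
    current_scope.set("message-1")
    t1 = asyncio.create_task(task_body(results))
    current_scope.set("message-2")  # does not affect t1's copied context
    t2 = asyncio.create_task(task_body(results))
    await asyncio.gather(t1, t2)
    return results

print(asyncio.run(main()))  # ['message-1', 'message-2']
```

Whether the SDK additionally forks an isolation scope per task depends on the integration in use, so this only shows the underlying contextvars behavior, not a guarantee about the SDK itself.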

Expected Result

All transactions created for a message going through the services share the same trace ID, allowing Sentry to build the complete distributed trace on the server side.

Actual Result

Each transaction has its own trace ID, breaking the distributed trace.

gigaverse-oz commented 1 week ago

Thank you very much for the example and for taking the time to review this issue.

I'd like to clarify the expected result and the sources of complexity, to help refine the question.

Consider the following flow per message across two services, similar to the provided example:

INPUT → { microservice1: [consumer 1 → producer 1] ---KAFKA MESSAGE---> [microservice2: consumer 2 → producer 2] } → OUTPUT

Each microservice (on different machines) is marked by [ ], and the full trace is enclosed in { }. We are aiming to obtain a unified trace across all services for each message.

Currently, for single-message processing, we successfully achieve the expected trace with the following:

  1. TRACEPARENT and BAGGAGE are included in the Kafka message to propagate the trace, using sentry_sdk.continue_trace to link traces.
  2. Instrumentation is done for each microservice, using asyncio.create_task for concurrent task execution.
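Step 1 above can be sketched with the stdlib alone. In the real setup the header values would come from the SDK inside the active transaction (e.g. via sentry_sdk.get_traceparent() and sentry_sdk.get_baggage()); the message dict shape and the attach_trace helper here are hypothetical, and only the "sentry-trace" header format ("<trace_id>-<span_id>-<sampled flag>") follows Sentry's documented convention:

```python
import secrets

def make_sentry_trace_header(trace_id: str, span_id: str, sampled: bool) -> str:
    # Sentry's header format: 32-hex trace id, 16-hex span id, "1"/"0" sampled flag.
    return f"{trace_id}-{span_id}-{'1' if sampled else '0'}"

def attach_trace(message: dict, trace_id: str, span_id: str) -> dict:
    # Hypothetical helper: stuff the propagation headers into a Kafka message dict.
    message["headers"] = {
        "sentry-trace": make_sentry_trace_header(trace_id, span_id, sampled=True),
        # Baggage carries dynamic sampling context; shown here only as a stub.
        "baggage": f"sentry-trace_id={trace_id}",
    }
    return message

trace_id = secrets.token_hex(16)  # 32 hex chars
span_id = secrets.token_hex(8)    # 16 hex chars
msg = attach_trace({"payload": "work"}, trace_id, span_id)
print(msg["headers"]["sentry-trace"])
```

As long as every hop copies the same trace_id forward, all transactions attach to one distributed trace.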

Example Code Template:

# Microservice 1 example
import sentry_sdk

async def process_message(self, input_message):
    with sentry_sdk.start_transaction(source=source, *args, **kwargs):
        # Process the message and create tasks
        # Send the Kafka message with TRACEPARENT and BAGGAGE headers
        ...

# Microservice 2 example
async def process_message(self, input_message):
    # Continue the trace using sentry_sdk.continue_trace
    transaction = sentry_sdk.continue_trace(
        input_message.event_trace_details,
        source=source,
        *args,
        **kwargs,
    )
    with sentry_sdk.start_transaction(transaction):
        # Process the message and create tasks
        # Send the Kafka message with TRACEPARENT and BAGGAGE headers
        ...
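On the microservice 2 side, continue_trace consumes the incoming headers to recover the trace context. A stdlib-only sketch of that parsing step (the parse_sentry_trace helper is hypothetical; sentry_sdk.continue_trace does the real work, including baggage handling):

```python
def parse_sentry_trace(header: str) -> dict:
    # "sentry-trace" header: "<32-hex trace_id>-<16-hex span_id>[-<sampled flag>]"
    parts = header.split("-")
    return {
        "trace_id": parts[0],
        "parent_span_id": parts[1],
        "parent_sampled": parts[2] == "1" if len(parts) > 2 else None,
    }

incoming = parse_sentry_trace("a" * 32 + "-" + "b" * 16 + "-1")
print(incoming["trace_id"])  # same trace_id as the producer -> one distributed trace
```

If the transaction started on microservice 2 reuses this trace_id, both services' transactions land in the same distributed trace.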

Main Question:

If we begin processing multiple messages concurrently (e.g., while processing input 1, microservice 1 receives inputs 2 and 3), will the tracing framework properly manage separate scopes for each message?

  1. Will traces for each input remain distinct?
  2. When calling get_current_scope within an active with block, will the framework maintain correct scope handling for each async task?

Your insights into how Sentry manages scopes in this concurrent async context would be greatly appreciated.
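The concurrent case can be modeled with the stdlib alone. Each process_message call below runs as its own task, sets a per-message value, and spawns several subtasks; because every task inherits a copy of the contextvars context (the same mechanism Sentry's scopes build on in 2.x), each subtask sees its own message's trace id and never its sibling's. This is a sketch of the mechanism, not a claim about the SDK's behavior in every integration:

```python
import asyncio
import contextvars

# Stand-in for per-message trace context.
trace_id_var = contextvars.ContextVar("trace_id")

async def subtask():
    await asyncio.sleep(0)  # yield so subtasks from both messages interleave
    return trace_id_var.get()

async def process_message(trace_id: str):
    # Set the per-message value in this task's own context copy.
    trace_id_var.set(trace_id)
    # Each subtask snapshots this task's context at creation.
    tasks = [asyncio.create_task(subtask()) for _ in range(3)]
    return await asyncio.gather(*tasks)

async def main():
    # Two messages processed concurrently, each with three subtasks.
    return await asyncio.gather(
        process_message("trace-1"),
        process_message("trace-2"),
    )

print(asyncio.run(main()))
```

Every subtask reports its own message's trace id, so at the contextvars level the two concurrent messages stay distinct.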

szokeasaurusrex commented 1 week ago

@gigaverse-oz, I am a bit confused by your message. Are you still experiencing the problem described by @rodolfoBee, or are you describing a separate problem in your comment? Also, could you please clarify the difference between what you say is already working, and what you are asking about in the "Main Question"?

gigaverse-oz commented 1 week ago

Hi @szokeasaurusrex,

Thank you for the follow-up. I apologize for any confusion; I was aiming to clarify the problem.

The main difference between “what is already working” and the “Main Question” is about concurrency. Currently, processing works as expected for a single message at a time, where all async tasks are tied to that one message.

The “Main Question” relates to handling concurrency across multiple messages. Specifically, if the microservice begins processing multiple incoming messages simultaneously, will Sentry’s scope management correctly separate and handle the scopes for each message? Each message’s processing involves multiple asyncio.create_task calls (e.g., data fetching, analysis, transformations), and now we want to process multiple messages concurrently with multiple async tasks for each one.

If it would help clarify, I’d be happy to share the code and discuss further in a quick call. Thank you again for your time!

szokeasaurusrex commented 1 week ago

> The “Main Question” relates to handling concurrency across multiple messages. Specifically, if the microservice begins processing multiple incoming messages simultaneously, will Sentry’s scope management correctly separate and handle the scopes for each message? Each message’s processing involves multiple asyncio.create_task calls (e.g., data fetching, analysis, transformations), and now we want to process multiple messages concurrently with multiple async tasks for each one.

Have you tried running the microservice with concurrency? I expect this might work, but I am unsure. I would recommend trying out the code and seeing whether it works. If it doesn't, we can look into what changes we would need to make to fix it.

gigaverse-oz commented 1 week ago

Alright, I’ll get started on that. There are quite a few changes to make, so I wanted to confirm that Sentry supports this before diving into the heavy lifting.

I’ll keep you updated; it may take a few days to a couple of weeks.