Closed: AlexCasF closed this issue 1 week ago.
> rather than requiring users to manually add trace context as message attributes

To clarify, the Pub/Sub client library implicitly and automatically adds message attributes that are used for trace context propagation. These attributes are prefixed with `googclient_` and are not expected to be set manually by the user of the client library.
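As an illustration of that convention, a received message's attributes can be split into library-managed trace-context entries and user-set entries by filtering on the `googclient_` prefix. The attribute name and values below are hypothetical; only the prefix is stated above:

```python
# Stand-in for `message.attributes` on a received Pub/Sub message.
# "googclient_traceparent" is a hypothetical name; only the
# "googclient_" prefix is the documented convention.
received_attributes = {
    "googclient_traceparent": "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01",
    "origin": "orders-service",
}

# Library-managed trace-context attributes vs. attributes set by the user.
trace_context_attrs = {
    k: v for k, v in received_attributes.items() if k.startswith("googclient_")
}
user_attrs = {
    k: v for k, v in received_attributes.items() if not k.startswith("googclient_")
}
```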
> which works for continuing the trace with the subscribing microservice, but will lead to a new empty trace being created in parallel.
I'm unclear on what your exact use case is, but have you tried extracting the trace context from the incoming request, using that context to create a parent span, and then calling the Pub/Sub client library? In that case, the spans created by the Pub/Sub library would have the span you created before invoking the client library as their parent. I was able to get `outside_span` to be the parent of the `{topic_id} create` span that the Pub/Sub client library generates with this code:
```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.cloud_trace import CloudTraceSpanExporter
from opentelemetry.sdk.trace.sampling import TraceIdRatioBased, ParentBased
from google.cloud.pubsub_v1 import PublisherClient
from google.cloud.pubsub_v1.types import PublisherOptions

# TODO(developer)
topic_project_id = "your-topic-project-id"
trace_project_id = "your-trace-project-id"
topic_id = "your-topic"

# In this sample, we use Google Cloud Trace to export the OpenTelemetry
# traces: https://cloud.google.com/trace/docs/setup/python-ot
# Choose and configure the exporter for your setup accordingly.
sampler = ParentBased(root=TraceIdRatioBased(1))
trace.set_tracer_provider(TracerProvider(sampler=sampler))

# Export to Google Cloud Trace.
cloud_trace_exporter = CloudTraceSpanExporter(
    project_id=trace_project_id,
)
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(cloud_trace_exporter)
)

# Set the `enable_open_telemetry_tracing` option to True when creating
# the publisher client. This in itself is necessary and sufficient for
# the library to export OpenTelemetry traces. However, where the traces
# must be exported to needs to be configured based on your OpenTelemetry
# setup. Refer: https://opentelemetry.io/docs/languages/python/exporters/
publisher = PublisherClient(
    publisher_options=PublisherOptions(
        enable_open_telemetry_tracing=True,
    ),
)

tracer = trace.get_tracer(__name__)

# This span will be the parent of the spans created in the Pub/Sub client library.
with tracer.start_as_current_span("outside_span") as parent_span:
    # The `topic_path` method creates a fully qualified identifier
    # in the form `projects/{project_id}/topics/{topic_id}`.
    topic_path = publisher.topic_path(topic_project_id, topic_id)

    # Publish messages.
    for n in range(1, 2):
        data_str = f"Message number {n}"
        # Data must be a bytestring.
        data = data_str.encode("utf-8")
        # When you publish a message, the client returns a future.
        future = publisher.publish(topic_path, data)
        print(future.result())

    print(f"Published messages to {topic_path}.")
```
Closing due to no response. Feel free to reopen.
Feature Description
Add support for opt-in trace context propagation in the Python PubSub client, allowing users to maintain distributed tracing across async message boundaries when appropriate.
Use Case
When building distributed systems with multiple microservices communicating via PubSub, there are scenarios where maintaining trace context across async boundaries is valuable.
Proposed Solution
Add new options to `PublisherOptions`:
The client could then handle trace context propagation at the transport layer rather than requiring users to manually add trace context as message attributes. The manual approach works for continuing the trace in the subscribing microservice, but it also leads to a new, empty trace being created in parallel.
Important Considerations
Additional context
This feature would align with OpenTelemetry best practices while respecting the unique challenges of async communication patterns.