Provide a minimal example for propagating request IDs into the distributed tracing system (w/ Axum)

getsentry / sentry-rust

Official Sentry SDK for Rust

https://sentry.io/

Apache License 2.0

620 stars 153 forks source link

Provide a minimal example for propagating request IDs into the distributed tracing system (w/ Axum) #656

Open seanpianka opened 6 months ago

seanpianka commented 6 months ago

Hello there! Is it possible for any maintainer here to help me configure distributed tracing using Axum, Tracing, and Sentry?

I am following the guide on custom instrumentation for distributed tracing here posted here. However, this guide does not show how to inject an existing request ID from an incoming HTTP request into the system for distributed tracing.

I am generating a random request ID (x-request-id) and propagating the header using the Tower middleware. The following two layers are added to handle this for every request made to my Axum router.

As a part of my system, I also initialize a tracing Subscriber with the standard sentry_tracing::layer() (not shown, before the code below). After this is initialized in a tracing Registry, the sentry_tower::{SentryHttpLayer, SentryLayer} layers are added to my Axum router (with http/axum-matched-path features enabled).

Here's a snippet from my codebase (there's many LOC so it would take more time to create an MVCE):

        let sentry_hub = sentry::Hub::current(); // use same hub as the sentry tracing layer?
        Router::new()
            .nest("/v1", services) // services is an Axum router
            .layer(ServiceBuilder::new()
                .layer(CorsLayer::new().allow_origin(Any).allow_methods(Any))
                .catch_panic()
                .trim_trailing_slash()
                .trace_for_http()
                .set_x_request_id(RandomRequestId::default()) // UUID v4
                .propagate_x_request_id()
                .layer(sentry_tower::SentryLayer::<_, _, Request>::new(sentry_hub))
                .layer(sentry_tower::SentryHttpLayer::with_transaction())
            )

In my Sentry dashboard, I see that new exceptions and events contain the header data from each request, including x-request-id. However, the transaction field for each event is (empty string), indicating that the with_transaction() layer variant does not work as intended (?). Additionally, each event has no custom trace ID that I can see, so it does not seem to be instrumented correctly either.

I have been at this for several days and I would greatly appreciate any help or guidance you could provide.

Thanks, Sean

Swatinem commented 6 months ago

I would recommend using new_from_top as in the example from the docs: https://docs.rs/sentry/latest/sentry/integrations/tower/index.html#usage-with-tower-http, as otherwise you would run into problems with request isolation.

Otherwise, the SDK is currently only using the sentry-trace header documented here: https://develop.sentry.dev/sdk/performance/#header-sentry-trace So it will not pick up the x-request-id header.

seanpianka commented 6 months ago

Thanks for the quick response, @Swatinem! I've since switched the initial SentryLayer tower layer to use new_from_top, thanks for that tip.

However, I do not see the sentry-trace header getting propagated in any request, nor do exceptions or events get associated with a transaction. I do see that each event within an exception has a trace ID, however I am not sure how to propagate this myself to the API's frontend (website) client such that it could use these values.

Is it intended that clients generate the sentry-trace = traceid-spanid header values themselves? If so, is there a standard way to do this with the tracing library such that it uses the appropriate values automatically?

Thanks for your time!

seanpianka commented 6 months ago

Apologies for double-posting, but is this still the case? https://github.com/getsentry/sentry-rust/issues/604#issuecomment-1680079898

I am currently initializing sentry (sentry::init(..)) after I start the async runtime (tokio). The architecture of my project initializes Sentry alongside other app/infra services that are async. Could this affect my events from being included in a transaction, and would it be necessary to re-architect my project to allow for initializing Sentry from a sync context?

Thanks again for your effort, it's very appreciated!

Swatinem commented 6 months ago

I am currently initializing sentry (sentry::init(..)) after I start the async runtime (tokio).

Yes, you should switch that around. Sentry automatically inherits a per-thread Hub from the Hub initialized on the main thread. That won’t work if you start the thread pool before initializing the main Hub.

nor Sentry's tracing layer registered with my subscriber registry

As you edited your comment, I believe you have since added the sentry_tracing::layer() to your subscriber config?

I do not see the sentry-trace header getting propagated in any request Is it intended that clients generate the sentry-trace = traceid-spanid header values themselves? If so, is there a standard way to do this with the tracing library such that it uses the appropriate values automatically?

As its pretty much impossible to have automatic hooks / monkey patching in Rust on the language level, yes this needs to be done manually.

We use this snippet in combination with reqwest to get trace propagation:

if let Some(span) = sentry::configure_scope(|scope| scope.get_span()) {
    for (k, v) in span.iter_headers() {
        request = request.header(k, v);
    }
}

seanpianka commented 6 months ago

Yes, you should switch that around.

Thanks for confirming! I can report back once I've made these changes.

As you edited your comment, ...

This was an edit for clarity, but I did have a sentry_tracing::layer() configured beforehand. It's currently configured before the Axum router and tower layers are constructed.

We use this snippet in combination with reqwest to get trace propagation

Where should one configure the Sentry scope with the span details within the context of a larger [Axum] app? Within each route handler/endpoint?

I have written a [hopefully] minimal, complete example of my application with some of the corrections you've mentioned (excluding configuring the scope). Could you point out where in the example the scope should be configured?

Summary: It defines my domain types and application state, defines a function that constructs an Axum router with the tower layers, and then an entrypoint which initializes Sentry and the tracing subscriber. It accepts requests via AWS Lambda, which passes those to an Axum router, performs some application or domain logic, and transforms any response into an HTTP response.

// My TODO app's domain type
#[derive(Serialize, Deserialize, Clone)]
struct Todo {
    id: u32,
    title: String,
    completed: bool,
}

// My TODO app's state for the Axum server
#[derive(Clone)]
struct AppState {
    todos: Arc<Mutex<Vec<Todo>>>,
}

// An axum HTTP endpoint for querying stored TODOs
async fn get_todos(State(state): State<Arc<AppState>>) -> impl IntoResponse {
    let todos = state.todos.lock().await;
    Response::builder()
        .status(StatusCode::OK)
        .body(Body::from(serde_json::to_string(&todos).unwrap()))
        .unwrap()
}

fn router(state: Arc<AppState>) -> Router {
    Router::new()
        .route("/", get(get_todos))
        .layer(
            // use svc builder to ensure correct layer ordering
            ServiceBuilder::new()
                .layer(sentry_tower::SentryLayer::new_from_top())
                .layer(sentry_tower::SentryHttpLayer::with_transaction()),
        )
        .with_state(state)
}

fn main() -> Result<(), AwsLambdaHandlerError> {
    // init sentry before starting any async runtime
    let _guard = sentry::init("DSN".to_string(), sentry::ClientOptions::default());
    tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()
        .unwrap()
        .block_on(async {
            // app state init is an async task
            let state = Arc::new(AppState {
                todos: Arc::new(Mutex::new(Vec::new())),
            });

            // init tracing subscriber
            tracing::set_global_default(
                Registry::default()
                    // other layers
                    .with(sentry_tracing::layer())
            );

            // aws lambda handler func builds new router with reused state for each request
            let handler_func = move |event: Request| async move { 
                // sentry tower layers are initialized here
                let router = router(state.clone());
                let resp = router.into_service().ready().await?.call(req).await?;
                Ok(resp)
            };
            lambda_http::run(service_fn(handler_func)).await
        })
}

Hopefully this is not unnecessarily long, I tried to remove anything irrelevant to the process. Thank you for your time!

Swatinem commented 5 months ago

Ohhhhh, I think I completely misunderstood this from the very beginning 🙈

I was assuming that you wanted to propagate the trace headers to outgoing http requests you are doing internally. The snippet I posted was assuming reqwest for outgoing requests.

Now I see that you want to attach the trace headers for the response. This should be doable in the SDK, but we haven’t done that so far. I would have to check with the rest of the team how to properly handle these cases.

Usually, the "client" (Browser, or other service doing the request) creates the trace and propagates the trace id. Its not common to return it in the response.

seanpianka commented 4 months ago

Yes, you should switch that around. Sentry automatically inherits a per-thread Hub from the Hub initialized on the main thread. That won’t work if you start the thread pool before initializing the main Hub.

I've made the necessary changes so that my async application to handle this.

It starts a short-lived new_current_thread tokio runtime that is used to query (HTTP, async) for a configuration file contents
It drops the runtime and calls sentry::init from a sync context on the main thread using the configuration file (its type)
Finally the entrypoint creates a new_multi_thread runtime that is used to initialize the Axum server (using the config file/type) and the underlying application

This should hopefully eliminate any issues with the Hub initializing outside the main thread.

Now I see that you want to attach the trace headers for the response. This should be doable in the SDK, but we haven’t done that so far. I would have to check with the rest of the team how to properly handle these cases.

Is it more commonplace for the frontend client to start a Sentry transaction and generate an ID for a particular user session, forward that ID value in the API requests to the backend, and have the backend attach that ID to any of its responses?

I suppose that might explain why none of my exceptions generated by the backend are attached to any transaction -- because any Sentry transaction that my frontend creates has no way to associate by ID with any of the backend exceptions.

Swatinem commented 4 months ago

Is it more commonplace for the frontend client to start a Sentry transaction and generate an ID for a particular user session, forward that ID value in the API requests to the backend, and have the backend attach that ID to any of its responses?

I suppose that might explain why none of my exceptions generated by the backend are attached to any transaction -- because any Sentry transaction that my frontend creates has no way to associate by ID with any of the backend exceptions.

Yes, the idea is that the "user initiated" action (on the frontend) starts a transaction/span, which is then propagated downward to any services that are called. I checked with the team who have confirmed that the usecase of returning a trace-id within a response is valid, but very uncommon, and none of the other SDKs have builtin support for that.