github / twirp-rs

Twirp RPC for Rust

[Question] Request/Response Access in Handler #17

Closed nickpresta closed 3 months ago

nickpresta commented 4 months ago

Hello 👋

I'm wondering what's the recommended way to access the underlying request/response within a handler, or to be able to pass data between middleware (e.g. created with Axum's middleware::from_fn/middleware::from_fn_with_state) and the handler function itself.

In other Rust crates, like tonic, or prost-twirp, the generated service handlers have access to the underlying Request, which allows for a couple of ways to pass around per-request scoped data:

  1. Extracting details from a request header directly (e.g. extract values from request.headers() using some utility function)
  2. Adding to request.extensions_mut() and then extracting with request.extensions().get::<MyType>()

I haven't been able to emulate either of these patterns, since we don't have access to the underlying request in the handler itself (no way to extract via utility in the handler), and we can insert into request.extensions_mut() but have no way to pull it out in the handler (again, no Request is available in the handler).
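
For reference, this is roughly what the second pattern looks like in tonic (just a sketch from memory, not code from this repo; AuthToken is a made-up type): an interceptor stashes a typed value in the request's extensions, and the handler can read it back because it receives the whole tonic::Request<T>.

use tonic::{Request, Status};

#[derive(Clone)]
struct AuthToken(String);

// Interceptor: runs before every handler and inserts request-scoped data.
fn auth_interceptor(mut req: Request<()>) -> Result<Request<()>, Status> {
    let token = req
        .metadata()
        .get("authorization")
        .and_then(|v| v.to_str().ok())
        .map(str::to_owned);
    if let Some(token) = token {
        req.extensions_mut().insert(AuthToken(token));
    }
    Ok(req)
}

// Handler side: the generated signature takes Request<MakeHatRequest>, so the
// extension is available right next to the protobuf message:
//     let token = request.extensions().get::<AuthToken>().cloned();
//     let msg = request.into_inner();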

I did see that we have State(api), where api is Arc::new(HaberdasherApiServer {}), but this seems to be shared state across the router and not something that's really appropriate for request-scoped data (e.g. auth tokens, the requesting user's locale).

It looks like the generated router ends up calling:

(screenshot of the generated router's handler call omitted)

which looks like we really have no way to get access to the underlying request.

Similarly, although not as important, there doesn't seem to be a way to get access to the underlying Response (e.g. to send back custom response headers).


Is there actually a way to access request-scoped data that is extracted from the Request, either via middleware or by pulling data out of request.headers() in the handler itself? Or is this something that would need to be added (e.g. passing along either Request, or creating a Request<MyRequestType> with an into_inner() or something to grab the protobuf while having the Request available, and the same for Response<MyResponseType>)? This seems to be the approach tonic takes and it feels pretty natural.

Thanks for any context and for taking on the work of a new Twirp crate for Rust.

jorendorff commented 4 months ago

You're right, there's no support for this. I think it might be best handled with a breaking change to add a Request type as you suggest, or a separate Context argument.

Alternatively, if you're spawning a task per request anyway (as axum::serve and most servers do), write a middleware that stashes the Request in a task-local variable, and wrap that middleware around the whole Router.

I have to admit, our Twirp use cases are pretty vanilla.

nickpresta commented 4 months ago

Thanks for the confirmation.

I have a branch that I'm working on in my fork that adds a thin Request/Response wrapper around the messages in the input/output for the handler functions. I'm going to add some tests and then I'll put up the PR.

Here's an early look: https://github.com/github/twirp-rs/compare/main...nickpresta:twirp-rs:add-request-response-wrappers

If something like that seems reasonable, I should be able to put up a PR tomorrow (Wednesday, March 20). This API emulates tonic's look and feel and seems pretty natural (to me, at least).

tclem commented 4 months ago

Hey @nickpresta, yeah, this is a bit of a gap right now. If we can, I'd love for the response handler signatures to stay in the form:

async fn {}(&self, req: {}) -> Result<{}, twirp::TwirpErrorResponse>

(where {} is directly from the proto schema, and not wrapped)

I understand there are cases where you need something out of the raw http request data, but I think twirp services generally benefit from specifying that data in the proto schema and passing it in the body. I'd be in favor of something closer to Go's client/server hooks, or some sort of per-request state... 🤔 I'd have to think about it a bit more.

tclem commented 4 months ago

@nickpresta can you paste in a motivating example too? What raw http request data do you want to use in a handler that can't be implemented as middleware?

nickpresta commented 4 months ago

The few examples we have today are:

I think twirp service generally benefit from specifying that data in the proto schema and passing it in the body

I'm not opposed to this in general (e.g. don't have "special", hidden state), but looking at Twirp docs and gRPC docs, even they outline valid use cases for wanting to pass along extra metadata and having it available in the handler.

Whether these are injected into request extensions or pulled out of the headers directly in the handler doesn't make much of a practical difference, as long as it's possible to do so.

How does GitHub pass along some of this information? Do you have a shared PB Message called Metadata or something that has values for authentication, tracing details, etc and it's included in every Request/Response message?

One advantage of having these as headers, in my opinion, is that it allows intermediate systems to add additional context without needing them to decode and reencode protobufs - an intermediate proxy could add more tracing details by adding/appending to headers without even knowing we're a protobuf API at all.


If we can solve this while keeping the signature the same, that would be great, but I just don't have enough context to know how to actually accomplish it using Axum middleware (for example) or other Tower-compatible layer mechanism.

Any help here to get started would be appreciated and I can (hopefully) take things to the finish line (or just implement entirely within my app vs making library changes)

tclem commented 4 months ago

Spent a bit more time thinking about this last night. I do think we should address this; I mostly want to get the design right from the perspective of using twirp-rs. As @jorendorff said, our internal usage is pretty vanilla right now (and this library is new), but we make do with middleware and are using tokio task-local variables for the use case of forwarding metadata on to subsequent service calls. AuthZ data is passed directly, but some authN is handled as middleware and determines which routes are servable.

Here are the options I see so far:

Option 1: Request/Response wrappers

This is what you've started to sketch out. Basically we add wrapping structs to the request and response so that information beyond the rpc messages can be accessed by the handler implementations (and middleware). There's a detail to work out with the Response wrapper and errors (don't you also want TwirpErrorResponse to be wrapped?), but otherwise I think this would be functional.

impl haberdash::HaberdasherApi for HaberdasherApiServer {
    async fn make_hat(
        &self,
        req: Request<MakeHatRequest>,
    ) -> Result<Response<MakeHatResponse>, TwirpErrorResponse> {
        Ok(MakeHatResponse::default().into())
    }
}
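
To make the shape concrete, I'm imagining the wrapper as something like this (a guess at the shape, not the code in the linked branch):

// Sketch only; field names and methods here are assumptions.
pub struct Request<T> {
    message: T,
    headers: http::HeaderMap,
    extensions: http::Extensions,
}

impl<T> Request<T> {
    /// Consume the wrapper and return just the protobuf message.
    pub fn into_inner(self) -> T {
        self.message
    }

    /// Request-scoped values inserted by middleware.
    pub fn extensions(&self) -> &http::Extensions {
        &self.extensions
    }

    /// Raw http headers, for things like auth tokens or locale hints.
    pub fn headers(&self) -> &http::HeaderMap {
        &self.headers
    }
}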

Option 2: Mutable context

The idea here is closer to how Go is dealing with the problem. This context would be created per-request and could be http specific (and even http header specific?). Definitely some details to sort out, the middleware would have to be given access to context as well, etc. This is a bit less functional in design and maybe too Golang centric...

impl haberdash::HaberdasherApi for HaberdasherApiServer {
    async fn make_hat(
        &self,
        ctx: &mut Context,
        req: MakeHatRequest,
    ) -> Result<MakeHatResponse, TwirpErrorResponse> {
        Ok(MakeHatResponse::default())
    }
}

Option 3: Tokio task local variables

Lean in to task locals. Here's an example pulled from some internal code. This is for detecting a request id header in the incoming http request and forwarding that on with subsequent client http calls to other services.

tokio::task_local! {
    pub static REQUEST_ID: RequestId;
}

#[derive(Clone, Debug, Eq, PartialEq)]
pub struct RequestId(String);

impl RequestId {
    /// Generate a new RequestId.
    pub fn generate() -> Self {
        Self(Uuid::new_v4().to_string())
    }

    /// Look for a RequestId in the `req` header. If not found, generate a new one.
    pub fn from_request<B: Body>(req: &Request<B>) -> Self {
        if let Some(rid) = req
            .headers()
            .get(headers::GITHUB_REQUEST_ID)
            .and_then(|v| v.to_str().ok())
        {
            Self(rid.to_string())
        } else {
            Self::generate()
        }
    }

    /// Set the `X-GitHub-Request-ID` header on the given `reqwest::Request`. This will overwrite
    /// any existing header value. In most cases you want to use `request_id::forward` instead which
    /// won't overwrite headers and properly adds the `X-Glb-Via` header.
    pub fn set(&self, req: &mut reqwest::Request) {
        req.headers_mut().insert(
            headers::GITHUB_REQUEST_ID,
            self.0.parse().expect("failed to parse request_id"),
        );
    }
}
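
The part not shown above is how the task-local gets populated. A minimal sketch of that middleware (assuming axum 0.7-style from_fn middleware; not taken from the internal code) would be:

use axum::{extract::Request, middleware::Next, response::Response};

// Detect (or generate) the request id, then run the rest of the stack inside
// the task-local scope so handlers and outbound client calls can read REQUEST_ID.
async fn request_id_middleware(req: Request, next: Next) -> Response {
    let rid = RequestId::from_request(&req);
    REQUEST_ID.scope(rid, next.run(req)).await
}

Layered around the whole Router (e.g. router.layer(axum::middleware::from_fn(request_id_middleware))), any handler can then read the value with REQUEST_ID.with(|rid| ...) (or try_with, if it might be unset).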

What else? Are there other options in this design space?

Twirp really wants http headers to be a transport implementation detail but, as you pointed out, acknowledges that custom headers are sometimes necessary. I think the question we're trying to answer is: for twirp-rs, how should handlers read request http headers and write response headers?

nickpresta commented 4 months ago

The task-local approach isn't something I've generally seen in my (limited) experience looking at the most popular Rust web frameworks and tonic. I guess there isn't anything wrong with it, but it feels more unusual than the code you're likely to see in other web frameworks, tutorials, etc, and requires you to know a bit about tokio where you otherwise wouldn't really need to.

My initial gut reaction is that using task locals is just reinventing request extensions, which are meant to capture and make available any Send + Sync + 'static values and integrate with the various ecosystem crates like tower/tower-http/hyper/axum.

---

Is the issue with the HTTP headers specifically? If we limited the functionality to a Request/Response wrapper with Extensions, would that be fine? How the data ends up in there isn't a concern of twirp-rs; anything that integrates with request extensions (i.e. everything?) would work well, and consumers would have a familiar interface for sharing data.

Even taking Go's context approach, which seems perfectly reasonable, I would still argue that Extensions can fulfill that role instead of creating a new Context type that's specific to twirp-rs.

In the branch I have, the Request/Response wrappers have both extensions and headers but I can easily move the header extraction code into a simple axum::middleware::from_fn(thing) and insert into the request extension. Does that make the wrapper approach more palatable while leaving room for additional needs - adding on to Request/Response instead of being stuck with a single Context?

EDIT: I pushed to my branch with only Extensions and I'm able to migrate all my service code to extract from request.extensions() and added middleware that does the header extraction and pushes into request.extensions_mut() without issue, so functionally this satisfies my need.
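
For concreteness, the middleware I added is basically this shape (sketched from memory; Locale is an illustrative type, not the real one):

use axum::{extract::Request, middleware::Next, response::Response};

#[derive(Clone)]
struct Locale(String);

// Pull the header out once, up front, and stash it as a typed extension that
// the handler can later read from request extensions.
async fn extract_locale(mut req: Request, next: Next) -> Response {
    let locale = req
        .headers()
        .get("accept-language")
        .and_then(|v| v.to_str().ok())
        .map(str::to_owned);
    if let Some(locale) = locale {
        req.extensions_mut().insert(Locale(locale));
    }
    next.run(req).await
}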

In terms of having "response extensions" on the TwirpErrorResponse, it isn't an issue for us with the code I have today, at least. We have Twirp error metadata that can be used for additional information, and none of the response headers we want to set (again, based on our needs today) are valid in the error case other than timing information, which is already available in Timing and can be pulled out in our tracing middleware.

tclem commented 4 months ago

I think the Request/Response wrapping is probably fine and I like the approach of just dealing with Extensions. I'm still a little 😒 about how that makes all handlers have to deal with one additional layer of wrapping - instead of directly dealing with their domain.

Can we do a quick sketch of passing a mut &Extensions param? That would also solve the response extensions problem. It does still make all handlers have to update their function signatures, but I guess you can prefix it with _ to ignore the parameter if it's not needed...

It might not be better and I like where your current branch is going too - worth opening as a PR for sure.

nickpresta commented 4 months ago

I should be able to put up another branch tomorrow 🤞 with a new mut &Extensions parameter so it's easy to compare between them. Do you have any preference on its placement? Should we keep it similar to Go and have it as the first parameter for "context"?

tclem commented 4 months ago

Do you have any preference on its placement? Should we keep it similar to Go and have it as the first parameter for "context"?

Yeah, first param sounds good.

nickpresta commented 4 months ago

Thanks for all the back and forth on this. I didn't mention this in my original message but we're just starting to use Twirp at Shopify and we'd love to contribute to and help maintain a high-quality crate and what's published here looks really good, so thank you for sharing the code and talking through this feature ❤️


I didn't get to spend too much time on this today, but I keep running into lifetime issues in the router: calling async move { api.make_hat(extensions, req) } where extensions is now &mut Extensions (side note: we want a mutable reference to Extensions vs a mutable binding to an immutable Extensions with mut &Extensions, right?) results in an error about the async move closure outliving the router function.

I tried a bunch of things, like cloning api inside the closure, switching make_hat to take Arc<Self> instead of a reference to self, etc but couldn't figure out a way to resolve the issues.

I pushed a new commit to the same branch https://github.com/github/twirp-rs/commit/cb2f2600abc89b6306c0a5bd9f18754634431dd8 so it can be backed out if we don't want to go down this path.

Would you happen to have any bandwidth to take a look at this next week?

Thanks and have a great weekend!

tclem commented 3 months ago

Ah, yes - the combination of a &mut Extensions and async move is going to be tricky because that move takes ownership and the lifetime of that async block isn't necessarily related to the lifetime of the calling context. It might be enough to have something like an Arc<Mutex<T>> as I think the core of the problem is synchronization of the extensions. That is to say: &mut Extensions might not work here...

Let me play with it a bit, I've been out and about all weekend, but I'll try and push a little code tomorrow.

Thanks for all the back and forth on this. I didn't mention this in my original message but we're just starting to use Twirp at Shopify and we'd love to contribute to and help maintain a high-quality crate

I appreciate this and thanks for your patience as well! Love seeing this be valuable to others and definitely excited to improve the rough edges here.

tclem commented 3 months ago

OK, here's a rough sketch. I need to think through the implications of that mutex and maybe it would be nice to provide a better API for the common cases of get and insert and/or have the mutex be a bit more of an implementation detail.

nickpresta commented 3 months ago

Thanks for putting that together! I agree that it would be nice if there was a way to hide the Mutex itself and not needing specific lock() calls and error handling for them.

I wonder if it's worth splitting out extensions into extensions and extensions_mut and have Context be:

#[derive(Default)]
pub struct Context {
    pub extensions: Extensions,
    pub extensions_mut: Mutex<Extensions>,
}

The common use case, at least based on how our services currently look using tonic and its Extensions, is:

  1. Grab a bunch of stuff out of Context
  2. Run your actual business logic
  3. Potentially insert stuff into Context based on the outcome of step 2 (e.g. debugging details)

By having a single extensions in a Mutex, the handler needs to grab a lock for essentially the entire execution:

#[async_trait]
impl haberdash::HaberdasherApi for HaberdasherApiServer {
    async fn make_hat(
        &self,
        ctx: Arc<Context>,
        req: MakeHatRequest,
    ) -> Result<MakeHatResponse, TwirpErrorResponse> {
        if req.inches == 0 {
            return Err(invalid_argument("inches"));
        }

        // lock acquired
        let mut extensions = ctx.extensions.lock().expect("Couldn't lock extensions");
        let rid = if let Some(id) = extensions.get::<RequestId>() {
            id.clone()
        } else {
            RequestId("didn't find a request_id".to_string())
        };
        let other_thing = if let Some(thing) = extensions.get::<Thing>() {
            thing.clone()
        } else {
            Thing("didn't find a thing".to_string())
        };

        // I can't create a scope around rid and other_thing because I need them here
        // lock still held for however long business_logic takes
        let res = match business_logic(rid, other_thing).await {
            Ok(value) => {
                extensions.insert::<Debugging>(value.some_debugging_thing);
                // use value to create response
                MakeHatResponse { ... }
            }
            Err(err) => {
                return Err(internal(err));
            }
        };

        // lock finally able to be released when handler returns
        Ok(res)
    }
}

I don't know the impact on performance, and since extensions should only ever be accessed by the actual handler for that specific request, maybe it's fine to hold the lock the whole time, but it feels like generally good practice to hold the lock for as short a time as possible. Am I understanding correctly that there really isn't any way for concurrent access to extensions unless the handler implementation chooses to have multiple threads reading/writing to extensions (as opposed to the top-level handler code reading/writing directly)?

I don't think we can make any assumptions about Extensions being read more than written, so it doesn't seem like switching to RwLock is a better choice, even if there could be a subtle performance difference between RwLock and Mutex in this case.

As an alternative, I checked out your branch and added extensions and extensions_mut in Context so the handler looks a bit different:

#[async_trait]
impl haberdash::HaberdasherApi for HaberdasherApiServer {
    async fn make_hat(
        &self,
        ctx: Arc<Context>,
        req: MakeHatRequest,
    ) -> Result<MakeHatResponse, TwirpErrorResponse> {
        if req.inches == 0 {
            return Err(invalid_argument("inches"));
        }

        // no lock needed, only reading
        let extensions = &ctx.extensions;
        let rid = if let Some(id) = extensions.get::<RequestId>() {
            id.clone()
        } else {
            RequestId("didn't find a request_id".to_string())
        };
        let other_thing = if let Some(thing) = extensions.get::<Thing>() {
            thing.clone()
        } else {
            Thing("didn't find a thing".to_string())
        };

        let res = match business_logic(rid, other_thing) {
            Ok(value) => {
                {
                    // lock acquired, insert performed, and then released
                    ctx.extensions_mut
                        .lock()
                        .expect("Couldn't lock extensions")
                        .insert::<Debugging>(value.some_debugging_thing);
                }
                // use value to create response
                MakeHatResponse { ... }
            }
            Err(err) => {
                return Err(internal(err));
            }
        };

        Ok(res)
    }
}

This also has the benefit of not needing to deal with a Mutex at all if you never need to update the context within your handler, which feels less common than wanting to read from request context. This is basically the same idea as splitting up request/response extensions, where you read from extensions and write to extensions_mut. The implementation in server needed resp.extensions_mut().extend(ctx.extensions_mut.lock().unwrap().clone()) added to ensure the newly written data is available to middleware on the response.


I think if we were able to get away from the mutex entirely, this API would feel nice, maybe even nicer than the Request/Response wrappers, since a single Context argument can be ignored if you don't need it, whereas the wrappers always exist. But there's something about having to even think about a Mutex that makes this feel like a worse option, especially since the lifetime of the lock may have performance implications and it introduces potential new error paths that need to be handled just to use the framework.

Ultimately, I think both styles of solution work, and I'm assuming there really isn't any material performance penalty with the Mutex since ctx.extensions isn't actually shared global state. If you think this is a better API than the Request/Response wrappers for consumers and to maintain in the long run, I can support that choice 🙂

tclem commented 3 months ago

I wonder if it's worth splitting out extensions into extensions and extensions_mut and have Context be:

This is a good idea and the mutability is really only required for writing any response extensions. In my spike I extend the http response extensions with all request extensions, and that might not be the desired behavior. I like this idea of, similar to how Request and Response both have their own extension maps, giving this Context a structure like you've suggested:

#[derive(Default)]
pub struct Context {
    pub request_extensions: Extensions,
    pub response_extensions: Mutex<Extensions>,
}

This would leave it up to any middleware you might write to copy over extensions from the request to the response if that's the behavior you want.


By having a single extensions in a Mutex, the handler needs to grab a lock for essentially the entire execution:

I don't think this is actually true or the suggested way to use a mutex. Unless you need exclusive access to the extensions map, a more efficient pattern would be to take and release the lock inline, each time you need it. This would be important for using the standard sync Mutex as it's invalid to hold a lock across await points (we could switch to the tokio version which would allow that, but I don't think it's really necessary to hold the lock that long). These are also created per-request - so any contention would be from code in handlers.
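
Concretely, with the split Context above, the handler would only touch the mutex in a tight scope, something like this (just a sketch; Debugging and business_logic are placeholders):

// Reads need no lock at all with the split Context.
let rid = ctx.request_extensions.get::<RequestId>().cloned();

let value = business_logic(rid).await.map_err(internal)?;

{
    // Take the lock only for the insert...
    let mut ext = ctx
        .response_extensions
        .lock()
        .expect("Couldn't lock response extensions");
    ext.insert(Debugging(value));
} // ...and drop the guard here, before any further .await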

nickpresta commented 3 months ago

This would leave it up to any middleware you might write to copy over extensions from the request to the response if that's the behavior you want.

Just so I'm understanding correctly - we would still copy Context.response_extensions via resp.extensions_mut().extend(...) in server.rs, right? This way all middleware gets access to values inserted by the handler in response.extensions(), which can then be put into response headers, etc.

Unless you need exclusive access to the extensions map, a more efficient pattern would be to take and release the lock inline, each time you need it. This would be important for using the standard sync Mutex as it's invalid to hold a lock across await points

Good point about passing the lock across await boundaries. If the business_logic wasn't async, the lock would still need to be held open during its execution, right?

With the switch to response_extensions being the only thing in a Mutex, it's definitely possible to scope the writes to the smallest block possible so I think we're OK if we move in that direction.

With the Mutex being per-request, in the typical case where there is no contention within the handler itself, it seems like there wouldn't really be any performance penalty. Swapping to request_extensions and response_extensions seems like a nice win, especially if the handler never needs to deal with writing into response_extensions.

So I guess the question comes back to which API feels nicer to use and maintain: wrappers or Arc and (sometimes) Mutex?

tclem commented 3 months ago

🎩 to @jorendorff for consulting with me a bit on this today. There likely is a way to solve the original lifetime issues with &mut, but it would require defining our own version of the Fn trait, specialized to our handlers, and it's more work than I'd like to take on here (we didn't actually follow it all the way through, but it should be possible to line up the lifetimes properly).

@nickpresta, take a look at the current state of my branch and let me know what you think (and feel free to copy parts of that code out into your branch). I sketched out how to make all the Arc<Mutex<T>> stuff just an inner implementation detail of context and only for the response extensions.

So I guess the question comes back to which API feels nicer to use and maintain: wrappers or Arc and (sometimes) Mutex?

In my opinion, and especially if we can hide away the Arc and Mutex, it's nicer to have the rpc handlers be entirely business/domain focused -> they get a param that's the proto message you defined and they return a result that's exactly the proto message from the rpc definition. If you have the need for using http headers and other raw http data in your business logic, that's fine and easy to get to. Otherwise you can just ignore that Context param entirely.

nickpresta commented 3 months ago

Love it! Thank you both!

I confirmed with a small change that values inserted via ctx.extensions.insert() can be pushed into response headers as expected. I agree that hiding the Arc and Mutex makes this really nice to use, especially if you don't really care about context at all.

Is that branch something you're happy to merge? The additional tests you added + my example client change (selfishly added as it matches our use case) seem to cover the functionality.

I will work tomorrow on wrapping up the conversion to twirp-rs (away from gRPC and tonic) just to validate that there aren't any surprises, but I don't foresee any issues.

EDIT: Swapped over to commit 2c992a0 and updated my code and things are looking good. I have a few "extractor" functions that take &Context and pull out details using .get::<T>(), falling back to some default/None, and then my middleware for pushing into response headers looks like:

pub async fn debugging_headers_middleware(req: Request, next: Next) -> Response {
    let mut res = next.run(req).await;
    let extensions = res.extensions().clone();
    let headers = res.headers_mut();

    if let Some(admin_page_debugging_url) = extensions.get::<AdminPageDebuggingUrl>() {
        headers.insert(
            ADMIN_PAGE_DEBUGGING_URL_HEADER,
            admin_page_debugging_url.0.parse().unwrap(),
        );
    }

    res
}
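
The extractor helpers I mentioned are tiny, roughly this (assuming the Context from your branch exposes get::<T>()):

// Fall back to generating an id if middleware didn't provide one.
fn request_id(ctx: &Context) -> RequestId {
    ctx.get::<RequestId>().cloned().unwrap_or_else(RequestId::generate)
}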

:chefs-kiss:

tclem commented 3 months ago

Awesome, thanks for all the back and forth here @nickpresta ❤. I've opened https://github.com/github/twirp-rs/pull/21 for review. (I merged in your changes and squashed everything up.)