tokio-rs / axum

Ergonomic and modular web framework built with Tokio, Tower, and Hyper
19.25k stars 1.06k forks source link

Provide way to handle a request with a MethodRouter without requiring an exclusive reference #3004

Open cdhowie opened 3 weeks ago

cdhowie commented 3 weeks ago

Feature Request

Motivation

When writing a Service implementation that wraps a MethodRouter, and when the wrapping service's call method needs to do some async work before forwarding the request to MethodRouter, there is no way to avoid cloning the entire MethodRouter (or putting it behind a mutex, which is almost certainly worse). This is because MethodRouter does not expose any way to handle a request without a &mut MethodRouter, and futures returned by Service::call (understandably) cannot capture &mut self.

Proposal

Adding a method to MethodRouter that routes a request with a &MethodRouter would allow the wrapping service to hold an Arc<MethodRouter>. Changing MethodRouter::call_with_state from pub(crate) to pub would resolve this, though whether this is the correct way to resolve this request is likely a question better answered by the axum maintainers.

Alternatives

Any other approach that allows me to route a request using a &MethodRouter would be totally acceptable.

yanns commented 3 weeks ago

I'm trying to understand your needs. Could you maybe provide some code?

There's an example in the tests how to route a request using a &mut MethodRouter: https://github.com/tokio-rs/axum/blob/280d16a61059f57230819a79b15aa12a263e8cca/axum/src/routing/method_routing.rs#L1609-L1631 I guess this does not help?

cdhowie commented 3 weeks ago

This has to do with writing a Service that wraps a MethodRouter and needs to do some async work before delegating to the MethodRouter.

Rough pseudocode of a Service::call implementation:

fn call(&mut self, req: Request) -> Self::Future {
    let mut method_router = self.method_router.clone();

    async move {
        let _ = something_async().await;

        method_router.call(req).await
    }
    .boxed()
}

In our particular case, we have multiple MethodRouters and need to look up one based on some content in the body (this is a legacy API; I'm not saying this is good design, but it is what it is and we can't break compatibility with older clients).

The line let mut method_router = self.method_router.clone(); is what I'm hoping to be able to avoid. In reality it's a HashMap where the values are MethodRouters. Because we need to do async stuff before we can delegate to a MethodRouter, we have to clone the entire HashMap<_, MethodRouter> structure, but we only need to do this because MethodRouter exposes no way to handle a request via a &MethodRouter.

If there was a way to do this, we could store an Arc<HashMap<_, MethodRouter>> and clone that handle instead of needlessly duplicating the entire map structure, which is quite large.

The core issue seems to be that tower's Service::call takes &mut self but the returned future (understandably) cannot capture that borrow. This means if the service cannot delegate to an inner service synchronously then some amount of state needs to be cloned in order to delegate. The combination of Service::call needing &mut self and doing async stuff before delegating makes cloning necessary.

One alternative solution, short of exposing a way to route via a &MethodRouter, would be to add an intermediate Service implementation that wraps an Arc<MethodRouter> and delegates to that. This would still be less than ideal as we'd still have to clone the HashMap as well as one Arc for each contained router.

The tl;dr is that exposing ways for routing facilities to be invoked via a shared reference will make deep clones of routing structures per request unnecessary.

yanns commented 3 weeks ago

I've tried to reproduce your example, but I'm missing some types, and guessing has not helped for now. Maybe you could provide a complete example that compiles. My last state:

use axum::extract::Request;
use axum::routing::future::InfallibleRouteFuture;
use axum::routing::MethodRouter;
use tower::Service;

pub struct MyService {
    method_router: MethodRouter,
}
impl MyService {
    fn call(&mut self, req: Request) -> InfallibleRouteFuture {
        let mut method_router = self.method_router.clone();
        async move {
            let _ = something_async().await;

            method_router.call(req).await
        }
        .boxed()
    }
}

async fn something_async() {
    // do something async
}
cdhowie commented 3 weeks ago

Here's a minimal example. The details of how a key in the map is selected isn't relevant so I've hidden that behind a trait.

use std::{
    collections::HashMap,
    convert::Infallible,
    marker::PhantomData,
    task::{Context, Poll},
};

use axum::{
    body::Body,
    extract::Request,
    http::StatusCode,
    response::{IntoResponse, Response},
    routing::MethodRouter,
};
use futures::{future::BoxFuture, FutureExt};
use tower::Service;

trait CommandFromBody {
    fn command_from_body(body: &[u8]) -> Option<&str>;
}

struct ExampleService<C> {
    routes: HashMap<String, MethodRouter>,
    _phantom_c: PhantomData<fn() -> C>,
}

impl<C> Service<Request> for ExampleService<C>
where
    C: CommandFromBody,
{
    type Error = Infallible;
    type Response = Response;
    type Future = BoxFuture<'static, Result<Response, Infallible>>;

    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(()))
    }

    fn call(&mut self, req: Request) -> Self::Future {
        let mut cmds = self.routes.clone();

        async move {
            let (parts, body) = req.into_parts();

            let Ok(bytes) = axum::body::to_bytes(body, usize::MAX).await else {
                return Ok(StatusCode::INTERNAL_SERVER_ERROR.into_response());
            };

            match C::command_from_body(&bytes).and_then(|cmd| cmds.get_mut(cmd)) {
                Some(router) => {
                    let req = Request::from_parts(parts, Body::from(bytes));

                    router.call(req).await
                }

                None => Ok(StatusCode::NOT_FOUND.into_response()),
            }
        }
        .boxed()
    }
}

The line let mut cmds = self.routes.clone(); clones the entire HashMap<String, MethodRouter> which is what I would like to avoid, but I cannot do this without a way to delegate to a MethodRouter via a shared reference. router.call(req) invokes the Service::call implementation on MethodRouter which takes &mut self; however, MethodRouter's Service implementation delegates this to a pub(crate) method on MethodRouter that takes &self. So routing is completely possible without mutable state, there's just no exposed API in axum that lets me do this.

yanns commented 3 weeks ago

thanks a lot. I've reproduced your case in the axum repo, and tried to use call_with_state to avoid the mutable reference: https://github.com/tokio-rs/axum/compare/reproduce_issue_3004?expand=1

But this still cannot work. It's not possible to say that the reference is living long enough for the future. If you want, you can play with this, and try different options. My knowledge is limited here. Maybe encapsulating the HashMap inside a Future and pinning it could help?

Personally, I'd go with the Arc.

cdhowie commented 3 weeks ago

Right, the idea is to wrap the HashMap in an Arc and clone that, but I can't do that today because there's no public way to route using a &MethodRouter. You need a &mut MethodRouter.

mladedav commented 3 weeks ago

One alternative solution, short of exposing a way to route via a &MethodRouter, would be to add an intermediate Service implementation that wraps an Arc<MethodRouter> and delegates to that. This would still be less than ideal as we'd still have to clone the HashMap as well as one Arc for each contained router.

You could then have Arc<HashMap<_, Arc<MethodRouter>>> so you'd only have to clone two Arcs if we added an impl Service<Request> for Arc<MethodRouter> which I think is what you're saying there.

But I generally agree that we can expose a method for calling the router without requiring exclusive ownership. I'm just not sure if we want to expose call_with_state where you'd have to pass in the state argument which will pretty much always be () or if it would be better to require you to first call with_state (if the state has not been provided yet, which it should have been before you started the app, not in a middleware) and only then allow you to call some form of call(&self, req: Request).

cdhowie commented 3 weeks ago

You could then have Arc<HashMap<_, Arc<MethodRouter>>> so you'd only have to clone two Arcs if we added an impl Service<Request> for Arc<MethodRouter> which I think is what you're saying there.

Yes, you are right. Not sure how I missed that.