tower-rs / tower

async fn(Request) -> Result<Response, Error>
https://docs.rs/tower
MIT License

experiment with permit based service framework #757

Open hlbarber opened 11 months ago

hlbarber commented 11 months ago

Overview

burger is an experimental permit based service API.

pub trait Service<Request> {
    type Response;
    type Permit<'a>
    where
        Self: 'a;

    async fn acquire(&self) -> Self::Permit<'_>;

    async fn call(permit: Self::Permit<'_>, request: Request) -> Self::Response;
}

I published a few example implementations in this repository. The API is also possible with only GATs, without async fn in traits (migration here), but that requires a lot of pin projection madness and extra GATs.

The purpose of the issue is to collect critiques to inform the design of tower.

Motivating Questions

Why use permits?

Permits allow you to disarm a service after it's ready and can be used to enforce a tighter service contract.

Why doesn't call accept &self?

The readiness of one service does not ensure the readiness of a different service of the same type - we want to disallow sharing of permits. There are three options here:

  1. Pass the innards required for call from &self to the permit.
  2. Use some sort of branding. This adds a lot of complexity.
  3. Ignore the problem - service authors can implement runtime checks to prevent sharing if they really care.

Choosing 1 is safe and less obscure than 2.
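As a rough illustration of option 1, here is a minimal sketch in which the permit borrows the state that call needs from &self. Greeter, GreeterPermit, greet and the busy-polling block_on helper are hypothetical names made up for this example, not part of burger or tower, and the sketch assumes a compiler with stable async fn in traits (Rust 1.75 or later).

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[allow(async_fn_in_trait)]
pub trait Service<Request> {
    type Response;
    type Permit<'a>
    where
        Self: 'a;

    async fn acquire(&self) -> Self::Permit<'_>;
    async fn call(permit: Self::Permit<'_>, request: Request) -> Self::Response;
}

// Hypothetical service: prefixes every request with a greeting.
pub struct Greeter {
    greeting: String,
}

// The permit carries a borrow of the state `call` needs, so `call` never
// has to look at a (possibly different) service instance.
pub struct GreeterPermit<'a> {
    greeting: &'a str,
}

impl Service<String> for Greeter {
    type Response = String;
    type Permit<'a> = GreeterPermit<'a> where Self: 'a;

    async fn acquire(&self) -> Self::Permit<'_> {
        GreeterPermit { greeting: &self.greeting }
    }

    async fn call(permit: Self::Permit<'_>, request: String) -> Self::Response {
        format!("{}, {}!", permit.greeting, request)
    }
}

// Minimal executor for the sketch. It polls in a loop, which is only
// sensible because every future here completes without waiting.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

// Acquire a permit, then spend it on exactly one call.
pub fn greet(name: &str) -> String {
    let service = Greeter { greeting: "Hello".to_owned() };
    block_on(async {
        let permit = service.acquire().await;
        Greeter::call(permit, name.to_owned()).await
    })
}
```

Because call only sees the permit, the greeting it uses always comes from the service that issued that permit. Since call takes the permit by value, the compiler would reject a second call with the same permit.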

Why does call take ownership of the permit?

A permit should allow only one call.

Why is Service::Permit<'a> a GAT?

We want to be able to pass the innards of &self into the Self::Permit<'_> by reference. Cloning Arcs from &self into Self::Permit would hurt both performance and developer experience.

Why does fn acquire accept &self rather than &mut self?

If it accepted &mut self we would only ever be able to obtain one permit at a time.

Why async fn acquire rather than fn acquire like tower::Service::poll_ready?

Both approaches boil down to the same kind of state machines eventually. Using Future allows for easy composition with the large Futures ecosystem and with other Service::acquire calls.

Why do async fn acquire and async fn call not return a Result?

Most of the Service style combinators work without Result.

If the user wants to write a Service with a fallible async fn acquire, then they can model the permit as a Result and have call return the Err. If the user wants to write an infallible acquire and a fallible call, the signatures are no longer coupled by convention alone.
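A minimal sketch of the "permit as a Result" idea, assuming stable async fn in traits (Rust 1.75 or later). Gate, Closed, double_through and the busy-polling block_on helper are hypothetical names invented for this example.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[allow(async_fn_in_trait)]
pub trait Service<Request> {
    type Response;
    type Permit<'a>
    where
        Self: 'a;

    async fn acquire(&self) -> Self::Permit<'_>;
    async fn call(permit: Self::Permit<'_>, request: Request) -> Self::Response;
}

// Hypothetical error returned when the service cannot hand out permits.
#[derive(Debug, PartialEq)]
pub struct Closed;

// Hypothetical service that doubles numbers while it is open.
pub struct Gate {
    open: bool,
}

impl Service<u32> for Gate {
    type Response = Result<u32, Closed>;
    // The permit itself is a `Result`, so a failed acquire is carried
    // forward and surfaces as the response of `call`.
    type Permit<'a> = Result<&'a Gate, Closed> where Self: 'a;

    async fn acquire(&self) -> Self::Permit<'_> {
        if self.open { Ok(self) } else { Err(Closed) }
    }

    async fn call(permit: Self::Permit<'_>, request: u32) -> Self::Response {
        let _gate = permit?; // propagate the acquire failure, if any
        Ok(request * 2)
    }
}

// Minimal executor for the sketch. It polls in a loop, which is only
// sensible because every future here completes without waiting.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

pub fn double_through(open: bool, request: u32) -> Result<u32, Closed> {
    let gate = Gate { open };
    block_on(async {
        let permit = gate.acquire().await;
        Gate::call(permit, request).await
    })
}
```

A caller that wants to fail early can still inspect the permit before calling, since it is an ordinary Result.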

Perhaps the value of acquire returning a Result outweighs the flexibility though.

Split this into two traits?

We could split the Service trait into Acquire and Call where Call is implemented on the permit and has async fn call(self, request: Request). I have no strong opinions on this. Maybe this helps with object safety?
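One possible shape of that split, as a sketch only. The trait names Acquire and Call come from the paragraph above. Adder, AdderPermit, add_through and the busy-polling block_on helper are hypothetical, and the sketch assumes stable async fn in traits (Rust 1.75 or later).

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[allow(async_fn_in_trait)]
pub trait Acquire {
    type Permit<'a>
    where
        Self: 'a;

    async fn acquire(&self) -> Self::Permit<'_>;
}

// Implemented on the permit, which is consumed by the call.
#[allow(async_fn_in_trait)]
pub trait Call<Request> {
    type Response;

    async fn call(self, request: Request) -> Self::Response;
}

// Hypothetical service that adds a fixed offset to each request.
pub struct Adder {
    offset: i64,
}

pub struct AdderPermit<'a> {
    offset: &'a i64,
}

impl Acquire for Adder {
    type Permit<'a> = AdderPermit<'a> where Self: 'a;

    async fn acquire(&self) -> Self::Permit<'_> {
        AdderPermit { offset: &self.offset }
    }
}

impl Call<i64> for AdderPermit<'_> {
    type Response = i64;

    async fn call(self, request: i64) -> Self::Response {
        *self.offset + request
    }
}

// Minimal executor for the sketch. It polls in a loop, which is only
// sensible because every future here completes without waiting.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

pub fn add_through(offset: i64, request: i64) -> i64 {
    let adder = Adder { offset };
    block_on(async {
        let permit = adder.acquire().await;
        permit.call(request).await
    })
}
```

In this shape the request type only appears on the permit, so Acquire itself is no longer generic over Request.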

hlbarber commented 11 months ago

I wasn't aware at the time, making a note of it now - if we did split this trait in two it would become close to the suggestion by @olix0r https://github.com/tower-rs/tower/issues/626#issuecomment-1009256748.

hlbarber commented 11 months ago

Referencing related discussions:

LegNeato commented 11 months ago

How does this interact with drop and cancelation? Is it better or worse than the current model? This reminds me a lot of completion based io for some reason (https://www.ncameron.org/blog/async-io-with-completion-model-io-systems/).

hlbarber commented 11 months ago

How does this interact with drop and cancelation? Is it better or worse than the current model?

I don't think the design here addresses the lower-level problems relating to async drop if that's what you mean, but it does address the disarm problem.

I like to think about Service::acquire as a generic version of Semaphore::acquire.

Under the current contract, the Service::poll_ready documentation states:

Note that poll_ready may reserve shared resources that are consumed in a subsequent invocation of call. Thus, it is critical for implementations to not assume that call will always be invoked and to ensure that such resources are released if the service is dropped before call is invoked or the future returned by call is dropped before it is polled.

And citing OP of the disarm thread:

Currently if poll_ready returns Ready it effectively reserves something (for instance a semaphore token). This means you must be following up with call next. The only other option is to Drop the service which however is not always possible.

The implementation here solves this problem because you can release the shared resource prior to Service::call in the Drop implementation of Service::Permit. Under this approach, it's natural to hold a handle to the resource in the permit to allow access during Service::call.
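To make the disarm behaviour concrete, here is a minimal sketch. It assumes stable async fn in traits (Rust 1.75 or later) and uses a plain atomic counter as a stand-in for a real semaphore, so acquire never actually waits. Limited, LimitedPermit, disarm_demo and the busy-polling block_on helper are hypothetical names for this example.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[allow(async_fn_in_trait)]
pub trait Service<Request> {
    type Response;
    type Permit<'a>
    where
        Self: 'a;

    async fn acquire(&self) -> Self::Permit<'_>;
    async fn call(permit: Self::Permit<'_>, request: Request) -> Self::Response;
}

// Hypothetical service that tracks how many permits are outstanding.
pub struct Limited {
    in_use: AtomicUsize,
}

// The permit holds a handle to the shared resource it reserved.
pub struct LimitedPermit<'a> {
    in_use: &'a AtomicUsize,
}

impl Drop for LimitedPermit<'_> {
    // Dropping a permit releases the reservation, whether or not
    // `call` was ever invoked with it.
    fn drop(&mut self) {
        self.in_use.fetch_sub(1, Ordering::SeqCst);
    }
}

impl Service<String> for Limited {
    type Response = usize;
    type Permit<'a> = LimitedPermit<'a> where Self: 'a;

    async fn acquire(&self) -> Self::Permit<'_> {
        self.in_use.fetch_add(1, Ordering::SeqCst);
        LimitedPermit { in_use: &self.in_use }
    }

    async fn call(_permit: Self::Permit<'_>, request: String) -> Self::Response {
        // `_permit` is dropped when this future completes.
        request.len()
    }
}

// Minimal executor for the sketch. It polls in a loop, which is only
// sensible because every future here completes without waiting.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

// Returns the number of outstanding permits at three points in time.
pub fn disarm_demo() -> (usize, usize, usize) {
    let service = Limited { in_use: AtomicUsize::new(0) };
    block_on(async {
        let first = service.acquire().await;
        let second = service.acquire().await;
        let held = service.in_use.load(Ordering::SeqCst);

        drop(first); // disarm: give the reservation back without calling
        let after_disarm = service.in_use.load(Ordering::SeqCst);

        let _len = Limited::call(second, "ping".to_owned()).await;
        let after_call = service.in_use.load(Ordering::SeqCst);

        (held, after_disarm, after_call)
    })
}
```

Two permits can be held at once because acquire takes &self, and each reservation is returned either by an explicit drop or at the end of call.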

hlbarber commented 11 months ago

I've now implemented a decent percentage of the existing tower middleware and published it. Here are some obvious and subtle obstructions I've observed.

Problems common to all tower "async fn in trait" designs:

The following are specific to Service::Permit<'a> being a GAT: