tokio-rs / tokio-service

The core `Service` trait in Tokio and support

Proposal: Back pressure handling + take &mut self #11

Closed carllerche closed 7 years ago

carllerche commented 7 years ago

This PR represents my proposal for changes to Service.

Service::call is updated to follow Sink for handling back pressure. I also changed Service::call to take &mut self instead of &self.

/cc @aturon @alexcrichton @seanmonstar @danburkert

tailhook commented 7 years ago

Still no polling interface? So basically, for client services, you assume the service holds a channel, right? So you can't make a retry-after-interval middleware. There is still no way to be notified that this connection will never be ready.

For the server use case it might work, but it doesn't help with refusing to accept sockets under high load. It only prevents too many requests from piling up on this specific connection, which is already easily solved by looking at how many futures are currently in flight.

Is there an example of how it should be used?

carllerche commented 7 years ago

@tailhook

The goal of back pressure handling with Service (as with Sink) is to reject a pushed value when the underlying service is at capacity, returning the value to the caller, and to notify the task when the service has additional capacity again.

With the current Service trait, the implementor has two options: always accept the request and potentially buffer without bound, or reject the request by returning it in the error slot of a completed future. Returning the request in the error slot means the caller has no way to be notified when the service becomes ready again.

With the proposed changes, a Service can either accept the request or reject it and notify the caller when the service becomes ready again.
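
For reference, here is a sketch of the Sink-style shape being discussed. It is reconstructed from this thread rather than copied from the PR diff, so names may differ from the actual proposal:

use futures::Future;

pub enum AsyncService<F, R> {
    /// The request was accepted; the returned future drives the transaction.
    Ready(F),
    /// The service is at capacity; the request is handed back and the
    /// current task will be notified when the service is ready again.
    NotReady(R),
}

pub trait Service {
    type Request;
    type Response;
    type Error;
    type Future: Future<Item = Self::Response, Error = Self::Error>;

    fn call(&mut self, req: Self::Request) -> AsyncService<Self::Future, Self::Request>;
}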

There is no polling interface; however, there is nothing preventing you from having Service + Future so that the value includes a poll fn. The thing to realize is that call already returns a Future whose poll advances its own request / response transaction. You can today implement a middleware that retries after an interval by having the response future drive the retry (see the sketch below), so there has never been an issue there.
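
A rough sketch of such a middleware, with the response future driving the retries. All names here are illustrative: Sleep is a stub standing in for a real timer future, and the inner service and request are assumed to be Clone so the call can be re-issued.

use futures::{Async, Future, Poll};

// The pre-proposal trait shape: call returns the response future directly.
pub trait Service {
    type Request;
    type Response;
    type Error;
    type Future: Future<Item = Self::Response, Error = Self::Error>;

    fn call(&mut self, req: Self::Request) -> Self::Future;
}

// Stub timer future; a real one would come from a timer crate.
pub struct Sleep;

impl Future for Sleep {
    type Item = ();
    type Error = ();

    fn poll(&mut self) -> Poll<(), ()> {
        Ok(Async::Ready(())) // stub: "fires" immediately
    }
}

pub struct Retry<S> {
    inner: S,
    attempts: usize,
}

enum State<F> {
    // A response future from the inner service is in flight.
    Dispatched(F),
    // The last attempt failed; waiting out the retry interval.
    Sleeping(Sleep),
}

pub struct RetryFuture<S: Service> {
    service: S,          // clone of the inner service, used to re-dispatch
    request: S::Request, // retained so the call can be re-issued
    remaining: usize,
    state: State<S::Future>,
}

impl<S> Service for Retry<S>
where
    S: Service + Clone,
    S::Request: Clone,
{
    type Request = S::Request;
    type Response = S::Response;
    type Error = S::Error;
    type Future = RetryFuture<S>;

    fn call(&mut self, req: S::Request) -> RetryFuture<S> {
        let mut service = self.inner.clone();
        let state = State::Dispatched(service.call(req.clone()));
        RetryFuture {
            service: service,
            request: req,
            remaining: self.attempts,
            state: state,
        }
    }
}

impl<S> Future for RetryFuture<S>
where
    S: Service,
    S::Request: Clone,
{
    type Item = S::Response;
    type Error = S::Error;

    fn poll(&mut self) -> Poll<S::Response, S::Error> {
        loop {
            let next = match self.state {
                State::Dispatched(ref mut fut) => match fut.poll() {
                    Ok(result) => return Ok(result),
                    Err(e) => {
                        if self.remaining == 0 {
                            return Err(e); // retries exhausted
                        }
                        self.remaining -= 1;
                        State::Sleeping(Sleep) // would be a real timer here
                    }
                },
                State::Sleeping(ref mut sleep) => match sleep.poll() {
                    // Interval elapsed: re-issue the original request.
                    Ok(Async::Ready(())) => {
                        State::Dispatched(self.service.call(self.request.clone()))
                    }
                    Ok(Async::NotReady) => return Ok(Async::NotReady),
                    Err(()) => unreachable!("stub timer never fails"),
                },
            };
            self.state = next;
        }
    }
}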

Refusing to accept sockets under high load, beyond enforcing a maximum number of "open" connections, really depends on application-specific logic. That logic can be built on top of the proposed trait.

As for:

There is still no way to be notified that this connection will never be ready.

This can be done today with a separate trait, maybe PollClosed, and we can judge that API separately and merge it in as part of 0.2. So I am punting on that specific problem for the official 0.1 release.
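
For illustration only, such a separate trait might look something like this (the shape is hypothetical; nothing here is specified by the PR):

use futures::Async;

pub trait PollClosed {
    /// Returns Async::Ready(()) once the service will never be able to
    /// accept another request (e.g. the connection has gone away).
    fn poll_closed(&mut self) -> Async<()>;
}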

danburkert commented 7 years ago

Looks great to me. One additional thing to consider is adding the equivalent of the Sink::send/Sink::start_send split to this interface, e.g.:

use futures::{Async, Future, Poll};

// Assumes the AsyncService enum from the PR (sketched above) is in scope.
pub trait Service {
    type Request;
    type Response;
    type Error;
    type Future: Future<Item = Self::Response, Error = Self::Error>;

    fn try_call(&mut self, req: Self::Request) -> AsyncService<Self::Future, Self::Request>;

    fn call(self, req: Self::Request) -> Call<Self> where Self: Sized {
        Call {
            service: Some(self),
            request: Some(req),
        }
    }
}

#[must_use = "futures do nothing unless polled"]
pub struct Call<S> where S: Service {
    service: Option<S>,
    request: Option<S::Request>,
}

impl<S> Future for Call<S> where S: Service {
    type Item = (S, S::Future);
    type Error = S::Error;

    fn poll(&mut self) -> Poll<(S, S::Future), S::Error> {
        let request = self.take_request();
        match self.service_mut().try_call(request) {
            // Accepted: yield the service back along with the response future.
            AsyncService::Ready(response) => Ok(Async::Ready((self.take_service(), response))),
            // At capacity: stash the request and wait to be polled again.
            AsyncService::NotReady(request) => {
                self.request = Some(request);
                Ok(Async::NotReady)
            },
        }
    }
}

impl<S> Call<S> where S: Service {
    pub fn get_ref(&self) -> &S {
        self.service.as_ref().expect("Attempted Call::get_ref after completion")
    }

    pub fn get_mut(&mut self) -> &mut S {
        self.service.as_mut().expect("Attempted Call::get_mut after completion")
    }

    fn service_mut(&mut self) -> &mut S {
        self.service.as_mut().expect("Attempted to poll Call after completion")
    }

    fn take_service(&mut self) -> S {
        self.service.take().expect("Attempted to poll Call after completion")
    }

    fn take_request(&mut self) -> S::Request {
        self.request.take().expect("Attempted to poll Call after completion")
    }
}

Most or all of the other combinators on Sink have worthwhile equivalents for Service, but I think the one above is the most important, because it may change how you name the current call function (I don't feel strongly about the names I'm suggesting, fwiw).
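
For illustration, a call site under this proposal might look like the following (client and request are hypothetical values; the chaining relies on the standard Future combinators):

// Wait for capacity, then drive the response future, getting the
// service back for reuse once the request has been dispatched.
let response = client.call(request).and_then(|(client, response_future)| {
    response_future.map(move |response| (client, response))
});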

aturon commented 7 years ago

@tailhook

Still no polling interface? So basically, for client services, you assume the service holds a channel, right? So you can't make a retry-after-interval middleware.

Can you clarify what you mean by a "polling interface"?

Note that the PR makes Service resemble Sink somewhat more, but does not include poll_complete, because completion takes place through the returned future.

There is still no way to be notified that this connection will never be ready.

As @carllerche says, it seems plausible to have a separate polling method here, but also prudent to do this first with inherent methods, and only add it to the core trait once we're confident.

For the server use case it might work, but it doesn't help with refusing to accept sockets under high load. It only prevents too many requests from piling up on this specific connection, which is already easily solved by looking at how many futures are currently in flight.

Backpressure at the connection-accepting level should go through NewService rather than Service. We could consider making a similar change to that trait.
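
One hypothetical reading of such a change, reusing the Service and AsyncService shapes sketched earlier (purely illustrative; nothing like this is specified in the thread):

pub trait NewService {
    type Request;
    type Response;
    type Error;
    type Instance: Service<Request = Self::Request,
                           Response = Self::Response,
                           Error = Self::Error>;

    /// Returns NotReady(()) while no new connection should be accepted;
    /// the task is notified once there is capacity again.
    fn new_service(&mut self) -> AsyncService<Self::Instance, ()>;
}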

Is there an example of how it should be used?

The most immediate goal is to remove the need for unbounded queues in the client implementation, by instead propagating queue fullness as Service backpressure.
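
For concreteness, here is a minimal sketch of that idea, using the Service and AsyncService shapes from earlier. The request/response types are illustrative, not the actual client code:

use futures::sync::{mpsc, oneshot};
use futures::{AsyncSink, Future, Poll, Sink};

pub struct Request;
pub struct Response;

pub struct Client {
    // Bounded queue to the connection task; fullness becomes backpressure.
    tx: mpsc::Sender<(Request, oneshot::Sender<Response>)>,
}

pub struct ResponseFuture(oneshot::Receiver<Response>);

impl Future for ResponseFuture {
    type Item = Response;
    type Error = ();

    fn poll(&mut self) -> Poll<Response, ()> {
        // The sender being dropped means the connection task died.
        self.0.poll().map_err(|_| ())
    }
}

impl Service for Client {
    type Request = Request;
    type Response = Response;
    type Error = ();
    type Future = ResponseFuture;

    fn call(&mut self, req: Request) -> AsyncService<ResponseFuture, Request> {
        let (tx, rx) = oneshot::channel();
        match self.tx.start_send((req, tx)) {
            // The queue had room: hand back a future for the response.
            Ok(AsyncSink::Ready) => AsyncService::Ready(ResponseFuture(rx)),
            // The queue is full: return the request; the current task
            // is parked and will be woken when capacity frees up.
            Ok(AsyncSink::NotReady((req, _))) => AsyncService::NotReady(req),
            // The receiver is gone, i.e. the connection is closed. A
            // real client would surface an error; this sketch just
            // hands the request back.
            Err(err) => AsyncService::NotReady(err.into_inner().0),
        }
    }
}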

I'm out of time for this comment right now, but I'd very much like to work through a larger example of backpressure working through layers of a server.


As an aside, my biggest concerns about this change are:

But all of these are pretty vague fears, so I'm OK moving forward; it seems like a step in the right direction in any case.

tailhook commented 7 years ago

Can you clarify what you mean by a "polling interface"?

I mean something similar to poll() or poll_complete().

Note that the PR makes Service resemble Sink somewhat more, but does not include poll_complete, because completion takes place through the returned future.

Yes. But there are usually two strategies of implementing something like a service:

  1. Implement it on a single (say TCP) connection. Then it's hard to have a good strategy for keeping the connection making progress by reacting to futures being polled, because futures might be polled in a different order than the responses actually arrive. Basically, this means any future needs to be able to parse the protocol, which probably means duplicating work in the poll of every future in flight on this connection.

  2. Implement it for a connection or a pool that is spawned separately from the service itself. This basically means the future returned by call is a Receiver<Response>, and you send requests into a channel. Then it isn't clear what calls poll_complete on the channel. Should every future do it? That's probably wasteful. Should none of them? While that would work for a channel, it's a violation of the Sink interface (which channel implements).

Am I missing something? Are there good strategies for implementing a service in this form? Of course, all of the above concerns a client service. For the server side, this patch doesn't make sense, as you seem to confirm with:

Backpressure at the connection-accepting level should go through NewService rather than Service

Additionally, I disagree with the following:

As @carllerche says, it seems plausible to have a separate polling method here, but also prudent to do this first with inherent methods, and only add it to the core trait once we're confident.

The more I think about it, the more I think adding it later isn't a good idea. The problem is that when you have a service that already works, adding poll() later (i.e. as a default method or as + Future) just breaks it silently. I mean the service expects to be poll()ed, and users don't do it because it wasn't needed for the service before.

The most immediate goal is to remove the need for unbounded queues in the client implementation, by instead propagating queue fullness as Service backpressure.

The intent is clear. I think Sink solves it just fine. And I don't buy the symmetry argument from #8. For example, the following:

however there is nothing preventing you from having Service + Future that includes the poll fn

... clearly contradicts that symmetry.

Sorry if I'm starting to be annoying. What I'm trying to say is that this patch doesn't solve the issue nearly as well as Sink does, and continuing to patch an interface that doesn't work with small steps is not much better than deprecating it until we find a good way to do it.

This is more a topic for #8 than for this issue, so I'm trying to keep it short here. But it looks like the user-friendly interface is going to be something higher level than call() anyway (see the redis example, and the same thing in our http client). So I believe it's better to do that in an extension trait (because the same trait might be used both on an individual connection and on any kind of connection pool), and it isn't a big deal to write an extension trait on top of either service or sink. And if sink solves the problem better, then I'm going to use it at the expense of a few lines of code, like creating a oneshot future.

aturon commented 7 years ago

@tailhook I'm still struggling to understand the concerns, and I'm starting to think there may be some deeper disconnect we should work out.

Would you be open to a quick video call to try to hash this out more quickly? If so, shoot me an email.

carllerche commented 7 years ago

@tailhook Over the past few months, I've been thinking a lot about the issues you've mentioned. I am not sure the proposed Service trait will be the end solution, but I think it is a significant improvement over the current trait and is good for 0.1. We can see how the specific concerns play out as things evolve, and whether they turn out to be real.

  1. Implement it on a single (say TCP) connection. Then it's hard to have a good strategy for keeping the connection making progress by reacting to futures being polled, because futures might be polled in a different order than the responses actually arrive. Basically, this means any future needs to be able to parse the protocol, which probably means duplicating work in the poll of every future in flight on this connection.

This problem already exists even without the scenario you described: tasks can be woken spuriously, poll_complete can be called multiple times, and so on. In general, code needs to be able to short circuit quickly when there is no further work to do (see the sketch below). I have thoughts on how to make this easier, but that is out of scope for 0.1.
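
As a minimal sketch of that point (names are illustrative): poll implementations should check for remaining work up front, so spurious wakeups and repeated polls stay cheap:

use std::io;

use futures::{Async, Future, Poll};

struct Transaction {
    done: bool,
    // ... protocol state ...
}

impl Future for Transaction {
    type Item = ();
    type Error = io::Error;

    fn poll(&mut self) -> Poll<(), io::Error> {
        // Cheap early return: a redundant or spurious poll never
        // touches the protocol state again once the work is done.
        if self.done {
            return Ok(Async::Ready(()));
        }
        // ... otherwise advance the request / response transaction ...
        Ok(Async::NotReady)
    }
}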

  2. Implement it for a connection or a pool that is spawned separately from the service itself. This basically means the future returned by call is a Receiver<Response>, and you send requests into a channel. Then it isn't clear what calls poll_complete on the channel. Should every future do it? That's probably wasteful. Should none of them? While that would work for a channel, it's a violation of the Sink interface (which channel implements).

While channel does implement Sink, its poll_complete is a no-op. The easiest strategy here would be for channel to provide a combinator that "sends" a value without taking ownership of the sink itself and returns a future (completed immediately) representing that send.

However, as I've said before, there is no reason a value can't implement Service + Future, so that you have a poll fn to advance the internal state.

The problem is that when you have a service that already works, adding poll() later (i.e. as a default method or as + Future) just breaks it silently.

Adding a no-op fn would not break anything silently as far as I can tell: the value already works without a poll fn, so it should keep working as expected.
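
For concreteness, the kind of defaulted method being discussed might look like this (the name poll_service is made up for the sketch; nothing of the sort was agreed on here):

use futures::{Async, Future, Poll};

pub trait Service {
    type Request;
    type Response;
    type Error;
    type Future: Future<Item = Self::Response, Error = Self::Error>;

    fn call(&mut self, req: Self::Request) -> Self::Future;

    /// Advance any internal state. The default is a no-op, so an
    /// existing implementation that is never polled keeps working
    /// exactly as before.
    fn poll_service(&mut self) -> Poll<(), Self::Error> {
        Ok(Async::Ready(()))
    }
}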

And I don't buy the symmetry argument from #8. ... clearly contradicts that symmetry.

There is no contradiction. If a Service requires additional polling, it can still be wrapped in a middleware like Retry (which works for both clients and servers given the symmetry of the API). The returned value still needs to be polled, but the middleware doesn't care about that.

There is plenty of precedent illustrating the value of symmetric APIs. The fact that the high-level client APIs end up shimming the Service API doesn't take away from that value: the client can still build a middleware stack internally, reusing middleware components written against Service, and then shim the composition. There are plenty of other wins, as exemplified by finagle, rack, etc.

aturon commented 7 years ago

TBH, I remain pretty confused about this thread -- I can't make sense of the issues that are being raised.

As I said before, I think it'd be helpful to have a video call or some such, to try to get on the same page. However, we're also pushing hard to get 0.1 out the door, and I don't think we should block on finding the complete solution here; we should continue to iterate.

Personally, I'm happy shipping the library with just the &mut change, or with this full PR, for 0.1. Mostly, I just want to ship :)

carllerche commented 7 years ago

Ok, I'm going to merge the &mut self change and punt on AsyncService.

carllerche commented 7 years ago

Closing in favor of #12 and #13