restatedev / restate

Restate is the platform for building resilient applications that tolerate all infrastructure faults w/o the need for a PhD.

https://docs.restate.dev

Pooling #76

Open slinkydeveloper opened 1 year ago

slinkydeveloper commented 1 year ago

The goal of this issue is to implement connection pooling within the invoker. Probably depends on https://github.com/restatedev/restate/issues/96

slinkydeveloper commented 1 year ago

Requirements for the pool (ordered by priority):

1. Reuse TCP/TLS connections across invocations.
2. Always be able to provide quota for a new invocation, if necessary by opening a new TCP connection when the HTTP/2 max concurrent streams limit is exhausted. This is required to avoid logical deadlocks!
3. For use cases where no load balancing exists at L7, provide a tunable load balancing policy to decide whether to reuse an existing TCP connection or open a new one.
4. When possible, hit the endpoint that previously served the same service id, to allow state caching.
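
To make the second and third requirements concrete, here is a minimal sketch of the selection logic only, with no real networking. Everything in it (`Pool`, `Conn`, `Policy`, `acquire`, `release`) is a hypothetical name chosen for illustration, not the invoker's actual API. It assumes the pool knows the peer's max concurrent streams value and tracks active streams per connection.

```rust
/// Hypothetical stand-in for one HTTP/2 connection to an endpoint.
#[derive(Debug)]
struct Conn {
    id: usize,
    active_streams: usize,
}

/// Hypothetical tunable policy: when to reuse a connection vs. open a new one.
#[derive(Debug, Clone, Copy)]
enum Policy {
    /// Reuse existing connections until the peer's stream limit is reached.
    FillToLimit,
    /// Open a new connection once every existing one carries this many streams
    /// (useful when only L4 load balancing is available).
    SpreadAfter(usize),
}

#[derive(Debug)]
struct Pool {
    conns: Vec<Conn>,
    max_concurrent_streams: usize,
    policy: Policy,
}

impl Pool {
    fn new(max_concurrent_streams: usize, policy: Policy) -> Self {
        Pool { conns: Vec::new(), max_concurrent_streams, policy }
    }

    /// Returns the id of the connection a new invocation should use.
    /// Never blocks: if no connection has spare quota, a new one is "opened".
    fn acquire(&mut self) -> usize {
        let limit = match self.policy {
            Policy::FillToLimit => self.max_concurrent_streams,
            Policy::SpreadAfter(n) => n.min(self.max_concurrent_streams),
        };
        // Prefer the least-loaded connection that is still below the limit.
        let candidate = self
            .conns
            .iter_mut()
            .filter(|c| c.active_streams < limit)
            .min_by_key(|c| c.active_streams);
        if let Some(conn) = candidate {
            conn.active_streams += 1;
            return conn.id;
        }
        // Quota exhausted everywhere: open a new connection instead of waiting.
        let id = self.conns.len();
        self.conns.push(Conn { id, active_streams: 1 });
        id
    }

    /// Gives a stream slot back to the connection it was taken from.
    fn release(&mut self, conn_id: usize) {
        if let Some(conn) = self.conns.iter_mut().find(|c| c.id == conn_id) {
            conn.active_streams = conn.active_streams.saturating_sub(1);
        }
    }
}

fn main() {
    // With a limit of 2 streams per connection, five invocations need three connections.
    let mut pool = Pool::new(2, Policy::FillToLimit);
    let ids: Vec<usize> = (0..5).map(|_| pool.acquire()).collect();
    println!("connections chosen: {:?}", ids);
    pool.release(0);
    println!("after a release, next invocation uses connection {}", pool.acquire());
}
```

The key property is that `acquire` always returns a connection, so an invocation never waits on stream quota held by other invocations. Endpoint affinity (the fourth requirement) is deliberately left out of this sketch.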

Non-requirements:

Support HTTP/1

slinkydeveloper commented 1 year ago

Different network deployments to support:

Protocol is in request/response mode and there is an API gateway/L7 load balancer between Restate and the service. This is the case for AWS Lambda in particular.
Protocol is in bidi-stream mode and there is an API gateway/L7 load balancer between Restate and the service. Example load balancers are Envoy/Linkerd.
Vanilla Kubernetes direct pod-to-pod communication with a vanilla Service deployment. This can use either L4 load balancing with vIPs or DNS.

slinkydeveloper commented 1 year ago

Now that https://github.com/restatedev/restate/pull/237 is in place, we could implement a simple pooling strategy by sharing the hyper::Client among invocation tasks. This is still problematic with vanilla Kubernetes pods (only L4 load balancing), and it can still block when the max concurrent streams limit is reached, because no new connection is opened once the quota is exhausted. However, the suspension timeout guarantees that stream quota is eventually released by invocations that have been waiting a very long time on completions.
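
The trade-off of the shared-client approach can be illustrated with a small model that uses only the standard library. `SharedClient`, `StreamGuard`, and `try_open_stream` are hypothetical stand-ins, not hyper types. The sketch assumes a single connection with a fixed stream quota, and it returns `None` at the point where a real client would wait for a free slot.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a client shared by all invocation tasks,
/// backed by a single HTTP/2 connection with a fixed stream quota.
struct SharedClient {
    max_concurrent_streams: usize,
    active_streams: Mutex<usize>,
}

/// Holds one stream slot; dropping it returns the quota to the client.
struct StreamGuard {
    client: Arc<SharedClient>,
}

impl SharedClient {
    fn new(max_concurrent_streams: usize) -> Arc<Self> {
        Arc::new(SharedClient { max_concurrent_streams, active_streams: Mutex::new(0) })
    }

    fn active(&self) -> usize {
        *self.active_streams.lock().unwrap()
    }
}

/// Tries to open a stream. Returns `None` when the quota is exhausted,
/// which is where a real client would wait instead.
fn try_open_stream(client: &Arc<SharedClient>) -> Option<StreamGuard> {
    let mut active = client.active_streams.lock().unwrap();
    if *active < client.max_concurrent_streams {
        *active += 1;
        Some(StreamGuard { client: Arc::clone(client) })
    } else {
        None
    }
}

impl Drop for StreamGuard {
    fn drop(&mut self) {
        *self.client.active_streams.lock().unwrap() -= 1;
    }
}

fn main() {
    let client = SharedClient::new(2);

    // Four invocation tasks share the same client; each hands its guard back
    // so the stream stays open after the task's thread finishes.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let client = Arc::clone(&client);
            thread::spawn(move || try_open_stream(&client))
        })
        .collect();
    let mut guards: Vec<StreamGuard> = handles
        .into_iter()
        .filter_map(|h| h.join().unwrap())
        .collect();
    println!("streams opened: {} of 4 requested", guards.len());

    // A suspended invocation drops its stream, which frees quota for a waiter.
    guards.pop();
    println!("active after one suspension: {}", client.active());
    println!("new stream possible: {}", try_open_stream(&client).is_some());
}
```

In this model the tasks beyond the quota make no progress until an existing stream is dropped, which mirrors the role the suspension timeout plays in the comment above.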

slinkydeveloper commented 1 year ago

I've opened #293 with the simple pooling strategy that shares the hyper::Client. Let's keep this issue open though, as I think #293 is only a temporary solution, not the long-term strategy.

jackkleeman commented 2 days ago

We now have a service client per partition processor (pp), and they don't share connection pools at all. I can see benefits and drawbacks to this; I just thought it was worth noting.
Hyper-h2 respects the HTTP/2 max concurrent streams limit (which can be as low as 100-250), but the hyper legacy client we use does not handle it very intelligently (for example, by opening a new TCP connection): the single TCP connection is still chosen for every HTTP/2 request to the endpoint, and the request then just blocks waiting for a slot. We may want to consider a cleverer pooling mechanism, as inevitably someone is going to hit this limit even if it's per pp.