restatedev / restate

Restate is the platform for building resilient applications that tolerate all infrastructure faults w/o the need for a PhD.
https://docs.restate.dev

Pooling #76

Open slinkydeveloper opened 1 year ago

slinkydeveloper commented 1 year ago

The goal of this issue is to implement connection pooling within the invoker. Probably depends on https://github.com/restatedev/restate/issues/96

slinkydeveloper commented 1 year ago

Requirements for the pool (ordered by priority):

Non-requirements:

slinkydeveloper commented 1 year ago

Different network deployments to support:

slinkydeveloper commented 1 year ago

Now that https://github.com/restatedev/restate/pull/237 is in place, we could implement a simple pooling strategy by sharing the hyper::Client among invocation tasks. This is still problematic with vanilla Kubernetes pods (only L4 load balancing), and it can still block when the max-streams limit is reached (no new connections are opened once the quota is exhausted). However, the suspension timeout guarantees that, after a while, the stream quota held by invocations that have been waiting a very long time on completions is released.
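To make the blocking behaviour concrete, here is a minimal std-only sketch of one connection shared by many tasks with a fixed stream quota. It does not use hyper; `SharedConnection`, `acquire`, and `release` are hypothetical names, and the caller-side timeout is an assumption added only so the sketch cannot block forever.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::time::Duration;

/// Hypothetical stand-in for a single shared HTTP/2 connection.
/// It only tracks how many streams are in use against a fixed quota.
pub struct SharedConnection {
    max_streams: usize,
    in_use: Mutex<usize>,
    slot_freed: Condvar,
}

impl SharedConnection {
    /// Returns the connection wrapped in an `Arc` so tasks can share it.
    pub fn new(max_streams: usize) -> Arc<Self> {
        Arc::new(Self {
            max_streams,
            in_use: Mutex::new(0),
            slot_freed: Condvar::new(),
        })
    }

    /// Waits up to `timeout` for a free stream slot.
    /// Returns `false` if the quota stayed exhausted for the whole timeout;
    /// no second connection is ever opened.
    pub fn acquire(&self, timeout: Duration) -> bool {
        let guard = self.in_use.lock().unwrap();
        let max_streams = self.max_streams;
        let (mut guard, result) = self
            .slot_freed
            .wait_timeout_while(guard, timeout, |in_use| *in_use >= max_streams)
            .unwrap();
        if result.timed_out() {
            return false;
        }
        *guard += 1;
        true
    }

    /// Frees one stream slot and wakes one waiting task, if any.
    pub fn release(&self) {
        let mut guard = self.in_use.lock().unwrap();
        *guard = guard.saturating_sub(1);
        self.slot_freed.notify_one();
    }

    /// Number of streams currently held.
    pub fn in_use(&self) -> usize {
        *self.in_use.lock().unwrap()
    }
}
```

With a quota of 2, a third `acquire` fails (or, without a timeout, would block) until some other task calls `release`, which is the situation described above.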

slinkydeveloper commented 1 year ago

I've opened #293 with the simple pooling strategy that shares the hyper::Client. Let's keep this issue open though, as I think #293 is only a temporary solution, not the long-term strategy.

jackkleeman commented 2 days ago
  1. We now have a service client per pp, and they don't share connection pools at all. I can see benefits and drawbacks to this; I just thought it was worth noting.
  2. Hyper-h2 respects the HTTP/2 max concurrent streams limit (which can be as low as 100-250), but the hyper legacy client we use doesn't handle it very intelligently (i.e., it doesn't open a new TCP connection): the single TCP connection will still be chosen for every HTTP/2 request to the endpoint, and the request will then just block waiting for a slot. We may want to consider a cleverer pooling mechanism, as inevitably someone is going to hit this limit even if it's per pp.
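
One possible shape for a "cleverer" mechanism, sketched without hyper and purely as an illustration: track active streams per connection, reuse a connection that still has quota, and open a new one only when every existing connection is full. All names here (`StreamAwarePool`, `checkout`, `checkin`) and the `max_connections` cap are assumptions for the sketch, not an existing hyper or Restate API.

```rust
/// Hypothetical per-endpoint pool that only does bookkeeping:
/// each entry is the number of active streams on one open connection.
pub struct StreamAwarePool {
    max_streams_per_conn: usize,
    max_connections: usize,
    active_streams: Vec<usize>,
}

impl StreamAwarePool {
    pub fn new(max_streams_per_conn: usize, max_connections: usize) -> Self {
        Self {
            max_streams_per_conn,
            max_connections,
            active_streams: Vec::new(),
        }
    }

    /// Picks a connection id for a new stream.
    /// Prefers the least-loaded connection with spare quota, opens a new
    /// connection if all are full, and returns `None` only when both the
    /// stream quota and the connection cap are exhausted.
    pub fn checkout(&mut self) -> Option<usize> {
        let max = self.max_streams_per_conn;
        let reusable = self
            .active_streams
            .iter()
            .enumerate()
            .filter(|(_, n)| **n < max)
            .min_by_key(|(_, n)| **n)
            .map(|(id, _)| id);

        let id = match reusable {
            Some(id) => id,
            None if self.active_streams.len() < self.max_connections => {
                // Every open connection is at its quota: open another one.
                self.active_streams.push(0);
                self.active_streams.len() - 1
            }
            None => return None,
        };
        self.active_streams[id] += 1;
        Some(id)
    }

    /// Marks one stream on connection `id` as finished.
    pub fn checkin(&mut self, id: usize) {
        if let Some(n) = self.active_streams.get_mut(id) {
            *n = n.saturating_sub(1);
        }
    }

    /// Number of connections opened so far.
    pub fn connection_count(&self) -> usize {
        self.active_streams.len()
    }
}
```

A real implementation would also need to close idle connections and read the stream limit from the peer's settings rather than a constant; the sketch leaves both out.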