envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

perf: consider allowing upstream connections to be shared by workers #8702

Open mattklein123 opened 5 years ago

mattklein123 commented 5 years ago

The full silo model works out well in many cases, but in some cases (especially low-memory environments) it may be better to allow a connection pool to be shared across the workers. This would cut down on the number of connections used, give a higher connection pool hit rate, etc., at the expense of increased synchronization when the workers need to access the connection pool.

I think this could be implemented somewhat cleanly by allowing for a connection pool implementation that is shared across workers and uses its own processing thread.
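
A minimal sketch of that shape, using hypothetical names rather than Envoy's actual interfaces: the pool owns a task queue drained by its own thread, and workers only ever post acquisition requests to it, so all pool bookkeeping stays single-threaded.

```cpp
// Hypothetical sketch (not Envoy's actual API): a connection pool that lives on
// its own thread and serializes all pool bookkeeping through a task queue.
#include <condition_variable>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

struct UpstreamConnection {};  // placeholder for a real pooled connection

class SharedConnPool {
public:
  using Lease = std::shared_ptr<UpstreamConnection>;
  using Callback = std::function<void(Lease)>;

  SharedConnPool() : thread_([this] { run(); }) {}
  ~SharedConnPool() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      done_ = true;
    }
    cv_.notify_one();
    thread_.join();
  }

  // Called from any worker thread. The callback fires on the pool thread; a
  // real implementation would post it back to the requesting worker's
  // dispatcher instead.
  void acquire(Callback cb) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      tasks_.push_back(std::move(cb));
    }
    cv_.notify_one();
  }

private:
  void run() {
    for (;;) {
      Callback cb;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
        if (done_ && tasks_.empty()) return;
        cb = std::move(tasks_.front());
        tasks_.pop_front();
      }
      // Pool state (idle list, limits, pending queue) is only touched here, on
      // the pool's own thread, so it needs no further locking.
      if (idle_.empty()) {
        idle_.push_back(std::make_shared<UpstreamConnection>());
      }
      Lease lease = idle_.back();
      idle_.pop_back();
      cb(std::move(lease));  // returning the lease to the pool is elided here
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<Callback> tasks_;
  bool done_{false};
  std::vector<std::shared_ptr<UpstreamConnection>> idle_;  // pool thread only
  std::thread thread_;
};
```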

This has already come up in discussion with @HenryYYang WRT Redis, and I think @antoniovicente has also brought it up.

I'm opening this issue up for discussion as well as potentially a full design proposal if there is interest.

zyfjeff commented 5 years ago

In addition to allowing connection pools to be shared between worker threads, I think we could also consider implementing a multiplexed connection pool. Not all protocols use synchronous calls like HTTP/1.1; Dubbo and Thrift, for example, support multiplexing. To implement a multiplexed connection pool we need to abstract out an interface that lets the protocol layer associate requests with responses. The design is still quite difficult; at present we have only implemented a multiplexed connection pool for the Dubbo protocol in our company.
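
A rough sketch of the association such an interface needs, with hypothetical names (this is not the implementation mentioned above): a protocol-level request id keys a map of pending callbacks, so many in-flight requests can share one upstream connection.

```cpp
// Hypothetical sketch: the core of a multiplexed pool for protocols like
// Dubbo/Thrift is a codec hook that maps a protocol-level request id back to
// the pending caller.
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

class MultiplexedConnection {
public:
  using ResponseCallback = std::function<void(const std::string& payload)>;

  // Send a request over the shared connection; the protocol must carry `id`
  // in its frame header so the response can be matched back.
  void sendRequest(const std::string& payload, ResponseCallback cb) {
    const uint64_t id = next_id_++;
    pending_.emplace(id, std::move(cb));
    writeFrame(id, payload);
  }

  // Called by the codec when a complete response frame arrives.
  void onResponseFrame(uint64_t id, const std::string& payload) {
    auto it = pending_.find(id);
    if (it == pending_.end()) {
      return;  // stale or unknown id; a real pool would reset the connection
    }
    ResponseCallback cb = std::move(it->second);
    pending_.erase(it);
    cb(payload);
  }

private:
  void writeFrame(uint64_t /*id*/, const std::string& /*payload*/) {
    // Placeholder for protocol-specific framing and the socket write.
  }

  uint64_t next_id_{1};
  std::unordered_map<uint64_t, ResponseCallback> pending_;  // in-flight requests
};
```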

mattklein123 commented 5 years ago

The design is still quite difficult; at present we have only implemented a multiplexed connection pool for the Dubbo protocol in our company.

We already multiplex HTTP/2? In general though I agree that there is opportunity to do something general here that can cover multiple protocols. Unless someone gets to it first I'm probably going to look at this once I finish UDP proxy support.

antoniovicente commented 4 years ago

Sharing an HTTP/2 upstream connection among multiple workers seems challenging due to the need to interact with the HTTP/2 connection from multiple threads. HTTP/2 does have a stream abstraction, so it may be possible to write some per-stream wrappers that mediate the cross-thread interaction between the filter chain and the actual HTTP/2 framer and sockets that operate in a different dispatcher thread. This approach may suffer from significant lock contention and complexity.
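
A minimal sketch of such a per-stream wrapper, assuming a hypothetical `PostFn` that stands in for "run this closure on that thread's dispatcher"; none of these types are Envoy's actual classes.

```cpp
// Hypothetical sketch: the worker never touches the shared HTTP/2 connection
// directly; writes are posted to the thread that owns the framer, and framer
// events are posted back to the worker.
#include <functional>
#include <string>
#include <utility>

using PostFn = std::function<void(std::function<void()>)>;

class SharedH2Connection {  // lives entirely on the framer thread
public:
  void sendOnStream(int stream_id, std::string data) {
    // Placeholder: hand `data` to the HTTP/2 framer for `stream_id`.
    (void)stream_id; (void)data;
  }
};

class StreamHandle {
public:
  StreamHandle(int stream_id, SharedH2Connection* conn,
               PostFn post_to_framer, PostFn post_to_worker)
      : stream_id_(stream_id), conn_(conn),
        post_to_framer_(std::move(post_to_framer)),
        post_to_worker_(std::move(post_to_worker)) {}

  // Called on the worker thread; the actual write happens on the framer thread.
  void encodeData(std::string data) {
    post_to_framer_([conn = conn_, id = stream_id_, d = std::move(data)]() mutable {
      conn->sendOnStream(id, std::move(d));
    });
  }

  // Called on the framer thread when response data arrives for this stream;
  // the filter chain callback runs back on the worker thread.
  void onData(std::string data, std::function<void(std::string)> worker_cb) {
    post_to_worker_([cb = std::move(worker_cb), d = std::move(data)]() mutable {
      cb(std::move(d));
    });
  }

private:
  int stream_id_;
  SharedH2Connection* conn_;
  PostFn post_to_framer_;
  PostFn post_to_worker_;
};
```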

For simpler protocols like HTTP/1.1 that have no multiplexing and no cross request state, it is possible to detach the socket objects from the upstream connection and transfer the sockets from one worker thread to another. See #8813 for a prototype.
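
A simplified sketch of that handoff, in the spirit of the prototype but not taken from it; the types and the `PostFn` helper are hypothetical.

```cpp
// Hypothetical sketch: for HTTP/1.1 there is no cross-request state, so the
// raw socket can be released from an idle connection on one worker and
// re-wrapped into a fresh connection object on the worker that needs it.
#include <functional>
#include <utility>

using PostFn = std::function<void(std::function<void()>)>;

struct IdleConnection {
  int fd{-1};
  // Detach the OS socket so this connection object can be destroyed on its own
  // thread without closing the underlying TCP connection.
  int releaseSocket() { return std::exchange(fd, -1); }
};

// Runs on the donor worker: unregister the fd from this worker's event loop,
// then hand it to the receiving worker's dispatcher.
void transferIdleSocket(IdleConnection& idle, PostFn post_to_other_worker) {
  const int fd = idle.releaseSocket();
  post_to_other_worker([fd] {
    // Runs on the receiving worker: wrap `fd` in a new upstream connection,
    // register it with this worker's event loop, and add it to the local pool.
    (void)fd;
  });
}
```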

stevenzzzz commented 3 years ago

/cc stevenzzzz

stevenzzzz commented 3 years ago

cc @stevenzzzz cc @lambdai

lambdai commented 3 years ago

Sharing connections among threads introduces lock contention for sure.

I have an alternative approach. Suppose the cluster has N endpoints and X workers. What if we shard the endpoints among the workers?

Each worker load balances upstream requests across roughly N / X endpoints. This could be done either by introducing a new load balancer, or as simply as maintaining a per-worker view of endpoints in the worker's ThreadLocalCluster.
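
A toy sketch of what that per-worker view could look like (hypothetical, index-based sharding):

```cpp
// Hypothetical sketch: each worker computes its own slice of the endpoint set
// and only load balances across that slice, so connection pools stay
// worker-local but each endpoint is owned by roughly one worker.
#include <cstddef>
#include <string>
#include <vector>

// Build worker `worker_index`'s view: endpoint i goes to worker
// (i % worker_count), giving ~N/X endpoints per worker. A real version would
// want stable assignment under endpoint churn (e.g. hashing endpoint
// addresses) rather than positional indexing.
std::vector<std::string> endpointsForWorker(const std::vector<std::string>& all_endpoints,
                                            size_t worker_index, size_t worker_count) {
  std::vector<std::string> view;
  for (size_t i = 0; i < all_endpoints.size(); ++i) {
    if (i % worker_count == worker_index) {
      view.push_back(all_endpoints[i]);
    }
  }
  return view;
}
```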

CC @antoniovicente @mattklein123

oschaaf commented 3 years ago

I wonder if migrating live connections across threads is something that could break extensions. E.g. here's a PoC extension that comes back from a foreign thread and posts something to a dispatcher to request a continuation of transaction processing. https://github.com/apache/incubator-pagespeed-mod/blob/master/pagespeed/envoy/envoy_base_fetch.cc#L64

mattklein123 commented 3 years ago

Each worker load balances upstream requests across roughly N / X endpoints. This could be done either by introducing a new load balancer, or as simply as maintaining a per-worker view of endpoints in the worker's ThreadLocalCluster.

Yeah, I agree this could work (internal subsetting), though IMO it's a bit different from what this issue tracks, and I would recommend that we open a different issue for that. I think this issue is still worth considering as I think the lock contention may be worth it in exchange for much reduced memory usage in certain cases.

antoniovicente commented 3 years ago

The contention involved should be minimal, since the cross-thread communication should only happen when the connection is attached to a request and when it is returned to the pool. All the writes and event handling would happen on the thread handling the request.
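
A minimal sketch of that contention profile, with hypothetical types: the only locked operations are checkout and checkin of idle connections, and a checked-out connection is used by exactly one worker at a time.

```cpp
// Hypothetical sketch: the shared structure is just a checkout/checkin list of
// idle connections, so the mutex is held briefly at attach and return time.
#include <memory>
#include <mutex>
#include <vector>

struct UpstreamConnection {};  // placeholder; used lock-free by one worker at a time

class GlobalIdlePool {
public:
  // Called when a worker needs a connection; returns nullptr if none is idle
  // (the caller then establishes its own connection, as today).
  std::unique_ptr<UpstreamConnection> checkout() {
    std::lock_guard<std::mutex> lock(mu_);
    if (idle_.empty()) return nullptr;
    auto conn = std::move(idle_.back());
    idle_.pop_back();
    return conn;
  }

  // Called when a worker finishes a request and has no further use for the
  // connection; ownership moves back to the shared pool.
  void checkin(std::unique_ptr<UpstreamConnection> conn) {
    std::lock_guard<std::mutex> lock(mu_);
    idle_.push_back(std::move(conn));
  }

private:
  std::mutex mu_;
  std::vector<std::unique_ptr<UpstreamConnection>> idle_;
};
```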

Of course, this mostly makes sense for medium QPS services. For high QPS services, there is enough traffic to justify separate connections per thread. For low QPS services, you probably don't have a connection to the upstream on any worker thread, so sharing the connections doesn't help if connections don't exist.

Subsetting belongs in a different issue. Subsetting can't really address the needs of medium QPS services with few endpoints behind them.

jnt0r commented 3 months ago

Is there anything new on this? Would love to see this in Envoy, as it could drastically decrease memory usage in our Envoys.

wdauchy commented 1 month ago

Sharing an HTTP/2 upstream connection among multiple workers seems challenging due to the need to interact with the HTTP/2 connection from multiple threads. HTTP/2 does have a stream abstraction, so it may be possible to write some per-stream wrappers that mediate the cross-thread interaction between the filter chain and the actual HTTP/2 framer and sockets that operate in a different dispatcher thread. This approach may suffer from significant lock contention and complexity.

For simpler protocols like HTTP/1.1 that have no multiplexing and no cross request state, it is possible to detach the socket objects from the upstream connection and transfer the sockets from one worker thread to another. See #8813 for a prototype.

There could be an intermediary solution to avoid too much locking: each thread continues to maintain its own pool of connections, but gives them back to a global pool after a period of time when it does not need them. That way, before creating a connection, a thread would first look into the global pool and take ownership of an existing connection instead of creating a new one. Once done, the connection would be given back to the global pool. This would be especially useful for upstreams that are not heavily used, in particular on large fleets of Envoys with many threads. The workflow I have in mind for each worker would be: