Open rudolf opened 2 years ago
Some technical observations:
A single HTTP Agent is capable of handling multiple socket pools, one per upstream target. It also supports a `maxTotalSockets` option, which allows defining a global limit on the number of open connections.
On the other hand, undici's Pool (used by `UndiciConnection`) is bound to a single upstream target (aka a single node) and holds a pool of Client connections to that origin. Even though undici's Pool allows limiting the maximum number of connections, AFAICT this limit cannot be set globally across all nodes at the undici level.
Thus, if we want to define a global limit on open sockets: with `HttpConnection` it would be enough to share a single Agent and set its `maxTotalSockets` option, but with `UndiciConnection` there is no equivalent, since undici's Pool only exposes a per-origin `connections` parameter.
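For illustration, a rough sketch of the difference (the node URLs are made up):

```ts
import { Agent } from 'node:http'
import { Pool } from 'undici'

// One Node.js Agent serves every upstream target: it keeps a socket pool per
// origin, and maxTotalSockets caps open sockets across all origins combined.
const sharedAgent = new Agent({ keepAlive: true, maxTotalSockets: 50 })

// An undici Pool is bound to a single origin; `connections` caps sockets for
// that origin only, so three nodes means three independent caps of 50 each.
const pools = ['http://es1:9200', 'http://es2:9200', 'http://es3:9200'].map(
  (origin) => new Pool(origin, { connections: 50 })
)
```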
This issue is stale because it has been open 90 days with no activity. Remove the stale label, or leave a comment, or this will be closed in 14 days.
This is blocking Kibana from adopting the undici transport https://github.com/elastic/kibana/issues/116087
Good to know. :+1: I'll do my best to prioritize it for 8.16.
Some notes so far:
I looked at Undici's `BalancedPool`, and we'd definitely need to contribute an upstream change there to support max socket enforcement. The other challenge is that, because Undici is leveraged at the `Connection` layer in the transport, rather than the `ConnectionPool` layer, we don't have a straightforward way to share a `BalancedPool` across `Connection` instances. Using `agent` in a creative way could work there, but I need to think about it more.
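For reference, a minimal sketch of what `BalancedPool` gives us today, assuming its options are forwarded to one underlying `Pool` per upstream (node URLs made up):

```ts
import { BalancedPool } from 'undici'

// BalancedPool load-balances across several upstreams, but the `connections`
// limit is enforced by each underlying Pool per upstream, so the effective
// ceiling here is roughly 10 × 3 sockets rather than a global cap of 10.
const balanced = new BalancedPool(
  ['http://es1:9200', 'http://es2:9200', 'http://es3:9200'],
  { connections: 10 }
)

const { statusCode, body } = await balanced.request({ path: '/', method: 'GET' })
await body.text()
console.log(statusCode)
```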
I also tried to see if it could be enforced at the `ConnectionPool` layer, by creating a `ConnectionPool` class that, when `Connection` is an `UndiciConnection`, keeps track of `PoolStats` across all connections. The main catch there is that `ConnectionPool.getConnection` is a synchronous method, so there's no way to defer execution when there are no available sockets. Technically that is a transport implementation detail, so I could update all `ConnectionPool` classes to make `getConnection` async, but that could also break the contract for users who have written their own `ConnectionPool` classes, unless I allow the function to return either a value or a `Promise`.
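Something like this is what I mean by widening the return type; a hypothetical shape for illustration, not the transport's actual `ConnectionPool` contract:

```ts
// Hypothetical interfaces for illustration only.
interface Connection {
  url: URL
}

abstract class BaseConnectionPool {
  // Existing synchronous pools keep returning a Connection; an undici-aware
  // pool could return a Promise that resolves once a socket frees up.
  abstract getConnection(): Connection | Promise<Connection>
}

// Callers can treat both shapes uniformly: awaiting a plain value is a no-op.
async function pickConnection(pool: BaseConnectionPool): Promise<Connection> {
  return await pool.getConnection()
}
```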
So, I have some things to think about here. The work will continue!
I spent the day building a `ConnectionPool` class that is largely a wrapper around Undici's `BalancedPool`. It seems to have some potential, but will require a lot of greenfield development of new `Connection` and `ConnectionPool` classes. If we want to make `BalancedPool` available to all three of the `ConnectionPool` strategies (weighted, cloud, cluster), it will require a pretty significant refactor.
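Roughly the direction I was exploring, sketched with a made-up class rather than the real `ConnectionPool`/`Connection` contracts:

```ts
import { BalancedPool } from 'undici'

// Hypothetical wrapper: the pool owns a single BalancedPool, nodes are added
// and removed as upstreams, and every request dispatches through the shared
// pool instead of per-node Client connections.
class BalancedPoolConnectionPool {
  private pool: BalancedPool

  constructor(nodes: string[], connectionsPerNode: number) {
    this.pool = new BalancedPool(nodes, { connections: connectionsPerNode })
  }

  addNode(origin: string): void {
    this.pool.addUpstream(origin)
  }

  removeNode(origin: string): void {
    this.pool.removeUpstream(origin)
  }

  async request(path: string): Promise<number> {
    const { statusCode, body } = await this.pool.request({ path, method: 'GET' })
    await body.text() // drain the body so the socket can be reused
    return statusCode
  }

  close(): Promise<void> {
    return this.pool.close()
  }
}
```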
I need to move on to some other priorities for a bit, but this is something I want to revisit soon and see if there are other possible implementations I'm overlooking.
Undici has an open issue to support configuring max sockets. I'm going to see if I can make a contribution there to get that functionality included.
🚀 Feature Proposal
It should be possible to configure the client to control the max open sockets and max idle sockets across all nodes. In addition, the client should expose diagnostic information about all the open/idle sockets for observability.
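To make the request concrete, a purely hypothetical shape for the options and diagnostics; none of these names exist in the client today:

```ts
// Hypothetical configuration: global socket limits across all nodes.
interface ProposedSocketOptions {
  maxTotalSockets: number // hard cap on open sockets across all nodes
  maxIdleSockets: number  // cap on idle (keep-alive) sockets across all nodes
}

// Hypothetical diagnostics for observability of open/idle sockets.
interface ProposedSocketStats {
  openSockets: number
  idleSockets: number
  perNode: Record<string, { open: number; idle: number }>
}
```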
Motivation
We want to have more control over the amount of open sockets Kibana creates when using elasticsearch-js.
It seems like by default elasticsearch-js will create an agent for each node:
While it's possible to specify the option `agent: () => http.Agent`, which can return a single agent instance for use with `HttpConnection`, it doesn't seem possible to use a single Undici pool for all nodes. As a result, it's not possible to configure the maximum open and idle sockets across all connections/nodes in a way that's compatible with both `HttpConnection` and `UndiciConnection`.
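For example, something along these lines (exact imports and option names may vary by client version, and the node URLs are placeholders):

```ts
import { Agent } from 'node:http'
import { Client } from '@elastic/elasticsearch'
import { HttpConnection } from '@elastic/transport'

// Sharing one Agent across all nodes makes maxTotalSockets a global cap,
// but this only works with HttpConnection, not UndiciConnection.
const sharedAgent = new Agent({ keepAlive: true, maxTotalSockets: 256 })

const client = new Client({
  nodes: ['http://es1:9200', 'http://es2:9200'],
  Connection: HttpConnection,
  agent: () => sharedAgent,
})
```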
We seem to be doing round-robin load balancing across all nodes: https://github.com/elastic/elastic-transport-js/blob/81316b1e0d01fadf7ada678bb440af56c6f74f4d/src/Transport.ts#L236
~But because nodes don't share a connection pool, it seems to diminish the value of the WeightedPool. If a Node goes down the client will still choose that Node in a round-robin fashion, sending 1/N requests to the dead node.~ WeightedPool ignores the `selector` parameter to `getConnection`.