elastic / elasticsearch-js

Official Elasticsearch client library for Node.js
https://ela.st/js-client
Apache License 2.0

Configure max open sockets and max idle sockets across all nodes #1740

Open rudolf opened 2 years ago

rudolf commented 2 years ago

🚀 Feature Proposal

It should be possible to configure the client to control the max open sockets and max idle sockets across all nodes. In addition, the client should expose diagnostic information about all the open/idle sockets for observability.

Motivation

We want to have more control over the number of open sockets Kibana creates when using elasticsearch-js.

It seems like by default elasticsearch-js will create an agent for each node:

  1. Client passes nodes to pool: https://github.com/elastic/elasticsearch-js/blob/8.2/src/client.ts#L231
  2. Pool creates connection for each node: https://github.com/elastic/elastic-transport-js/blob/8.2/src/pool/BaseConnectionPool.ts#L154
  3. Each connection creates a new http agent/undici pool instance: https://github.com/elastic/elastic-transport-js/blob/8.2/src/connection/HttpConnection.ts#L83 https://github.com/elastic/elastic-transport-js/blob/8.2/src/connection/UndiciConnection.ts#L113

While it's possible to specify the option `agent: () => http.Agent`, which can return a single agent instance for use with HttpConnection, it doesn't seem possible to use a single Undici pool for all nodes.
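
For illustration, a minimal sketch of that workaround, assuming HttpConnection and the agent-as-function option described above (node URLs and limits are made up): because the same `http.Agent` instance is returned for every connection, its `maxTotalSockets` acts as a global cap; there is no equivalent for UndiciConnection.

```ts
import { Client, HttpConnection } from '@elastic/elasticsearch'
import http from 'node:http'

// One agent shared by all nodes (hypothetical URLs and limits).
const sharedAgent = new http.Agent({
  keepAlive: true,
  maxTotalSockets: 256, // global cap across all nodes served by this agent
  maxSockets: 64,       // per-node (per-origin) cap
  maxFreeSockets: 16    // idle sockets kept alive
})

const client = new Client({
  nodes: ['http://es1:9200', 'http://es2:9200', 'http://es3:9200'],
  Connection: HttpConnection,
  // Returning the same instance for every connection shares it across nodes.
  agent: () => sharedAgent
})
```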

As a result, it's not possible to configure the maximum open and idle sockets across all connections/nodes in a way that's compatible with both HttpConnection and UndiciConnection.

We seem to be doing round robin load balancing across all nodes: https://github.com/elastic/elastic-transport-js/blob/81316b1e0d01fadf7ada678bb440af56c6f74f4d/src/Transport.ts#L236

~But because nodes don't share a connection pool, it seems to diminish the value of the WeightedPool. If a node goes down, the client will still choose that node in a round-robin fashion, sending 1/N of requests to the dead node.~ WeightedPool ignores the `selector` parameter to `getConnection`.

gsoldevila commented 2 years ago

Some technical observations:

A single HTTP Agent is capable of handling multiple socket pools, one per upstream target. It also supports a maxTotalSockets option, which allows defining a global limit on the number of open connections.

On the other hand, undici's Pool (used by UndiciConnection) is bound to a single upstream target (aka a single node) and holds a pool of Client connections to that origin. Even though undici's Pool allows limiting the maximum number of connections, AFAICT this limit cannot be set globally across all nodes at the undici level.
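
A minimal sketch of the contrast (node URLs are placeholders): `http.Agent` can enforce a cap across every origin it serves, while each undici `Pool` only caps its own origin.

```ts
import http from 'node:http'
import { Pool } from 'undici'

// http.Agent: one agent, many origins, with a *global* cap.
const agent = new http.Agent({ maxSockets: 10, maxTotalSockets: 15 })

// undici Pool: one pool per origin, each with its *own* cap.
// Worst case across both nodes is 20 sockets; there is no shared limit.
const node1 = new Pool('http://es1:9200', { connections: 10 })
const node2 = new Pool('http://es2:9200', { connections: 10 })
```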

Thus, if we want to define a global limit on open sockets:

github-actions[bot] commented 12 months ago

This issue is stale because it has been open 90 days with no activity. Remove the stale label, or leave a comment, or this will be closed in 14 days.

github-actions[bot] commented 9 months ago

This issue is stale because it has been open 90 days with no activity. Remove the stale label, or leave a comment, or this will be closed in 14 days.

github-actions[bot] commented 3 months ago

This issue is stale because it has been open 90 days with no activity. Remove the stale label, or leave a comment, or this will be closed in 14 days.

rudolf commented 3 months ago

This is blocking Kibana from adopting the undici transport https://github.com/elastic/kibana/issues/116087

JoshMock commented 3 months ago

Good to know. :+1: I'll do my best to prioritize it for 8.16.

JoshMock commented 3 months ago

Some notes so far:

I looked at Undici's BalancedPool, and we'd definitely need to contribute an upstream change there to support max socket enforcement. The other challenge is that, because Undici is leveraged at the Connection layer in the transport, rather than the ConnectionPool layer, we don't have a straightforward way to share a BalancedPool across Connection instances. Using agent in a creative way could work there, but I need to think about it more.

I also tried to see if it could be enforced at the ConnectionPool layer, by creating a ConnectionPool class that, when Connection is an UndiciConnection, keeps track of PoolStats across all connections. The main catch there is that ConnectionPool.getConnection is a synchronous method, so there's no way to defer execution when there are no available sockets. Technically that is a transport implementation detail, so I could update all ConnectionPool classes to make getConnection async, but that could also break the contract for users who have written their own ConnectionPool classes, unless I allow the function to return either a value or a Promise.
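
A rough sketch of that last idea, using hypothetical simplified types rather than the transport's real interfaces: `getConnection` keeps its synchronous return type but is also allowed to return a promise, and the transport normalizes both shapes with `Promise.resolve`.

```ts
// Hypothetical, simplified types -- not the transport's actual interfaces.
interface Connection {
  url: URL
}

interface GetConnectionOptions {
  now: number
}

interface ConnectionPool {
  // Existing user-written pools keep returning Connection | null;
  // a socket-aware pool may return a Promise and resolve it once
  // a socket frees up.
  getConnection (opts: GetConnectionOptions): Connection | null | Promise<Connection | null>
}

// The transport can await either shape without breaking older pools.
async function pickConnection (pool: ConnectionPool): Promise<Connection | null> {
  return await Promise.resolve(pool.getConnection({ now: Date.now() }))
}
```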

So, I have some things to think about here. The work will continue!

JoshMock commented 2 months ago

I spent the day building a ConnectionPool class that is largely a wrapper around Undici's BalancedPool. It seems to have some potential, but will require a lot of greenfield development of new Connection and ConnectionPool classes. If we want to make BalancedPool available to all three of the ConnectionPool strategies (weighted, cloud, cluster), it will require a pretty significant refactor.
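
For context, the undici primitive in question looks roughly like this (node URLs are placeholders); a wrapper ConnectionPool would hand out a single shared instance instead of one pool per node:

```ts
import { BalancedPool } from 'undici'

// One BalancedPool spanning every node; undici round-robins across upstreams.
// Note: it currently has no max-socket enforcement, hence the upstream issue.
const pool = new BalancedPool([
  'http://es1:9200',
  'http://es2:9200',
  'http://es3:9200'
])

const { statusCode, body } = await pool.request({ path: '/', method: 'GET' })
console.log(statusCode, await body.json())
```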

I need to move on to some other priorities for a bit, but this is something I want to revisit soon and see if there are other possible implementations I'm overlooking.

JoshMock commented 2 months ago

Undici has an open issue to support configuring max sockets. I'm going to see if I can make a contribution there to get that functionality included.