ClickHouse / clickhouse-js

Official JS client for ClickHouse DB
https://clickhouse.com
Apache License 2.0
213 stars 26 forks source link

Request for Connection Pooling Feature when creating several clients. #278

Closed yosefrev closed 4 months ago

yosefrev commented 4 months ago

Hey,

We are using the Clickhouse node package to create queries from an app to our DB. We have many users who frequently query the database, Each user has a unique Clickhouse user, requiring the creation of a new client object for each query. This results in significant TCP connection overhead.

We would love the option to have a connection pool to prevent the TCP overhead associated with creating new client objects. For example, adding the option to send an agent to the client, similar to Axios, to get the connection from a pool instead of creating a new TCP connection. This would allow us to reuse existing connections, optimizing performance and reducing latency.

Thank you very much for looking into this request, looking forward to hearing back from you.

Best regards, Yosef, TripleWhale.

slvrtrn commented 4 months ago

There is already an internal pool of TCP connections (sockets), which is utilized when KeepAlive is enabled (and it is enabled by default). The number of active sockets is 10 by default, and an already open socket will be reused until it idles for too long, and then it is considered expired and removed from the pool. Related docs: https://clickhouse.com/docs/en/integrations/language-clients/javascript#connection-pool-nodejs-only + https://clickhouse.com/docs/en/integrations/language-clients/javascript#keep-alive-configuration-nodejs-only

However, currently, the credentials are tied to the client instance, and the authentication header is immutable internally (the relevant method source is here).

Considering the description of your use case, you would like to override the user credentials for each request. Is this correct?

dermasmid commented 4 months ago

yes, that is correct. i think it would be good if the sdk would allow to set the http.Agent since the agent has built in tcp pooling that would solve our issue.

theres other usecases that will benefit from being able to set the agent, like if we want to customize the lookup for dns caching

slvrtrn commented 4 months ago

i think it would be good if the sdk would allow to set the http.Agent since the agent has built in tcp pooling that would solve our issue.

As I mentioned before, internal TCP connection pooling is already there. Due to possible complications with socket management, I'd prefer not to allow overriding the agent defined in the Node.js base connection.

Regarding lookup, we can just add this setting to the client configuration, which will then be passed to the HTTP(s) agent object.

dermasmid commented 4 months ago

why did you implement pooling instead of relying on the one that ships with the Agent ?

slvrtrn commented 4 months ago

Re: the user credentials override.

yes, that is correct.

So, just to double check: if we add username + password to the BaseQueryParams:

export interface BaseQueryParams {
  // <...>

  /** When defined, it will override the username from the {@link BaseClickHouseClientConfigOptions.username} setting.
   *  @default undefined */
  username?: string
  /** When defined, it will override the password from the {@link BaseClickHouseClientConfigOptions.password} setting.
   *  @default undefined */
  password?: string
}

BaseQueryParams is the base interface for all method parameters (query, exec, command, insert).

So, if username/password are defined in the query params, we can override the auth header; will this be sufficient for the issue in the OP?

dermasmid commented 4 months ago

yes

slvrtrn commented 4 months ago

Added in 1.1.0.

dermasmid commented 4 months ago

thanks @slvrtrn!

i'd still like to know why the lib needs to implement connection pooling ontop of the nodejs built-in one.

i have use-cases in which i'd like to set the http agent, which will decouple the tcp connection from a client and its settings.

most http based nodejs packages expose the http Agent as a param when creating a client.

slvrtrn commented 4 months ago

i'd still like to know why the lib needs to implement connection pooling ontop of the nodejs built-in one.

It doesn't; it's just the HTTP(s) agent that handles the pooling. The client adds a few things on top of that, such as gracefully shutting down the idling keep-alive sockets since it does not work well out of the box in Node.js.

You can have a look at the node_base_connection.ts sources and the overrides in the https connection implementation.

i have use-cases in which i'd like to set the http agent, which will decouple the tcp connection from a client and its settings.

If you have a specific option in mind (like that lookup handler you mentioned earlier), please feel free to open a feature request.

most http based nodejs packages expose the http Agent as a param when creating a client.

After the internal discussion about this, we still think that providing an option to override the agent itself will cause us more issues in the long run.

CC @mshustov

dermasmid commented 4 months ago

Thanks for the reply @slvrtrn maxFreeSockets on the agent level should deal with idling sockets, while its not a ttl, theres still the ttl on the server side which will eventually close that socket if its unused.

btw the current implementation of onSocket will not work once a socket has been used more then once it will go into Reusing socket.... but not create a new idle_timeout_handle

what im trying to say is that it might be worth giving up on this and instead allow users to se the Agent

dermasmid commented 4 months ago

also it seems like a combination of

socket.setTimeout
socket.on('timeout')

can achieve the same behaviour for timeouts like you wanted [docs](https://nodejs.org/api/net.html#socketsetkeepaliveenable-initialdelay:~:text=socket.setTimeout(3000)%3B%0Asocket.on(%27timeout%27%2C%20()%20%3D%3E%20%7B%0A%20%20console.log(%27socket%20timeout%27)%3B%0A%20%20socket.end()%3B%0A%7D)%3B)

slvrtrn commented 3 months ago

As we had a few other requests regarding this, custom HTTP(s) agent is now supported as of 1.2.0; see also the relevant docs with examples.

dermasmid commented 3 months ago

Thanks!!! This is very helpful