kysely-org / kysely

A type-safe typescript SQL query builder
https://kysely.dev
MIT License
10.44k stars 266 forks source link

[proposal] Split PostgresDialectConfig into separate Client and Pool configs #473

Closed jtlapp closed 1 year ago

jtlapp commented 1 year ago

I'm about to embark on a project that may belong in Kysely proper.

The node-postgres package (pg) has quite a bit of code for pooling client connections. Serverless solutions using PostgresDialect run this code on each HTTP request, in addition to running the code for creating each client. Serverless connections only require the client, and pg allows us to issue queries against a provided client, but the Kysely PostgresDialect does not.

I don't know the exact overhead of pooling on serverless functions, but I do know that memory is used and clock cycles are spent unnecessarily. Of course, this has negligible effect on query performance, as queries take considerably longer than the cost of setting up and tearing down pooling. But resources are being used, and servers can use resources for purposes other than queries. And it's always ideal to minimize memory consumed and cycles of garbage collection. (And on serverless we pay for resources used.)

After all, the pg package itself allows one to create clients without also creating pools, for some reason.

I could create a third party package that uses pg's Client class instead of Pool, but I'm not doing anything more than exposing existing pg functionality. It seems to me that the feature might therefore belong in Kysely proper.

I propose being able to configure PostgresDialect with either PostgresDialectPoolConfig or PostgresDialectClientConfig. PostgresDialectClientConfig would be as follows (sans onCreateConnection):

export interface PostgresDialectClientConfig {
  client: PostgresClient | (() => Promise<PostgresClient>)
  cursor?: PostgresCursorConstructor
}

For backwards compatibility, I'd redefine PostgresDialectConfig so that pool and client are both optional. However, PostgresDialect would take PostgresDialectPoolConfig | PostgresDialectClientConfig.

In addition to any improvement to resource usage, the feature would allow people to provide an existing pg Client to Kysely, as they already can do with an existing Pool.

What say ye? I'm happy going either way.

koskimas commented 1 year ago

If this can be implemented without creating a whole another PostgresDriver, I think this might be a good addition. Maintaining a whole another driver just to get speed improvement that nobody has measured is not worth it.

I think we might be able to support this with minimal changes to the dialect. I'll take a quick look and get back to you.

koskimas commented 1 year ago

Actually seems like this would require a separate driver. Unless there's a good benchmark showing the speedup would be significant in some real use case, I think this is better left out of the core. You can always use a pool with min: 1, max: 1.

jtlapp commented 1 year ago

Thanks for the response. I implemented it in a separate repo and found myself copying a lot of your code and changing very little. So I decided to see what it would take to implement it directly in Kysely. It was straightforward, mostly a shuffling of the client and config interfaces:

Here are the diffs (very little change to behavioral code)

There are only two things that are less than ideal:

  1. The test suite is designed to run once per dialect, whereas we'd ideally run it once with postgres Pool and once with postgress Client.
  2. The driver property pool is mutually exclusive with singleClient and singleConnection. Normally this would suggest abstracting the two into a single object, but I wanted to minimize the changes.

Some of my coding decisions were made around minimizing diffs shown. I feel like I'm walking on eggshells.

I'd rather not copy your code, so it doesn't have to be maintained in two places. If I do have to implement my repo, I'll probably rip out cursor support to reduce my exposure. But if we implement it in Kysely, I'd like a suggestion for how to fit its testing into the test architecture.

jtlapp commented 1 year ago

I decided to hack the test suite to use Client instead of Pool, just to debug the code. Turns out I forgot to call Client.connect(), and I misunderstood the purpose of acquireConnection.

The entire Kysely test suite works on the following code, configured for Client instead of Pool, except for the 3 test cases that run queries in parallel:

Here are the updated diffs

These are the 3 failing tests. I confirmed that all are running queries in parallel:

[node] [postgres]   1) postgres: migration
[node] [postgres]        migrateToLatest
[node] [postgres]          should work correctly when run in parallel:

[node] [postgres]   2) postgres: migration
[node] [postgres]        custom migration tables in a custom schema
[node] [postgres]          should create custom migration tables in custom schema:

[node] [postgres]   3) postgres: transaction
[node] [postgres]        should run multiple transactions in parallel:

I'm still not sure of the best way to test it without the Client-for-Pool hack, though.

jtlapp commented 1 year ago

Actually, I understood acquireConnection properly the first time. My second revision got most tests to work by emulating a pool, allowing the same connection to be returned multiple times. If I detect this and throw an exception, many more tests fail. Examining these tests, I see that many assume it's possible to get at least two simultaneous connections.

So the existing test suite is not suitable for testing the Client config.

@koskimas, I'm starting to see why you think this is best as second driver. I'll move it to a separate repo.

Thanks for being patient with me as I figured this out!