Open agj opened 7 months ago
Forgot to add that I'm immensely appreciative of the work done on Kysely. It's a great tool that lets us work with our database without fear. π
Hey π
Does this improve performance?
const demoQuery = (db: Kysely<DB>) =>
db
.selectFrom((eb) =>
eb
.selectFrom("my_table")
.where("my_table.id", "in", [1, 2, 3])
.selectAll()
.as("test")
)
.selectAll();
Hi! I'm away from my computer, as I'm on vacation right now, but as soon as I'm back I'll give it a try!
@igalklebanov It did improve it quite a bit! But it's still not greatβ¦
Files: 435
Lines: 119236
Identifiers: 112578
Symbols: 246727
Types: 7593
Instantiations: 1355207
Memory used: 212941K
I/O read: 0.05s
I/O write: 0.00s
Parse time: 0.69s
Bind time: 0.30s
Check time: 11.30s
Emit time: 0.00s
Total time: 12.29s
Hello there, and thank you for the amazing project, I introduced the typesafe query builder into our project which containes alot of tables. I wonder if Kysely is made for a scale of 100s and 100s of tables with 100s of complex queries.
Without Kyseley type checking is around 8 seconds, after starting generating the types of our db and only include ~100tables and written complex queries which use the expression builder to compose sub queries with joints. The query results are efficient and build for purpose And I start already plunching in the 60 seconds type checking. This makes the intellisense so slow and with delay that I'm not able to write more queries.
If this performance is odd I will go deeper into stats and post them here. Thank you
TypeScript version 5.1.6
I'd strongly consider splitting data access to several Kysely instances with their own limited view of the world, if possible.
I believe kysely-codegen
has a filtering mechanism, should come in handy.
Kysely is built around, for the most part, string literal type comparisons. The more unique column and table names in DB
interface, the slower type-checking gets and harder language server works.
thank you for your quick reply, I will give this a try.
would a split into multiple schemas on the same interface, differentiated by the prefix, alleviate the performance issue?
I'm not able to reproduce this. Any chance to get a full reproduction? A complete project I can run and profile?
Yeah, sure. Here's a Github gist with the relevant files, including an anonymized version of the database types.
A big issue on our project as well. Please let me know if there is anything I could assist with.
The issue here is basically due to the metric shit-ton of tables and columns and some O(N^X) type checking corner-case. The typescript type checker is so insanely complex that debugging this is difficult.
There probably is some way to make this case faster, but the only way to find it is blind trial and error. Or alternatively getting the help of one of the ~10 people on earth that actually understand how typescript internals work.
A big issue on our project as well. Please let me know if there is anything I could assist with.
You can try profiling the typescript type checker and figuring out which type causes the issue.
The issue is somehow related to selectFrom
overloads. If I replace all overloads with this single signature:
selectFrom<
TE extends
| TableExpression<DB, never>
| ReadonlyArray<TableExpression<DB, never>>,
>(
from: TE,
): TE extends ReadonlyArray<infer T>
? SelectQueryBuilder<From<DB, T>, FromTables<DB, never, T>, {}>
: TE extends keyof DB & string
? SelectQueryBuilder<DB, ExtractTableAlias<DB, TE>, {}>
: SelectQueryBuilder<From<DB, TE>, FromTables<DB, never, TE>, {}>
the type checking time goes from 12 seconds to 0.7 seconds. That type also passes all Kysely tests except this one:
async function testGenericSelect<T extends keyof Database>(
db: Kysely<Database>,
table: T,
) {
const r1 = await db.selectFrom(table).select('id').executeTakeFirstOrThrow()
expectAssignable<string | number>(r1.id)
}
which tests generic usage of selectFrom
which we don't really support anyway.
But we have the same issue with updateTable
and all other QueryCreator
methods that have been split into multiple overloads π
Fixed updateTable
and deleteFrom
the same way.
Now I get this when there are no type errors
And this when there are:
There's now a PR for this. @agj Would you be able to test the PR branch in your codebase?
@imwexpex Could you also build the PR branch and test if it fixes the peformance issues you're having?
@imwexpex Could you also build the PR branch and test if it fixes the peformance issues you're having?
- pull the branch
- npm install
- npm run build
- Point your project to the local build
Unfortunately, the fix did not have a big impact on my project
Latest release:
PR Branch:
Unfortunately, the fix did not have a big impact on my project
1.86 seconds is not a long time. Looking at the numbers, you don't actually have any performance issues. Especially given that the parsing time is also 1.5 seconds. Type checking 3000 files in 1.86 seconds is actually insanely fast.
Unfortunately, the fix did not have a big impact on my project
1.86 seconds is not a long time. Looking at the numbers, you don't actually have any performance issues. Especially given that the parsing time is also 1.5 seconds.
Agree with you, there is no issue with compilation time, but rather with WebStorm/VSCode Intellisense. For some reason, hard queries kill IDE suggestions/linting.
This was also mentioned by @kevin-johne
This makes the intellisense so slow and with delay that I'm not able to write more queries.
Well without any additional info, I won't be able to help you. I'm not seeing slow intellisense in vscode in any of our tests, including @agj's reproduction.
Intellisense uses the same type checker as the compiler.
Also, if the issue was in intellisense and not in the build, why did you send the build benchmark? Did you actually check the intellisense performance with the new branch?
Well @koskimas, not sure if this is good or bad news. The issue was for version 0.22.2, and I just confirmed that the issue persists in that version as well as in 0.22.3.
However, both 0.22.4 and the PR's version behave much better! In fact, indistinguishably so, both hover around 7 s for me, compared to the >60 s of the older versions. So, it looks like the changes in the PR don't do much when it comes to this particular issue, but something in v0.22.4 fixed the issue.
I'm very happy that the issue is under control, and I'm sorry that I wasn't able to identify in time that it was fixed in the latest release. π
@agj I didn't actually have any perf issues in your reproduction either before the fix when the types were correct, but once I mistyped something, I got the 12 second build time and 100s of thousands of instantations.
If you take 0.22.4 and test this
const demoQuery = (db: Kysely<DB>) =>
db
.selectFrom((eb) =>
eb
.selectFrom("my_table_LOOOOL")
// Removing this `where` makes a dramatic difference in compilation time.
.where((eb2) => eb2("my_table.id", "in", [1, 2, 3]))
.selectAll()
.as("test")
)
.selectAll();
versus the branch, do you see a difference?
Also, if the issue was in intellisense and not in the build, why did you send the build benchmark? Did you actually check the intellisense performance with the new branch?
Sure, I checked. 3+ sec lag for any suggestion and the RAM usage goes up to 3GB on the opened file.
I've tried to reproduce the same query in your test file, and it looks like everything works smoothly. So probably an issue somewhere else...
Query example, to be clear:
db
.selectFrom((eb) =>
eb
.selectFrom('my_table')
.where('my_table.id', 'in', [1, 2, 3])
.where('my_table.col_164b7896ec8e770207febe0812c5f052', 'is', 12)
.select((eb) => [
jsonObjectFrom(
eb
.selectFrom('table_1474a7e0348b1ca363ee4a2a5dc4a1ec')
.selectAll('table_1474a7e0348b1ca363ee4a2a5dc4a1ec')
.whereRef(
'my_table.id',
'=',
'table_1474a7e0348b1ca363ee4a2a5dc4a1ec.id',
),
).as('my_table'),
jsonArrayFrom(
eb
.selectFrom('table_0b5ac72e03509e06683edcba4b3887ab')
.selectAll('table_0b5ac72e03509e06683edcba4b3887ab')
.where(({ eb: eb1, ref, selectFrom }) =>
eb1(
ref('table_0b5ac72e03509e06683edcba4b3887ab.id'),
'in',
selectFrom('table_da9bc7793b7e3784c03d55128145b7e3')
.where('id', '=', 1)
.select('id'),
),
)
.whereRef(
'my_table.id',
'=',
'table_0b5ac72e03509e06683edcba4b3887ab.id',
)
.select((eb1) => [
jsonObjectFrom(
eb1
.selectFrom('table_ede804aecaaa0ad1a5afffb9b2a0d927')
.whereRef(
'table_ede804aecaaa0ad1a5afffb9b2a0d927.id',
'=',
'table_0b5ac72e03509e06683edcba4b3887ab.col_1b6b0183f1a79f6d53cc5f8bd5f315d7',
)
.where(
'table_ede804aecaaa0ad1a5afffb9b2a0d927.col_80ae6f72f9c170f9e1bcae48c1677802',
'!=',
12,
)
.selectAll('table_ede804aecaaa0ad1a5afffb9b2a0d927')
.select((eb2) => [
jsonObjectFrom(
eb2
.selectFrom('table_81859a5a7cf340e00333e133ee83c6a3')
.select([
'col_0191c0aa0d39de591b5236b304496123',
'col_08de056b6d3139a95d0aaf99f8e3c52e',
'col_195367e97caee2b9ed726514b7a38efc',
])
.whereRef(
'table_ede804aecaaa0ad1a5afffb9b2a0d927.col_a7f4797b61012da3a9c1a8049c8a4974',
'=',
'table_81859a5a7cf340e00333e133ee83c6a3.col_aa6b7c7a9c7a177e3f1ba452783eb63b',
),
).as(
'table_ede804aecaaa0ad1a5afffb9b2a0d927.col_aa6b7c7a9c7a177e3f1ba452783eb63b',
),
jsonObjectFrom(
eb2
.selectFrom('table_1084015020d8bed5c99852780cd060b5')
.select([
'table_1084015020d8bed5c99852780cd060b5.col_665cebec3ce1380ce4e3c90b1bedcbf5',
])
.whereRef(
'table_1084015020d8bed5c99852780cd060b5.id',
'=',
'table_1084015020d8bed5c99852780cd060b5.col_53ca581ab7a5aef5b659bc27c81d5d51',
),
).as(
'table_1084015020d8bed5c99852780cd060b5.col_53ca581ab7a5aef5b659bc27c81d5d51',
),
]),
).as('table_ede804aecaaa0ad1a5afffb9b2a0d927.data'),
jsonObjectFrom(
eb1
.selectFrom('table_736d84cc80e257dc78371b1eb0fb6b20')
.selectAll('table_736d84cc80e257dc78371b1eb0fb6b20')
.whereRef(
'table_736d84cc80e257dc78371b1eb0fb6b20.id',
'=',
'table_0b5ac72e03509e06683edcba4b3887ab.id',
),
).as('table_736d84cc80e257dc78371b1eb0fb6b20'),
])
.orderBy('col_15af44699327f9a6bb2f12da27fdf0ce', 'desc'),
).as('data'),
])
.as('test'),
)
.selectAll()
@koskimas You're right! I could reproduce the issue when there's a mistyped table name. Before the PR it's ~28 s, with the PR it's ~7 s. π
@imwexpex There seems to be something in jsonArrayFrom
and jsonObjectFrom
types combined with inline nested queries + huge database that kills typescript language server. I'll take a look at those types.
Having said that, you can already avoid the slowness and arguably make your queries much more readable by splitting them to helper functions like this:
const result = await db
.selectFrom((eb) =>
eb
.selectFrom('my_table')
.where('my_table.id', 'in', [1, 2, 3])
.where('my_table.col_164b7896ec8e770207febe0812c5f052', 'is', 12)
.select((eb) => [
relation1(eb.ref('my_table.id')).as('my_table'),
relation2(eb.ref('my_table.id')).as('data'),
])
.as('test'),
)
.selectAll()
.execute()
function relation1(foreignKey: Expression<number>) {
return jsonObjectFrom(
db
.selectFrom('table_1474a7e0348b1ca363ee4a2a5dc4a1ec')
.selectAll('table_1474a7e0348b1ca363ee4a2a5dc4a1ec')
.where('table_1474a7e0348b1ca363ee4a2a5dc4a1ec.id', '=', foreignKey),
)
}
function relation2(foreignKey: Expression<number>) {
return jsonArrayFrom(
db
.selectFrom('table_0b5ac72e03509e06683edcba4b3887ab')
.selectAll('table_0b5ac72e03509e06683edcba4b3887ab')
.where(({ eb, ref, selectFrom }) =>
eb(
ref('table_0b5ac72e03509e06683edcba4b3887ab.id'),
'in',
selectFrom('table_da9bc7793b7e3784c03d55128145b7e3')
.where('id', '=', 1)
.select('id'),
),
)
.where('table_0b5ac72e03509e06683edcba4b3887ab.id', '=', foreignKey)
.select((eb1) => [
relation3(
eb1.ref(
'table_0b5ac72e03509e06683edcba4b3887ab.col_1b6b0183f1a79f6d53cc5f8bd5f315d7',
),
).as('data'),
relation4(eb1.ref('table_0b5ac72e03509e06683edcba4b3887ab.id')).as(
'data2',
),
])
.orderBy('col_15af44699327f9a6bb2f12da27fdf0ce', 'desc'),
)
}
function relation3(foreignKey: Expression<string | null>) {
return jsonObjectFrom(
db
.selectFrom('table_ede804aecaaa0ad1a5afffb9b2a0d927')
.where('table_ede804aecaaa0ad1a5afffb9b2a0d927.id', '=', foreignKey)
.where(
'table_ede804aecaaa0ad1a5afffb9b2a0d927.col_80ae6f72f9c170f9e1bcae48c1677802',
'!=',
12,
)
.selectAll('table_ede804aecaaa0ad1a5afffb9b2a0d927')
.select((eb) => [
jsonObjectFrom(
eb
.selectFrom('table_81859a5a7cf340e00333e133ee83c6a3')
.select([
'col_2f76db193eac6ad0f152563313673ac9',
'col_4d742b2f247bec99b41a60acbebc149a',
'col_6137cde4893c59f76f005a8123d8e8e6',
])
.whereRef(
'table_ede804aecaaa0ad1a5afffb9b2a0d927.col_a7f4797b61012da3a9c1a8049c8a4974',
'=',
'table_81859a5a7cf340e00333e133ee83c6a3.col_aa6b7c7a9c7a177e3f1ba452783eb63b',
),
).as(
'table_ede804aecaaa0ad1a5afffb9b2a0d927.col_aa6b7c7a9c7a177e3f1ba452783eb63b',
),
jsonObjectFrom(
eb
.selectFrom('table_1084015020d8bed5c99852780cd060b5')
.select([
'table_1084015020d8bed5c99852780cd060b5.col_665cebec3ce1380ce4e3c90b1bedcbf5',
])
.whereRef(
'table_1084015020d8bed5c99852780cd060b5.id',
'=',
'table_1084015020d8bed5c99852780cd060b5.col_53ca581ab7a5aef5b659bc27c81d5d51',
),
).as(
'table_1084015020d8bed5c99852780cd060b5.col_53ca581ab7a5aef5b659bc27c81d5d51',
),
]),
)
}
function relation4(foreginKey: Expression<number>) {
return jsonObjectFrom(
db
.selectFrom('table_736d84cc80e257dc78371b1eb0fb6b20')
.selectAll('table_736d84cc80e257dc78371b1eb0fb6b20')
.where('table_736d84cc80e257dc78371b1eb0fb6b20.id', '=', foreginKey),
)
}
I have a query pattern that degrades tsc type checking performance to an outsized degree. It involves a nested query in
selectFrom
, and a simplewhere
clause.This is my most minimal reproduction:
With the
where
line indicated by the comment, runningtsc --diagnostics --noEmit
takes about 30 seconds, whereas after removing it it takes about a second.TypeScript diagnostics with
where
:TypeScript diagnostics WITHOUT
where
:My DB has a large number of tables, so I'm sure that's part of the issue, but I never had this problem until I added this particular pathological query.
I'm using:
kysely
v0.27.2typescript
v5.3.3Edit: Here's a Github gist with all of the files needed to reproduce this issue.