Closed igalklebanov closed 3 weeks ago
for the record if anyone has the issue. here is how i got around this
query
.leftJoin(
// @ts-expect-error this is needed because typescript complains but it does work
({ parens }) => parens(sql`("table1"
INNER JOIN "table2" ON "table2"."id" = "table1"."table2_id")`),
"table1.id",
"previousTableFoundInQuery.table1_id"
)
// @ts-expect-error this is required because the raw causes typescript not to infer the type
.orderBy("table.id", `asc NULLS LAST`);
Wouldn't this be equivalent to
select *
from "t1"
left join (
select "t2".*
from "t2"
inner join "t3" on "t2"."c" = "t3"."c"
as "t2"
) on "t1"."c" = "t2"."c"
?
The query optimizer will most likely turn that into exactly the same query plan as the original example. Only a really dumb optimizer would materialize the inner query in this case.
This query you can already build using kysely:
db.selectFrom("t1")
.leftJoin(
db.selectFrom("t2").innerJoin("t2", "t2.c", "t3.c").selectAll("t2").as("t2"),
"t1.c",
"t2.c",
)
.selectAll()
This is the very first time during my 18 year career that I see the syntax in the original message. Do we really want to rewrite our join builder and logic to support this case?
I wrote this little test. The only difference in the syntaxes is that in the subquery case, t3
columns are not selected in the outer query. You need to explicitly select them from the inner one.
DROP TABLE IF EXISTS t1;
DROP TABLE IF EXISTS t2;
DROP TABLE IF EXISTS t3;
CREATE TABLE t1 (id SERIAL PRIMARY KEY, c int);
CREATE TABLE t2 (id SERIAL PRIMARY KEY, c int);
CREATE TABLE t3 (id SERIAL PRIMARY KEY, c int);
CREATE INDEX t1_c_idx ON t1(c);
CREATE INDEX t2_c_idx ON t2(c);
CREATE INDEX t3_c_idx ON t3(c);
INSERT INTO t1 (c)
SELECT floor(random() * 100000000)
FROM generate_series(0, 1000000);
INSERT INTO t2 (c)
SELECT floor(random() * 100000000)
FROM generate_series(0, 1000000);
INSERT INTO t3 (c)
SELECT floor(random() * 100000000)
FROM generate_series(0, 1000000);
VACUUM ANALYZE;
EXPLAIN ANALYZE SELECT *
FROM t1
LEFT JOIN (t2 INNER JOIN t3 ON t2.c = t3.c) ON t1.c = t2.c
;
EXPLAIN ANALYZE SELECT *
FROM t1
LEFT JOIN (
SELECT t2.* FROM t2 INNER JOIN t3 ON t2.c = t3.c
) AS t2 ON t1.c = t2.c
;
At least with this dummy dataset, the query plans are 100% equivalent.
Agree with @koskimas, not much added value in adding this - it'll be complex and the DX is not guaranteed to be great.
Closing this for now.
Hey 👋
Kysely currently produces
join
s without parenthesis, and engines process this from left to right. This can produce unwanted results in some cases. e.g.left join
followed by aninner join
, in which you want to keep t1's rows regardless ofinner join
between t2 and t3.You can wrap joins in parenthesis to group joins and break out of the regular processing order:
SQLite might require aliasing the parenthesis before the
on
clause.